Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported, e.g. dros* for partial matches or fly AND NOT embryo to exclude a term.

Search results 13201 to 13300 out of 30763 for seed protein

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: VirE1
Type: Family
Description: The VirE1 family of proteins is found in bacteria. Proteins in this family are approximately 60 amino acids in length and contain a conserved IELE sequence motif. VirE1 is an acidic chaperone protein which binds to VirE2, a ssDNA binding protein. These proteins are virulence factors of the plant pathogen Agrobacterium tumefaciens.
Protein Domain
Name: DJBP, EF-hand domain
Type: Domain
Description: This domain is found in DJ binding protein DJBP. DJBP, or EF-hand calcium-binding domain-containing protein 6, is a DJ-1-binding protein that negatively regulates the androgen receptor by recruiting the histone deacetylase complex. Protein DJ-1 antagonises this inhibition by abrogation of this complex.
Protein Domain
Name: DVNP family
Type: Family
Description: DVNPs (dinoflagellate-viral-nucleoproteins) are a group of viral-derived proteins found in dinoflagellates. It has been hypothesized that DVNPs could have been transferred from viruses to dinoflagellate progenitors with canonical chromatin and eventually replaced the majority of histones as chromatin packaging proteins. This entry includes dinoflagellate viral nucleoprotein 5 (DVNP5) and related proteins.
Protein Domain
Name: Aconitase B
Type: Family
Description: Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Eukaryotic cAcn enzyme balances the amount of citrate and isocitrate in the cytoplasm, which in turn creates a balance between the amount of NADPH generated from isocitrate by isocitrate dehydrogenase and the amount of acetyl-CoA generated from citrate by citrate lyase. Fatty acid synthesis requires both NADPH and acetyl-CoA, as do other metabolic processes, including the need for NADPH to combat oxidative stress.
The enzymatic form of cAcn predominates when iron levels are normal, but if they drop sufficiently to cause the disassembly of the [4Fe-4S]-cluster, then cAcn undergoes a conformational change from a compact enzyme to a more open L-shaped protein known as iron regulatory protein 1 (IRP1; or IRE-binding protein 1, IREBP1). As IRP1, the catalytic site and the [4Fe-4S]-cluster are lost, and two new RNA-binding sites appear. IRP1 functions in the post-transcriptional regulation of genes involved in iron metabolism - it binds to mRNA iron-responsive elements (IRE), 30-nucleotide stem-loop structures at the 3' or 5' end of specific transcripts. Transcripts containing an IRE include ferritin L and H subunits (iron storage), transferrin (iron plasma chaperone), transferrin receptor (iron uptake into cells), ferroportin (iron exporter), mAcn, succinate dehydrogenase, erythroid aminolevulinic acid synthetase (tetrapyrrole biosynthesis), among others. If the IRE is in the 5'-UTR of the transcript (e.g. in ferritin mRNA), then IRP1-binding prevents its translation by blocking the transcript from binding to the ribosome. If the IRE is in the 3'-UTR of the transcript (e.g. transferrin receptor), then IRP1-binding protects it from endonuclease degradation, thereby prolonging the half-life of the transcript and enabling it to be translated. IRP2 is another IRE-binding protein that binds to the same transcripts as IRP1. However, since IRP1 is predominantly in the enzymatic cAcn form, it is IRP2 that acts as the major metabolic regulator that maintains iron homeostasis. Although IRP2 is homologous to IRP1, IRP2 lacks aconitase activity, and is known only to have a single function in the post-transcriptional regulation of iron metabolism genes. In iron-replete cells, IRP2 activity is regulated primarily by iron-dependent degradation through the ubiquitin-proteasomal system. Bacterial AcnB is also known to be multi-functional.
In addition to its role in the TCA cycle, AcnB was shown to be a post-transcriptional regulator of gene expression in Escherichia coli and Salmonella enterica. In S. enterica, AcnB initiates a regulatory cascade controlling flagella biosynthesis through an interaction with the ftsH transcript, an alternative RNA polymerase sigma factor. This binding lowers the intracellular concentration of FtsH protease, which in turn enhances the amount of RNA polymerase sigma32 factor (normally degraded by FtsH protease), and sigma32 then increases the synthesis of chaperone DnaK, which in turn promotes the synthesis of the flagellar protein FliC. AcnB regulates the synthesis of other proteins as well, such as superoxide dismutase (SodA) and other enzymes involved in oxidative stress. This entry represents bacterial aconitase B (AcnB) enzymes, which can switch between aconitase enzyme activity and post-transcriptional gene regulation. An iron-mediated dimerisation mechanism may be responsible for switching AcnB between its catalytic and regulatory roles, as dimerisation requires iron while mRNA binding is inhibited by iron.
Protein Domain
Name: ABC transporter, F420-0 import, periplasmic substrate-binding protein, predicted
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B.
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This entry represents a small clade of ABC-type transporter periplasmic substrate-binding proteins encoded as part of a three gene cassette along with a permease and an ATPase.
The organisms containing this cassette are all Actinobacteria and contain numerous proteins requiring the coenzyme F420. The model in this entry was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with a F420-dependent glucose-6-phosphate dehydrogenase. Based on these observations this periplasmic substrate-binding protein is predicted to be a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.
Protein Domain
Name: Helper component proteinase
Type: Family
Description: This entry represents the potyvirus helper component protease found in genome polyproteins of potyviruses. It is a cysteine peptidase belonging to the MEROPS peptidase family C6 (clan CA). The helper component-proteinase is required for aphid transmission. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys.
The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases.
Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase C21
Type: Domain
Description: This entry is found in cysteine peptidases belonging to the MEROPS peptidase family C21 (tymovirus endopeptidase family, clan CA). The type example is tymovirus endopeptidase (turnip yellow mosaic virus). The noncapsid protein expressed from ORF-206 of turnip yellow mosaic virus (TYMV) is autocatalytically processed by a papain-like protease, producing N-terminal 150 kDa and C-terminal 70 kDa proteins. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp).
A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel.
The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase C28, foot-and-mouth virus L-proteinase
Type: Domain
Description: This group of cysteine peptidases belongs to MEROPS peptidase family C28 (clan CA). The protein fold of the peptidase unit for members of this family resembles that of papain. The leader peptidase of Foot-and-mouth disease virus cleaves itself from the growing polyprotein and also cleaves the host translation initiation factor 4GI (eIF4G), thus inhibiting 5'-cap dependent translation. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp).
A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel.
The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Aconitase, mitochondrial-like
Type: Family
Description: Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Eukaryotic cAcn enzyme balances the amount of citrate and isocitrate in the cytoplasm, which in turn creates a balance between the amount of NADPH generated from isocitrate by isocitrate dehydrogenase and the amount of acetyl-CoA generated from citrate by citrate lyase. Fatty acid synthesis requires both NADPH and acetyl-CoA, as do other metabolic processes, including the need for NADPH to combat oxidative stress.
The enzymatic form of cAcn predominates when iron levels are normal, but if they drop sufficiently to cause the disassembly of the [4Fe-4S]-cluster, then cAcn undergoes a conformational change from a compact enzyme to a more open L-shaped protein known as iron regulatory protein 1 (IRP1; or IRE-binding protein 1, IREBP1). As IRP1, the catalytic site and the [4Fe-4S]-cluster are lost, and two new RNA-binding sites appear. IRP1 functions in the post-transcriptional regulation of genes involved in iron metabolism - it binds to mRNA iron-responsive elements (IRE), 30-nucleotide stem-loop structures at the 3' or 5' end of specific transcripts. Transcripts containing an IRE include ferritin L and H subunits (iron storage), transferrin (iron plasma chaperone), transferrin receptor (iron uptake into cells), ferroportin (iron exporter), mAcn, succinate dehydrogenase, erythroid aminolevulinic acid synthetase (tetrapyrrole biosynthesis), among others. If the IRE is in the 5'-UTR of the transcript (e.g. in ferritin mRNA), then IRP1-binding prevents its translation by blocking the transcript from binding to the ribosome. If the IRE is in the 3'-UTR of the transcript (e.g. transferrin receptor), then IRP1-binding protects it from endonuclease degradation, thereby prolonging the half-life of the transcript and enabling it to be translated. IRP2 is another IRE-binding protein that binds to the same transcripts as IRP1. However, since IRP1 is predominantly in the enzymatic cAcn form, it is IRP2 that acts as the major metabolic regulator that maintains iron homeostasis. Although IRP2 is homologous to IRP1, IRP2 lacks aconitase activity, and is known only to have a single function in the post-transcriptional regulation of iron metabolism genes. In iron-replete cells, IRP2 activity is regulated primarily by iron-dependent degradation through the ubiquitin-proteasomal system. Bacterial AcnB is also known to be multi-functional.
In addition to its role in the TCA cycle, AcnB was shown to be a post-transcriptional regulator of gene expression in Escherichia coli and Salmonella enterica. In S. enterica, AcnB initiates a regulatory cascade controlling flagella biosynthesis through an interaction with the ftsH transcript, an alternative RNA polymerase sigma factor. This binding lowers the intracellular concentration of FtsH protease, which in turn enhances the amount of RNA polymerase sigma32 factor (normally degraded by FtsH protease), and sigma32 then increases the synthesis of chaperone DnaK, which in turn promotes the synthesis of the flagellar protein FliC. AcnB regulates the synthesis of other proteins as well, such as superoxide dismutase (SodA) and other enzymes involved in oxidative stress. This entry represents mitochondrial aconitase (mAcn), as well as close homologues such as certain bacterial aconitase A (AcnA) enzymes.
Protein Domain
Name: EDG-5 sphingosine 1-phosphate receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins.
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Lysophospholipids (LPs), such as lysophosphatidic acid (LPA), sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine (SPC), have long been known to act as signalling molecules in addition to their roles as intermediates in membrane biosynthesis. They have roles in the regulation of cell growth, differentiation, apoptosis and development, and have been implicated in a wide range of pathophysiological conditions, including: blood clotting, corneal wounding, subarachnoid haemorrhage, inflammation and colitis. A number of G protein-coupled receptors bind members of the lysophospholipid family - these include: the cannabinoid receptors; platelet activating factor receptor; OGR1, an SPC receptor identified in ovarian cancer cell lines; PSP24, an orphan receptor that has been proposed to bind LPA; and at least 8 closely related receptors, the EDG family, that bind LPA and S1P. S1P is released from activated platelets and is also produced by a number of other cell types in response to growth factors and cytokines. It is proposed to act both as an extracellular mediator and as an intracellular second messenger. The cellular effects of S1P include growth related effects, such as proliferation, differentiation, cell survival and apoptosis, and cytoskeletal effects, such as chemotaxis, aggregation, adhesion, morphological change and secretion. The molecule has been implicated in control of angiogenesis, inflammation, heart-rate and tumour progression, and may play an important role in a number of disease states, such as atherosclerosis, and breast and ovarian cancer.
Recently, 5 G protein-coupled receptors have been identified that act as high affinity receptors for S1P, and also as low affinity receptors for the related lysophospholipid, SPC. EDG-1, EDG-3, EDG-5 and EDG-8 share a high degree of similarity, and are also referred to as lpB1, lpB3, lpB2 and lpB4, respectively. EDG-6 is referred to as lpC1, reflecting its more distant relationship to the other S1P receptors. EDG-5 is expressed abundantly in the heart and lung and at lower levels in the adult brain. It is also expressed strongly in the embryonic brain. Binding of S1P to EDG-5 activates G proteins of the Gi and Gq classes. G12 and G13 proteins are also constitutively activated by the receptor. These couplings produce a wide range of cellular effects, including increased cyclic AMP and calcium levels, activation of MAP kinases and actin rearrangement. The receptor may have a role in neuronal development and, in zebrafish, has been found to be involved in the control of cell migration during development and organogenesis of the heart.
Protein Domain
Name: EDG-3 sphingosine 1-phosphate receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. 
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Lysophospholipids (LPs), such as lysophosphatidic acid (LPA), sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine (SPC), have long been known to act as signalling molecules in addition to their roles as intermediates in membrane biosynthesis. They have roles in the regulation of cell growth, differentiation, apoptosis and development, and have been implicated in a wide range of pathophysiological conditions, including blood clotting, corneal wounding, subarachnoid haemorrhage, inflammation and colitis. A number of G protein-coupled receptors bind members of the lysophospholipid family - these include: the cannabinoid receptors; platelet activating factor receptor; OGR1, an SPC receptor identified in ovarian cancer cell lines; PSP24, an orphan receptor that has been proposed to bind LPA; and at least 8 closely related receptors, the EDG family, that bind LPA and S1P. S1P is released from activated platelets and is also produced by a number of other cell types in response to growth factors and cytokines. It is proposed to act both as an extracellular mediator and as an intracellular second messenger. The cellular effects of S1P include growth-related effects, such as proliferation, differentiation, cell survival and apoptosis, and cytoskeletal effects, such as chemotaxis, aggregation, adhesion, morphological change and secretion. The molecule has been implicated in control of angiogenesis, inflammation, heart rate and tumour progression, and may play an important role in a number of disease states, such as atherosclerosis, and breast and ovarian cancer. 
Recently, 5 G protein-coupled receptors have been identified that act as high affinity receptors for S1P, and also as low affinity receptors for the related lysophospholipid, SPC. EDG-1, EDG-3, EDG-5 and EDG-8 share a high degree of similarity, and are also referred to as lpB1, lpB3, lpB2 and lpB4, respectively. EDG-6 is referred to as lpC1, reflecting its more distant relationship to the other S1P receptors. EDG-3 is expressed at highest levels in the heart, kidney, placenta and liver of humans, with lower levels found in the lung. In mouse, highest levels are found in the heart, lung, kidney and spleen, with lower levels in the brain, thymus, muscle and testis. The receptor has also been found in rat Schwann cells, mouse embryonic brain and breast cancer cells. Binding of S1P to EDG-3 leads to activation of Gi and Gq classes of G proteins. G12 and G13 can also be constitutively activated by the receptor. These G proteins produce a range of effects, including inhibition or activation of adenylyl cyclase, MAP kinase activation, serum response element activation and phospholipase C activation, leading to cell proliferation and survival.
Protein Domain
Name: Potassium channel, inwardly rectifying, Kir3.2
Type: Family
Description: Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. 
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. Inwardly-rectifying potassium channels (Kir) are the principal class of two-TM domain potassium channels. They are characterised by the property of inward-rectification, which is described as the ability to allow large inward currents and smaller outward currents. Kir channels are responsible for regulating diverse processes including cellular excitability, vascular tone, heart rate, renal salt flow, and insulin release. To date, around twenty members of this superfamily have been cloned, which can be grouped into six families by sequence similarity, and these are designated Kir1.x-6.x. Cloned Kir channel cDNAs encode proteins of ~370-500 residues; both N- and C-termini are thought to be cytoplasmic, and the N terminus lacks a signal sequence. Kir channel alpha subunits possess only 2TM domains linked with a P-domain. Thus, Kir channels share similarity with the fifth and sixth domains, and P-domain of the other families. 
It is thought that four Kir subunits assemble to form a tetrameric channel complex, which may be hetero- or homomeric. The Kir3.x channel family is gated by G-proteins following G-protein coupled receptor (GPCR) activation. They are widely distributed in neuronal, atrial, and endocrine tissues and play key roles in generating late inhibitory postsynaptic potentials, slowing the heart rate and modulating hormone release. They are directly activated by G-protein beta-gamma subunits released from G-protein heterotrimers of the G(i/o) family upon appropriate receptor stimulation. Kir3.2 is thought to associate with Kir3.1 to form Kir channel heteromers in heart tissue. In central neurones, Kir3.2 homomers may exist, although these may contain combinations of the three splice variants of Kir3.2 that have been identified. Weaver mice, which suffer neurological and reproductive deficits, have a point mutation in the gene encoding Kir3.2. This lies in the pore-forming domain of the channel, and as a result the mutant channels lose their selectivity for K+, allowing Na+ to pass through the channel pore.
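The K+ selectivity sequence quoted in this description, (T/SxxTxGxG), can be read as a simple sequence pattern. The following is a minimal sketch in Python, assuming that "x" stands for any single residue and that "T/S" means either threonine or serine at the first position. The function name and both test fragments are hypothetical and purely illustrative; they are not taken from any database record.

```python
import re

# T/SxxTxGxG written as a regular expression:
# [TS] = Thr or Ser, "." = any single residue (assumed meaning of "x").
K_SELECTIVITY = re.compile(r"[TS]..T.G.G")

def find_selectivity_sequence(sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping pattern matches."""
    return [m.start() for m in K_SELECTIVITY.finditer(sequence.upper())]

# Hypothetical fragments, made up for illustration only.
intact_fragment = "AAATATTVGYGDAAA"
altered_fragment = "AAATATTVSYGDAAA"  # one conserved Gly replaced by Ser

print(find_selectivity_sequence(intact_fragment))
print(find_selectivity_sequence(altered_fragment))
```

A pattern of this kind only flags candidate positions; it says nothing about whether a matched region actually forms a pore loop.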
Protein Domain
Name: DnaJ homologue, subfamily C, member 28, conserved domain
Type: Domain
Description: This entry represents a family of proteins that may have a role in protein folding or as a chaperone. DnaJ is a member of the J-protein family, whose members are defined by the presence of a J domain that can regulate the activity of 70kDa heat-shock proteins. Some of the proteins in this entry contain a J domain.
Protein Domain
Name: Phycocyanin, alpha subunit
Type: Family
Description: This family represents the phycocyanin alpha subunit. Homologous phycobiliproteins of the phycobilisome include phycocyanin alpha chain and the allophycocyanin and phycoerythrin alpha and beta chains. Not included are the closely related phycoerythrocyanin alpha subunit sequences. Phycocyanin is the major phycobiliprotein in the phycobilisome (PBS) rod, including the light-harvesting photosynthetic bile pigment-protein and the tetrapyrrole chromophore-protein from the phycobiliprotein complex.
Protein Domain
Name: Spike glycoprotein, Alphacoronavirus
Type: Family
Description: Coronaviruses enter target cells through fusion of viral and cellular membranes mediated by the viral envelope glycoprotein S. Trimers of coronavirus S glycoprotein constitute the typical viral spikes. The precursor S protein is processed by host cell furin or furin-like protease to yield the mature S1 and S2 proteins. This entry represents the spike glycoprotein from Alphacoronavirus.
Protein Domain
Name: Tantalus-like
Type: Domain
Description: This entry consists of an alpha+beta fold domain found in metazoan proteins such as tantalus from Drosophila. Tantalus is a potential cofactor involved in sensory organ development. It binds the chromatin protein additional sex combs (Asx) and also binds DNA in vitro. Proteins containing this domain also include proline-rich protein 14 (PRR14) and related proteins from mammals.
Protein Domain
Name: Herpesvirus envelope glycoprotein N domain
Type: Domain
Description: This entry represents a conserved region found in a number of viral proteins: BLRF1, U46, 53, and UL73, collectively known as glycoprotein N. These UL73-like envelope glycoproteins, which associate in a high molecular mass complex with their counterpart protein gM, induce neutralizing antibody responses in the host. These glycoproteins are highly polymorphic, particularly in the N-terminal region.
Protein Domain
Name: Kinase associated domain 1 (KA1)
Type: Domain
Description: Members of the KIN2/PAR-1/MARK kinase subfamily are conserved from yeast to human and share the same domain organisation: an N-terminal kinase domain and a C-terminal kinase associated domain 1 (KA1). Some members of the KIN1/PAR-1/MARK family also contain a UBA domain. Members of this kinase subfamily are involved in various biological processes such as cell polarity, cell cycle control, intracellular signalling, microtubule stability and protein stability. The function of the KA1 domain is not yet known but several studies strongly suggest that it is involved in protein localisation. In addition, it has been reported that this C-terminal region acts as an autoinhibitory domain for the N-terminal kinase domain. Some proteins known to contain a KA1 domain are listed below: Mammalian MAP/microtubule affinity-regulating kinases (MARK 1,2,3). They regulate polarity in neuronal cell models and appear to function redundantly in phosphorylating MT-associated proteins and in regulating MT stability. Mammalian maternal embryonic leucine zipper kinase (MELK). It phosphorylates ZNF622 and may contribute to its redirection to the nucleus. It may be involved in the inhibition of spliceosome assembly during mitosis. Caenorhabditis elegans and Drosophila PAR-1 protein. It is required for establishing polarity in embryos where it is asymmetrically distributed. Fungal Kin1 and Kin2 protein kinases involved in regulation of exocytosis. They localise to the cytoplasmic face of the plasma membrane. The KA1 domain comprises about 50 amino acid residues and ends in the highly conserved Glu-Leu-Lys-Leu motif, termed the ELKL motif, which forms a concave surface surrounded by positively charged residues and is important for KA1 domain function. This domain adopts a compact α+β structure with a β-α-β-β-β-β-α topology.
Protein Domain
Name: Zinc finger, RING-H2-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This domain constitutes a conserved region found in proteins that participate in diverse functions relevant to chromosome metabolism and cell cycle control. The domain contains 8 cysteine/histidine residues which are proposed to be the conserved residues involved in zinc binding.
Protein Domain
Name: Dihydroorotase, conserved site
Type: Conserved_site
Description: This group contains a number of protein families; examples are archaeal and bacterial dihydroorotase (DHOase) and allantoinase. Dihydroorotase belongs to MEROPS peptidase family M38 (clan MJ), where it is classified as a non-peptidase homologue. DHOase catalyses the third step in the de novo biosynthesis of pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity. In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene pyrC). In higher eukaryotes, DHOase is part of a large multi-functional protein, known as 'rudimentary' in Drosophila melanogaster and CAD in mammals, which catalyzes the first three steps of pyrimidine biosynthesis. The DHOase domain is located in the central part of this polyprotein. In yeasts, DHOase is encoded by a monofunctional protein (gene URA4). However, a defective DHOase domain is found in a multifunctional protein (gene URA2) that catalyzes the first two steps of pyrimidine biosynthesis. The comparison of DHOase sequences from various sources shows that there are two highly conserved regions. The first, located in the N-terminal extremity, contains two histidine residues suggested to be involved in binding the zinc ion. The second is found in the C-terminal part. Members of this family of proteins are predicted to adopt a TIM barrel fold. Allantoinase is the enzyme that hydrolyzes allantoin into allantoate. In yeast (gene DAL1), it is the first enzyme in the allantoin degradation pathway; in amphibians and fish it catalyzes the second step in the degradation of uric acid. The sequence of allantoinase is evolutionarily related to that of DHOases.
Protein Domain
Name: Zinc finger, GRF-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This presumed zinc-binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to .
Protein Domain
Name: Phytocyanin domain
Type: Domain
Description: Among the blue copper proteins with a single type I (or "blue") mononuclear copper site, the plant-specific phytocyanins constitute a distinct subfamily that can be further subdivided into the families of uclacyanins, stellacyanins, plantacyanins, and early nodulins. Stellacyanins have a blue copper coordinated by two His, one Cys and one Gln. In plantacyanins and uclacyanins, the ligands of the type-I Cu sites are two His, one Cys and one Met. Early nodulins lack amino acid residues that coordinate Cu, so they are believed to be involved in unknown processes without binding Cu. Phytocyanins are found in chloroplasts of higher plants. The phytocyanin domain has a core of seven polypeptide strands arranged as a β-sandwich comprising two β-sheets, β-sheet I and β-sheet II. β-sheet I consists of three β-strands and β-sheet II consists of four β-strands. A disulfide bridge close to the metal centre is characteristic of phytocyanins, in contrast to azurins, pseudoazurins, and plastocyanins, where a disulfide bond is located on the distal side of the β-barrel. This disulfide bridge may play a crucial role in maintaining the tertiary structure of the protein and/or the formation of the copper binding centre because one of the His ligands of copper is followed directly by a bridging Cys residue. Some members of this family (P93328) may not bind copper due to the lack of key residues. Some proteins known to contain a phytocyanin domain are listed below: cucumber basic protein (CBP), spinach basic protein (SBP), cucumber stellacyanin (CST), zucchini mavicyanin and horseradish umecyanin. Some of the proteins in this family are allergens. The allergens in this family include allergens with the following designations: Amb a 3.
Protein Domain
Name: Zinc finger, C3HC4 RING-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The C3HC4 type zinc-finger (RING finger) is a cysteine-rich domain of 40 to 60 residues that coordinates two zinc ions, and has the consensus sequence: C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C where X is any amino acid. Many proteins containing a RING finger play a key role in the ubiquitination pathway.
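The C3HC4 consensus given in this description can be translated term by term into a regular expression. The sketch below is one possible reading, assuming "X" means any single residue and "X(m-n)" means between m and n such residues. The function name and the test sequence are hypothetical; the test sequence is synthetic and was built only to satisfy the pattern.

```python
import re

# Consensus from the description:
# C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C
RING_C3HC4 = re.compile(
    r"C.{2}C"      # C-X2-C
    r".{9,39}C"    # X(9-39)-C
    r".{1,3}H"     # X(1-3)-H
    r".{2,3}C"     # X(2-3)-C
    r".{2}C"       # X2-C
    r".{4,48}C"    # X(4-48)-C
    r".{2}C"       # X2-C
)

def find_ring_candidates(sequence: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of candidate matches (0-based, end exclusive)."""
    return [m.span() for m in RING_C3HC4.finditer(sequence.upper())]

# Synthetic sequence constructed to fit the consensus; not a real protein.
toy_sequence = (
    "MA" + "C" + "AA" + "C" + "A" * 10 + "C" + "A" + "H"
    + "AA" + "C" + "AA" + "C" + "A" * 5 + "C" + "AA" + "C" + "GG"
)

print(find_ring_candidates(toy_sequence))
```

A match is only a candidate: the consensus is permissive, so a hit does not by itself show that a region folds as a RING finger or binds zinc.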
Protein Domain
Name: Transcription regulator AsnC/Lrp, ligand binding domain
Type: Domain
Description: The many bacterial transcription regulation proteins which bind DNA through a 'helix-turn-helix' motif can be classified into subfamilies on the basis of sequence similarities. One such family is the AsnC/Lrp subfamily. The Lrp family of transcriptional regulators appears to be widely distributed among bacteria and archaea, as an important regulatory system of the amino acid metabolism and related processes. Members of the Lrp family are small DNA-binding proteins with molecular masses of around 15kDa. Target promoters often contain a number of binding sites that typically lack obvious inverted repeat elements, and to which binding is usually co-operative. LrpA from Pyrococcus furiosus is the first Lrp-like protein for which a three-dimensional structure has been solved. In the crystal structure LrpA forms an octamer consisting of four dimers. The structure revealed that the N-terminal part of the protein consists of a helix-turn-helix (HTH) domain, a fold generally involved in DNA binding. The C terminus of Lrp-like proteins has a β-fold, where the two α-helices are located at one side of the four-stranded antiparallel β-sheet. LrpA forms a homodimer mainly through interactions between the β-strands of this C-terminal domain, and an octamer through further interactions between the second α-helix and fourth β-strand of the motif. Hence, the C-terminal domain of Lrp-like proteins appears to be involved in ligand-response and activation. This entry represents the C-terminal regulatory ligand binding domain of the transcription regulator AsnC/Lrp. Structurally this domain has a dimeric alpha/beta barrel fold. This domain binds almost exclusively amino acids, but also 4-hydroxyphenylpyruvate and kynurenine (Matilla et al., FEMS Microbiology Reviews, fuab043, 45, 2021, 1. https://doi.org/10.1093/femsre/fuab043).
Protein Domain
Name: Photosystem II PsbM
Type: Family
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection. This entry represents the low molecular weight transmembrane protein PsbM found in PSII. PsbM is one of the most hydrophobic proteins in the thylakoid membrane. The function of this protein is unknown.
Protein Domain
Name: Pertactin, central region
Type: Domain
Description: Bordetella pertussis is a Gram-negative, aerobic coccobacillus that causes pertussis (whooping cough), especially in young children. Once present in the lungs, the bacterium attaches to ciliated pulmonary epithelial cells via a collection of outer membrane proteins, all of which are virulence factors. Pertactin, or P69 protein, is one of these virulence factors. Pertactin and filamentous haemagglutinin have been identified as Bordetella adhesins. Both proteins contain an Arg-Gly-Asp (RGD) motif that promotes binding to integrins, known to be important in cell mobility and development. The production of most Bordetella virulence factors (including pertactin) is controlled by a two-component signal transduction system, comprising the BvgA regulator and the BvgS sensor. Pertactin shares a high level of similarity with other Bordetella adhesins, such as BrkA. The protein is first produced as a 93kDa precursor. Upon secretion into the extracellular environment, a 30kDa domain at the C terminus remains in the outer membrane, while the mature 60.4kDa pertactin molecule is released. The crystal structure of mature pertactin has been determined to 2.5 Å resolution by means of X-ray diffraction. The fold is characterised by a 16-stranded parallel β-helix, with a V-shaped cross-section. Several between-strand amino-acid repeats form internal and external ladders. The helical structure is interrupted by several protruding loops that contain motifs associated with the activity of the protein. One such sequence - [GGXXP]5 - appears directly after the RGD motif, and may mediate interaction with epithelial cells. The C-terminal region of P.69 pertactin contains a [PQP]5 motif loop, which contains the major immunoprotective epitope. The superfamily also includes immunoglobulin A1 protease and adhesion penetration protein HAP.
Protein Domain
Name: Serine/threonine-protein kinase N, C2 domain
Type: Domain
Description: PKN is a lipid-activated serine/threonine kinase. It is a member of the protein kinase C (PKC) superfamily, but lacks a C1 domain. There are at least 3 different isoforms of PKN (PRK1/PKNalpha/PAK1; PKNbeta, and PRK2/PAK2/PKNgamma). The C-terminal region contains the Ser/Thr type protein kinase domain, while the N-terminal region of PKN contains three antiparallel coiled-coil (ACC) finger domains which are relatively rich in charged residues and contain a leucine zipper-like sequence. These domains bind to the small GTPase RhoA. Following these domains is a C2-like domain. Its C-terminal part functions as an auto-inhibitory region. PKNs are not activated by classical PKC activators such as diacylglycerol, phorbol ester or Ca2+, but instead are activated by phospholipids and unsaturated fatty acids. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded β-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this, including RIM isoforms and some splice variants of piccolo/aczonin and intersectin, which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions.
Protein Domain
Name: NHR2-like
Type: Domain
Description: Transcriptional activation and repression are required for control of cell proliferation and differentiation during embryonic development and homeostasis in the adult organism. Perturbations of these processes can lead to the development of cancer. The Eight-Twenty-One (ETO) gene product is able to form complexes with corepressors and deacetylases, such as nuclear receptor corepressor (N-CoR), which repress transcription when recruited by transcription factors. The ETO gene derives its name from its association with many cases of acute myelogenous leukaemia (AML), in which a reciprocal translocation, t(8;21), brings together a large portion of the ETO gene from chromosome eight and part of the AML1 gene from chromosome 21. The human ETO gene family currently comprises three major subfamilies: ETO/myeloid transforming gene on chromosome 8 (MTG8); myeloid transforming gene related protein-1 (MTGR1) and myeloid transforming gene on chromosome 16 (MTG16). ETO proteins are composed of four evolutionarily conserved domains termed nervy homology regions (NHR) 1-4. NHR1 is thought to stabilise the formation of high molecular weight complexes, but is not directly responsible for repressor activity. NHR2 and its flanking sequence comprise the core repressor domain, which mediates 50% of the wild-type repressor activity. Furthermore, there is evidence that the amphipathic helical structure of NHR2 promotes the formation of ETO/AML1 homodimers. NHR3 and NHR4 have been shown to act in concert to bind N-CoR. NHR4 contains two zinc finger motifs, which are thought to play a role in protein interactions rather than DNA binding. This entry represents the NHR2 (Nervy homology 2) domain found in ETO proteins. It mediates oligomerisation and protein-protein interactions, forming an α-helical tetramer.
Protein Domain
Name: Beta-secretase BACE2
Type: Family
Description: One of the major neuropathological hallmarks of Alzheimer's disease (AD) is the progressive formation in the brain of insoluble amyloid plaques and vascular deposits consisting of beta-amyloid protein (beta-APP) [ ]. Production of beta-APP requires proteolytic cleavage of the large type-1 transmembrane (TM) protein amyloid precursor protein (APP) []. This process is performed by a variety of enzymes known as secretases. To initiate beta-APP formation, beta-secretase cleaves APP to release a soluble N-terminal fragment (APPsBeta) and a C-terminal fragment that remains membrane bound. This fragment is subsequently cleaved by gamma-secretase to liberate beta-APP. Several independent studies identified a novel TM aspartic protease as the major beta-secretase [ , , ]. This protein, termed memapsin 2 or beta-site APP cleaving enzyme 1 (BACE1), shares 64% amino acid sequence similarity with a second enzyme, termed BACE2. Together, BACE1 and BACE2 define a novel family of aspartyl proteases []. Both enzymes share significant sequence similarity with other members of the pepsin family of aspartyl proteases and contain the two characteristic D(T/S)G(T/S) motifs that form the catalytic site. However, by contrast with other aspartyl proteases, BACE1 and BACE2 are type I TM proteins. Each protein comprises a large lumenal domain containing the active centre, a single TM domain and a small cytoplasmic tail. BACE2, also termed Asp1 and memapsin 1, was initially identified through Expressed Sequence Tag (EST) database searching. In vitro enzymatic assays with peptide substrates have demonstrated that BACE2 cleaves beta-secretase substrates in a similar fashion to BACE1 []. The BACE2 mRNA is expressed in the central nervous system and many peripheral tissues, although its expression level in neurons is substantially lower than that of BACE1 [].
Protein Domain
Name: PEBP-like superfamily
Type: Homologous_superfamily
Description: The PEBP (PhosphatidylEthanolamine-Binding Protein) superfamily is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, Drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development [ ], serine protease inhibition [], the control of the morphological switch between shoot growth and flower structures [], and the regulation of several signalling pathways such as the MAP kinase pathway [] and the NF-kappaB pathway []. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN). Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central β-sheet flanked by a smaller β-sheet on one side, and an α-helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.
Protein Domain
Name: Transcriptional repressor NF-X1, R3H domain
Type: Domain
Description: This entry includes the R3H domain of the X1 box binding protein (NF-X1) and related proteins. Human NF-X1 is a transcription factor that regulates the expression of class II major histocompatibility complex (MHC) genes [ , ]. The Drosophila homologue shuttle craft (STC) has been shown to be a DNA- or RNA-binding protein required for proper axon guidance in the central nervous system []. The R3H domain is a conserved sequence motif found in proteins from a diverse range of organisms including eubacteria, green plants, fungi and various groups of metazoans, but not in archaea and Escherichia coli. The domain is named R3H because it contains an invariant arginine and a highly conserved histidine that are separated by three residues. It also displays a conserved pattern of hydrophobic residues, prolines and glycines. It can be found alone, in association with an AAA domain or with various DNA/RNA binding domains like DSRM, KH, G-patch, PHD, DEAD box, or RRM. The functions of these domains indicate that the R3H domain might be involved in polynucleotide-binding, including DNA, RNA and single-stranded DNA []. The 3D structure of the R3H domain has been solved. The fold presents a small motif, consisting of a three-stranded antiparallel β-sheet, against which two α-helices pack from one side. This fold is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site [ ].
Protein Domain
Name: SWIB/MDM2 domain superfamily
Type: Homologous_superfamily
Description: The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation [ , , ]. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family has at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins [], and can be found fused to the C terminus of DNA topoisomerase in Chlamydia. This domain is also found in the Saccharomyces cerevisiae SNF12 protein, the eukaryotic initiation factor 2 (eIF2) [] and the Arabidopsis thaliana At1g31760 protein []. MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription [ ]. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded β-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 α-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain. The SWIB and MDM2 domains are homologous and share a common fold. The core of this domain is composed of four helices arranged in an open bundle, capped by two small 3-stranded β-sheets.
Protein Domain
Name: Photosystem II PsbM superfamily
Type: Homologous_superfamily
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [ , , ]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This entry represents the low molecular weight transmembrane protein PsbM found in PSII. PsbM is one of the most hydrophobic proteins in the thylakoid membrane. The function of this protein is unknown.
Protein Domain
Name: Glutaredoxin-like, plant II
Type: Family
Description: Glutaredoxins [, , ], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [ ]. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [ , , , ]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [ ] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro. This group of glutaredoxin-like proteins is apparently limited to plants. Multiple isoforms are found in Arabidopsis thaliana (Mouse-ear cress) and Oryza sativa (Rice).
Protein Domain
Name: MCM domain
Type: Domain
Description: Proteins shown to be required for the initiation of eukaryotic DNA replication share a highly conserved domain of about 210 amino-acid residues [ , , ]. The latter shows some similarities [] with that of various other families of DNA-dependent ATPases. Eukaryotes seem to possess a family of eight proteins that contain this domain. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called 'minichromosome maintenance proteins' with gene symbols prefixed by MCM. These proteins include: MCM2, also known as cdc19 (in S. pombe); MCM3, also known as DNA polymerase alpha holoenzyme-associated protein P1, RLF beta subunit or ROA; MCM4, also known as CDC54, cdc21 (in S. pombe) or dpa (in Drosophila); MCM5, also known as CDC46 or nda4 (in S. pombe); MCM6, also known as mis5 (in S. pombe); MCM7, also known as CDC47 or Prolifera (in A. thaliana); and MCM8, also known as REC (in Drosophila). These proteins are evolutionarily related and belong to the AAA+ superfamily. They contain the Mcm family domain, which includes motifs that are required for ATP hydrolysis (such as the Walker A and B, and R-finger motifs). Mcm2-7 forms a hexameric complex which is the replicative helicase involved in replication initiation and elongation, whereas Mcm8 and Mcm9 form a separate complex, conserved among many eukaryotes except yeast and C. elegans. The Mcm8/9 complex plays a role during replication elongation or recombination, being involved in the repair of double-stranded DNA breaks and DNA interstrand cross-links by homologous recombination. Drosophila is the only organism that has MCM8 without MCM9; there, MCM8 is involved in meiotic recombination [ , ].
Protein Domain
Name: Translation elongation factor EF1A, eukaryotic/archaeal
Type: Family
Description: Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome [ , , ]. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding [ , ]. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta). This entry represents EF1A proteins from eukaryotic (eEF1alpha) and archaeal (aEF1alpha) organisms, these proteins being more closely related to one another than to EF1A (or EF-Tu) from bacteria ( ). Archaeal EF1-alpha is not only involved in translation elongation. It interacts with Pelota, an mRNA surveillance protein involved in no-go mRNA decay and non-stop mRNA decay; and with RF1, a tRNA-mimicking protein which recognises stop codons and catalyses polypeptide-chain release. Through these interactions archaeal EF1-alpha also has a role in translational termination and mRNA surveillance pathways [ ].
Protein Domain
Name: Zinc finger, HypF-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Proteins of the HypF family are involved in the maturation and regulation of hydrogenase [ ]. In the N terminus they appear to have two zinc finger domains that are similar to those found in the DnaJ chaperone [].
Protein Domain
Name: Glutaredoxin-like protein, actinobacteria
Type: Family
Description: Glutaredoxins [ , , ], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [ ]. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox active CXXC motif in a TRX fold and uses a similar dithiol mechanism employed by TRXs for intramolecular disulfide bond reduction of protein substrates. Unlike TRX, GRX has preference for mixed GSH disulfide substrates, in which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [ , , , ]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [ ] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro. This family of glutaredoxin-like proteins is limited to the Actinobacteria and contains the conserved CxxC motif.
Protein Domain
Name: Zinc finger/thioredoxin putative
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a region, which contains a CXXCX(19)CXXC motif suggestive of both zinc fingers and thioredoxin, usually found at the N terminus of prokaryotic proteins. One partially characterised gene, agmX, is among a large set in Myxococcus whose interruption affects adventurous gliding motility [ ].
Protein Domain
Name: Zinc finger, TRAF-type, N-terminal
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This superfamily represents the N terminus of TRAF zinc finger domains, found in different proteins including mammalian signal transducers associated with the cytoplasmic domain of the 75 kDa tumour necrosis factor receptor [ ]. No information is available on the specific function of this N-terminal subdomain.
Protein Domain
Name: Biogenesis of lysosome-related organelles complex 1 subunit 5
Type: Family
Description: Lysosomes are membrane-bound organelles found in animals that are involved in degradation of endogenous and exogenous macromolecules [ ]. Lysosome-related organelles occur in specific cell types and fulfil specialised functions, e.g. melanosomes, which synthesise and store pigments in higher eukaryotes. Lysosome biogenesis is linked to the secretory and endocytic pathways for protein and lipid trafficking. One of the protein complexes involved in this process is biogenesis of lysosome-related organelles complex 1 (BLOC-1). This complex consists of eight distinct subunits: dysbindin, pallidin, muted, snapin, cappuccino, and BLOS1-3 subunits [ ]. Apart from BLOS3 all of these subunits are predicted to form coiled-coil structures. BLOC-1 can be found in the cytosol and also associated with membranes. Protein interaction studies suggest that this complex interacts with a number of proteins including syntaxin, filamentous actin, dystrobrevin and myospryn, though the molecular function of this complex is not yet known. Mutations in BLOC-1 subunits are associated with Hermansky-Pudlak syndrome, a disorder characterised by deficiencies in melanosomes, platelet-dense granules and other lysosome-related organelles. Cells that lack lysosome-related organelles express BLOC-1 but do not appear to need it for lysosome biosynthesis. In concert with the AP-3 complex, the BLOC-1 complex is required to target membrane protein cargos into vesicles assembled at cell bodies for delivery into neurites and nerve terminals []. This group represents the subunit 5 (also known as the muted subunit) of BLOC-1 [ ]. In mice, defects in this protein are the cause of the Muted (mu) mutant, which is characterised by light eyes at birth, hypopigmentation of the coat, platelet storage pool deficiency and lysosomal hyposecretion [].
Protein Domain
Name: Peptidase S26A, signal peptidase I, lysine active site
Type: Active_site
Description: Signal peptidases (SPases) [ ] (also known as leader peptidases) remove the signal peptides from secretory proteins. In prokaryotes three types of SPases are known: type I (gene lepB), which is responsible for the processing of the majority of exported pre-proteins; type II (gene lsp), which only processes lipoproteins; and a third type involved in the processing of pili subunits. SPase I (EC 3.4.21.89) is an integral membrane protein that is anchored in the cytoplasmic membrane by one (in Bacillus subtilis) or two (in Escherichia coli) N-terminal transmembrane domains with the main part of the protein protruding in the periplasmic space. Two residues have been shown [ , ] to be essential for the catalytic activity of SPase I: a serine and a lysine. SPase I is evolutionarily related to the yeast mitochondrial inner membrane protease subunit 1 and 2 (genes IMP1 and IMP2), which catalyse the removal of signal peptides required for the targeting of proteins from the mitochondrial matrix, across the inner membrane, into the inter-membrane space []. In eukaryotes the removal of signal peptides is effected by an oligomeric enzymatic complex composed of at least five subunits: the signal peptidase complex (SPC). The SPC is located in the endoplasmic reticulum membrane. Two components of mammalian SPC, the 18 kDa (SPC18) and the 21 kDa (SPC21) subunits, as well as the yeast SEC11 subunit have been shown [ ] to share regions of sequence similarity with prokaryotic SPases I and yeast IMP1/IMP2. This entry represents the putative active site lysine located in S26 peptidases (SPase I and IMP1/2). This active site lysine is not conserved in the SPC subunits.
Protein Domain
Name: Timeless
Type: Family
Description: The timeless gene in Drosophila melanogaster is involved in circadian rhythm control [ ]. Drosophila contains two paralogs, dTIM and dTIM2, acting in clock/photoreception and chromosome integrity/photoreception respectively. The mammalian TIMELESS (TIM) protein, originally identified based on its similarity to Drosophila dTIM, interacts with the clock proteins dCRY and dPER and is essential for circadian rhythm generation and photo-entrainment in the fly []. However, phylogenetic sequence analysis has demonstrated that dTIM2 is likely to be the orthologue of mammalian TIM and other widely conserved TIM-like proteins in eukaryotes []. These proteins include Saccharomyces cerevisiae Tof1, Schizosaccharomyces pombe Swi1, and Caenorhabditis elegans TIM. These proteins are not involved in the core clock mechanism, but instead play important roles in chromosome integrity, efficient cell growth and/or development [, ], with the exception of dTIM2, which has an additional function in retinal photoreception []. Saccharomyces cerevisiae Tof1 is a subunit of a replication-pausing checkpoint complex (Tof1-Mrc1-Csm3) that acts at the stalled replication fork to promote sister chromatid cohesion after DNA damage, facilitating gap repair of damaged DNA [ , ]. Schizosaccharomyces pombe Swi1 and Swi3 form the fork protection complex that coordinates leading- and lagging-strand synthesis and stabilizes stalled replication forks []. In humans, Timeless forms a stable complex with its partner protein Tipin. The Timeless-Tipin complex has been reported to travel along with the replication fork during unperturbed DNA replication. Moreover, the Timeless-Tipin-Claspin complex contributes to full activation of the ATR-Chk1 signaling pathway through the recruitment of Chk1 to arrested replication forks for sufficient ATR-mediated phosphorylation. It also interacts with PARP-1, and this interaction is required for efficient homologous recombination repair [ ].
Protein Domain
Name: Epithelial sodium channel, conserved site
Type: Conserved_site
Description: The apical membrane of many tight epithelia contains sodium channels that are primarily characterised by their high affinity to the diuretic blocker amiloride [ , , , ]. These channels mediate the first step of active sodium reabsorption essential for the maintenance of body salt and water homeostasis []. In vertebrates, the channels control reabsorption of sodium in kidney, colon, lung and sweat glands; they also play a role in taste perception. Members of the epithelial Na+ channel (ENaC) family fall into four subfamilies, termed alpha, beta, gamma and delta []. The proteins exhibit the same apparent topology, each with two transmembrane (TM) spanning segments, separated by a large extracellular loop. In most ENaC proteins studied to date, the extracellular domains are highly conserved and contain numerous cysteine residues, with flanking C-terminal amphipathic TM regions, postulated to contribute to the formation of the hydrophilic pores of the oligomeric channel protein complexes. It is thought that the well-conserved extracellular domains serve as receptors to control the activities of the channels. Vertebrate ENaC proteins are similar to degenerins of Caenorhabditis elegans []: deg-1, del-1, mec-4, mec-10 and unc-8. These proteins can be mutated to cause neuronal degradation, and are also thought to form sodium channels. Structurally, the proteins that belong to this family consist of about 510 to 920 amino acid residues. They are made of an intracellular N terminus region followed by a transmembrane domain, a large extracellular loop, a second transmembrane segment and a C-terminal intracellular tail []. For this entry the signature corresponds to the beginning of a conserved cysteine-rich region (there are nine conserved cysteines in a domain of about 65 residues), located at the C-terminal part of the extracellular loop.
Protein Domain
Name: Tuftelin interacting protein, N-terminal domain
Type: Domain
Description: This domain is found in septin and tuftelin-interacting protein (STIP) and tuftelin-interacting protein 11 (TFIP11). STIP is essential for embryogenesis in Caenorhabditis elegans [ ]. Tuftelin-interacting protein 11 is a component of the spliceosome involved in spliceosome disassembly []. TFIP11 was first identified in a yeast two-hybrid screening as a protein interacting with tuftelin, one of the presumed enamel matrix proteins [].
Protein Domain
Name: Type IV secretion system, VirB5
Type: Family
Description: This entry contains VirB5, a protein that is involved in the type IV DNA secretion systems typified by the Agrobacterium Ti plasmid vir system where it interacts with several other proteins essential for proper pilus formation [ ]. VirB5 is homologous to the IncN (N-type) conjugation system protein TraC [] as well as the P-type protein TrbJ and the F-type protein TraE [].
Protein Domain
Name: Apolipophorin-III
Type: Family
Description: This family consists of several insect apolipoprotein-III sequences. Exchangeable apolipoproteins constitute a functionally important family of proteins that play critical roles in lipid transport and lipoprotein metabolism. Apolipophorin III (apoLp-III) is a prototypical exchangeable apolipoprotein found in many insect species that functions in transport of diacylglycerol (DAG) from the fat body lipid storage depot to flight muscles in the adult life stage [].
Protein Domain
Name: Liprin-alpha, SAM domain repeat 3
Type: Domain
Description: SAM (sterile alpha motif) domain repeat 3 of liprin-alpha proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. They participate in mammary gland development and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone [ , ].
Protein Domain
Name: Caskin1/2, SAM repeat 1
Type: Domain
Description: This is the SAM (sterile alpha motif) domain repeat 1 of caskin 1 and caskin 2 proteins; it is a protein-protein interaction domain. Caskin has two tandem SAM domains. Caskin protein is known to interact with membrane-associated guanylate kinase CASK, and may play a role in neural development, synaptic protein targeting, and regulation of gene expression [ , , ].
Protein Domain
Name: Caskin1/2, SAM repeat 2
Type: Domain
Description: This is the SAM (sterile alpha motif) domain repeat 2 of caskin 1 and caskin 2 proteins; it is a protein-protein interaction domain. Caskin has two tandem SAM domains. Caskin protein is known to interact with membrane-associated guanylate kinase CASK, and may play a role in neural development, synaptic protein targeting, and regulation of gene expression [ , , ].
Protein Domain
Name: Liprin-alpha, SAM domain repeat 2
Type: Domain
Description: SAM (sterile alpha motif) domain repeat 2 of liprin-alpha proteins is a protein-protein interaction domain. Liprin-alpha proteins contain three copies (repeats) of the SAM domain. They may form heterodimers with liprin-beta proteins through their SAM domains. They participate in mammary gland development and in axon guidance; in particular, liprin-alpha is involved in formation of the presynaptic active zone [ , ].
Protein Domain
Name: Signal-induced proliferation-associated 1-like protein, C-terminal
Type: Domain
Description: This domain is found in the C-terminal region of the signal-induced proliferation-associated 1-like (SIPA1L) proteins, including SIPA1L1-3 from humans. SIPA1Ls share protein sequence similarity with signal-induced proliferation-associated protein 1 (SIPA1), which is a GTPase activator for the nuclear Ras-related regulatory proteins Rap1 and Rap2 [ , ]. In rats SIPA1L1 (also known as SPAR) has been identified as a Rap-specific GTPase-activating protein [].
Protein Domain
Name: Fungal transcription factor
Type: Family
Description: Proteins in this family are transcription factors typically found in fungi, including acriflavine sensitivity control protein [ ], arginine metabolism regulation protein II [], lysine biosynthesis regulatory protein LYS14 [], and others []. DibT is part of the gene cluster that mediates the biosynthesis of pestalotiollide B, a member of the dibenzodioxocinones, a novel class of inhibitors of cholesterol ester transfer protein (CETP) [, ].
Protein Domain
Name: Peripherally associated ATOM36
Type: Family
Description: This entry represents the trypanosome peripherally associated ATOM36 protein (pATOM36) which is an essential component of the outer mitochondrial membrane protein import system that interacts with ATOM (archaic translocase of the outer mitochondrial membrane), being involved in the assembly and/or membrane insertion of proteins included in the ATOM complex. This protein also promotes protein translocation into the mitochondrial matrix [ ].
Protein Domain
Name: Peptidase S54, rhomboid, metazoan
Type: Family
Description: This entry represents metazoan rhomboid proteins ( ), integral membrane proteins thought to be involved in regulated intra-membrane proteolysis and the subsequent release of functional polypeptides from their membrane anchors [ ]. Rhomboid proteins cleave type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains. These proteins belong to the S54 peptidase family of proteins.
Protein Domain
Name: Prolactin-releasing peptide receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [ , , ]. Hypothalamic peptide hormones regulate secretion of anterior pituitary hormones, such as growth hormone, follicle stimulating hormone, luteinising hormone and thyrotropin.
A novel bioactive peptide has been identified from bovine hypothalamus and found to increase prolactin secretion from the anterior pituitary [ ]. This peptide - prolactin-releasing peptide (PrRP) - is a member of the structurally related RF-amide family, which includes neuropeptide FF []. The peptide exists in two forms: a 31-amino acid form and a truncated 20-amino acid form []. PrRP has been found in the medulla oblongata, hypothalamus and pituitary, as well as in a number of other tissues. This distribution suggests the peptide may have other roles in addition to prolactin release []. The receptor for PrRP was identified as an orphan receptor previously known as GPR10 [ ]. This receptor is expressed in the central nervous system with highest levels in the pituitary. Expression has also been detected in the cerebellum, brainstem, hypothalamus, thalamus and spinal cord in rat []. Binding of PrRP to the receptor results in activation of extracellular signal-related kinase (ERK) in a mainly pertussis toxin sensitive manner, suggesting coupling to Gi/o proteins []. PrRP can also cause increases in intracellular calcium and activation of c-Jun N-terminal protein kinase (JNK) in a pertussis toxin insensitive manner, indicating that the receptor can also couple to Gq proteins, depending on the cell type in which it is expressed [].
Protein Domain
Name: Concanavalin A-like lectin/glucanase domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents the concanavalin A-like domain, which has a sandwich structure of 12-14 β-strands in two sheets with a complex topology. Proteins containing this domain include: Legume lectins; Glycosyl hydrolases family 16; Galectin (animal S-lectin); Laminin G-like module; Pentraxin; Clostridium neurotoxins; Exotoxin A; Vibrio cholerae sialidase; Leech intramolecular trans-sialidase; Glycosyl hydrolase family 7; Xylanase/endoglucanase; Calnexin/calreticulin; Lectin leg-like; vp4 sialic acid binding protein; Trypanosoma sialidase; Thrombospondin; Hypothetical protein YesU; Alginate lyase; Glycosyl hydrolases family 32; Peptidase A4; Alpha-L-arabinofuranosidase B; SPRY domain containing proteins; Beta-D-xylosidase; SO2946-like; MAM domain containing proteins.
Protein Domain
Name: Pilus formation type IVB, outer membrane PilN
Type: Family
Description: Several related protein families encode outer membrane pore proteins for type II secretion, type III secretion, and type IV pilus formation. Proteins in this entry appear to be secretins for pilus formation, although they are quite different from PilQ. Members include the PilN lipoprotein of the plasmid R64 thin pilus, a type IV pilus. Some proteins may be examples of bundle-forming pilus B (bfpB).
Protein Domain
Name: LBH domain
Type: Domain
Description: LBH (limb-bud and heart) protein may act as a transcriptional activator in the mitogen-activated protein kinase signalling pathway to mediate cellular functions. It has been shown to regulate cardiac gene expression by modulating the combinatorial activities of key cardiac transcription factors, as well as their individual functions in cardiogenesis in mice [ ]. Proteins containing this domain include Protein LBH and LBH domain-containing protein 1 (LBHD1) from humans.
Protein Domain
Name: Replication P
Type: Family
Description: This family consists of several Bacteriophage lambda replication protein P-like proteins. The bacteriophage lambda P protein promotes replication of the phage chromosome by recruiting a key component of the cellular replication machinery to the viral origin. Specifically, P protein delivers one or more molecules of Escherichia coli DnaB helicase to a nucleoprotein structure formed by the lambda O initiator at the lambda replication origin [ ].
Protein Domain
Name: Glycine cleavage system T protein, bacteria
Type: Family
Description: The glycine cleavage system (GCS) is a multienzyme system composed of proteins P, H, T, and L, that catalyses the reversible oxidation of glycine. The T protein is an aminomethyl transferase that catalyses the following reaction: (6S)-tetrahydrofolate + S-aminomethyldihydrolipoylprotein = (6R)-5,10-methylenetetrahydrofolate + NH3 + dihydrolipoylprotein. The glycine cleavage system is found in bacteria and the mitochondria of eukaryotes. This entry represents the T-protein from bacteria [].
Protein Domain
Name: CstA, N-terminal domain
Type: Domain
Description: Escherichia coli induces the synthesis of at least 30 proteins at the onset of carbon starvation, two-thirds of which are positively regulated by the cyclic AMP (cAMP) and cAMP receptor protein (CRP) complex. Proteins in this entry include carbon starvation protein CstA, which is a predicted membrane protein that may be involved in peptide utilisation []. This entry represents the N-terminal domain of CstA.
Protein Domain
Name: UPF0313, N-terminal
Type: Domain
Description: This domain tends to occur N-terminal to the radical SAM domain in hypothetical bacterial proteins. Proteins in this entry are radical SAM proteins that catalyse diverse reactions, including unusual methylations, isomerization, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation. Evidence exists that these proteins generate a radical species by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S centre [ , ].
Protein Domain
Name: Phosphoprotein, C-terminal domain, viral
Type: Homologous_superfamily
Description: This superfamily represents the C-terminal domain of the phosphoprotein from vesiculoviruses, which are ssRNA negative-strand rhabdoviruses. It is known as the phosphoprotein or P protein [ , ]. This protein may be part of the RNA dependent RNA polymerase complex []. The phosphorylation states of this protein may regulate the transcription and replication complexes []. Structurally, the C-terminal domain consists of 2 strands and 5 helices.
Protein Domain
Name: OPI10 family
Type: Family
Description: OPI10 family members include protein Hikeshi from vertebrates and protein Opi10 from yeasts. Hikeshi protein acts as a specific nuclear import carrier for HSP70 proteins following heat-shock stress; it acts by mediating the nucleoporin-dependent translocation of ATP-bound HSP70 proteins into the nucleus [ ]. Budding yeast Opi10 is a repressor of the phospholipid biosynthetic genes and specifically binds PA in the endoplasmic reticulum [ ].
Protein Domain
Name: Pup--protein ligase
Type: Family
Description: Pupylation is a novel protein modification system found in some bacteria [ ]. This entry represents a family of proteins involved in this system. Pup ligases, such as PafA, conjugate the prokaryotic ubiquitin-like protein Pup to lysine residues in target proteins, marking them for degradation by the proteasome []. It has been suggested that proteins in this entry are related to gamma-glutamyl-cysteine synthetases [].
Protein Domain
Name: Chromophore lyase CpcT/CpeT
Type: Family
Description: This entry represents the CpcT/CpeT biliprotein lyase, which has been shown to covalently attach chromophores to cysteine residue(s) of phycobiliproteins [ , ]. These proteins contain a conserved motif PYR in the amino terminal half of the protein that may be functionally important. In the chromatically adapting cyanobacterium Fremyella diplosiphon, the proteins have been shown to be induced by green light, as part of the cpeCDESTR operon [].
Protein Domain
Name: PSRP-3/Ycf65 superfamily
Type: Homologous_superfamily
Description: This small acidic protein is found in the 30S ribosomal subunit of cyanobacteria and plant plastids. In plants it has been named plastid-specific ribosomal protein 3 (PSRP-3), and in cyanobacteria it is named Ycf65. Plastid-specific ribosomal proteins may mediate the effects of nuclear factors on plastid translation. The acidic PSRPs are thought to contribute to protein-protein interactions in the 30S subunit, and are not thought to bind RNA [].
Protein Domain
Name: Nucleocapsid protein, arenaviridae
Type: Family
Description: Arenaviridae are single stranded RNA viruses. The arenaviridae S RNAs that have been characterised include conserved terminal sequences, an ambisense arrangement of the coding regions for the precursor glycoprotein (GPC) and nucleocapsid (N) proteins and an intergenic region capable of forming a base-paired "hairpin" structure. The mature glycoproteins that result are G1 and G2 and the N protein [ ]. This family represents the nucleocapsid protein that encapsulates the viral ssRNA [ ].
Protein Domain
Name: NUCL, RNA recognition motif 1
Type: Domain
Description: This entry represents the RNA recognition motif 1 (RRM1) of a group of plant nucleolin-like proteins, including nucleolin 1 (also termed protein nucleolin like 1) and nucleolin 2 (also termed protein nucleolin like 2, or protein parallel like 1). They play roles in the regulation of ribosome synthesis and in the growth and development of plants [ , ]. Like yeast nucleolin, nucleolin-like proteins possess two RNA recognition motifs (RRMs).
Protein Domain
Name: NUCL, RNA recognition motif 2
Type: Domain
Description: This entry represents the RNA recognition motif 2 (RRM2) of a group of plant nucleolin-like proteins, including nucleolin 1 (also termed protein nucleolin like 1) and nucleolin 2 (also termed protein nucleolin like 2, or protein parallel like 1). They play roles in the regulation of ribosome synthesis and in the growth and development of plants [ , ]. Like yeast nucleolin, nucleolin-like proteins possess two RNA recognition motifs (RRMs).
Protein Domain
Name: Bacteriophage P2, LysB
Type: Family
Description: This entry is represented by Bacteriophage P2, LysB. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. Members of this protein family are phage lysis regulatory proteins, including the well-studied protein LysB (lysis protein B) of Enterobacteria phage P2 (Bacteriophage P2). For members of this family, genes are found in phage or in prophage regions of bacterial genomes, typically near a phage lysozyme or phage holin.
Protein Domain
Name: Yippee family
Type: Family
Description: The Yippee-like (YPEL) family proteins share homology to Drosophila Yippee, a zinc finger binding protein. YPEL proteins are found in essentially all eukaryotes and hence they must play important roles in the maintenance of life. Subcellular localization of all YPEL proteins to the centrosomes and the mitotic apparatus suggests a role in mitosis-associated functions. YPEL proteins contain a Yippee domain, which is a putative zinc-finger-like, metal-binding domain [ , , , ].
Protein Domain
Name: FlgO domain
Type: Domain
Description: This entry represents a domain found in the FlgO protein. Mutation of this protein in Vibrio cholerae has been shown to reduce motility. FlgO is an outer membrane protein that localises throughout the membrane and not at the flagellar pole. Although FlgO and FlgP do not specifically localize to the flagellum, they are required for flagellar stability. Proteins in this family mostly contain an N-terminal lipoprotein attachment motif [ ].
Protein Domain
Name: Cytochrome b-c1 complex, subunit 6
Type: Family
Description: The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex [ ]. The bc1 complex contains 11 subunits: 3 respiratory subunits (cytochrome B, cytochrome C1, Rieske protein), 2 core proteins and 6 low molecular weight proteins. This family represents the 'hinge' protein of the complex, subunit 6, which is thought to mediate formation of the cytochrome c1 and cytochrome c complex. Proteins in this entry form an α-helical hairpin.
Protein Domain
Name: Agenet domain, plant type
Type: Domain
Description: This entry represents an agenet domain found in EMSY-like (AtEML) proteins, which have possible roles in chromatin regulation and are related to the BRCA2-interacting human oncoprotein EMSY [ ]. Proteins containing this domain also include MRG2 (AT1G02740) from Arabidopsis and PHD finger protein 20-like protein 1 (PHF20L1) from animals. MRG2 binds to the FLOWERING LOCUS T locus and elevates the expression in an H3K36me3-dependent manner []. The function of PHF20L1 is not clear.
Protein Domain
Name: Outer membrane protein, bacterial
Type: Domain
Description: Most of the bacterial outer membrane proteins in this group are porin-like integral membrane proteins (such as ompA) [ ], but some are small lipid-anchored proteins (such as pal) []. They are present in the outer membrane of many Gram-negative organisms []. The C-terminal half of these proteins is well conserved. The N-terminal half is variable, although some of the proteins in this group have the OmpA-like transmembrane domain at the N terminus.
Protein Domain
Name: Undecaprenyl-diphosphatase UppP
Type: Family
Description: This is a family of small, highly hydrophobic proteins. Over-expression of this protein in Escherichia coli is associated with bacitracin resistance [ ], and the protein was originally proposed to be an undecaprenol kinase called bacA. BacA protein, however, does not show undecaprenol phosphokinase activity []. It is now known to be an undecaprenyl pyrophosphate phosphatase () and is renamed UppP. It is not the only protein associated with bacitracin resistance [ , ].
Protein Domain
Name: Ecto-NOX disulfide-thiol exchanger, RNA recognition motif
Type: Domain
Description: This entry represents the conserved RNA recognition motif (RRM) in ECTO-NOX proteins (ENOX). ENOX proteins are growth-related cell surface proteins that catalyse both hydroquinone or NADH oxidation and protein disulfide-thiol interchange [ ]. The two enzymatic activities oscillate with a period length of 24 minutes and play a role in control of the ultradian cellular biological clock [, ]. ENOX proteins may play roles in cancer, cellular time-keeping, growth, aging and neurodegenerative diseases [].
Protein Domain
Name: Chordopoxvirus A30L
Type: Family
Description: This family consists of several short Chordopoxvirus proteins which are homologous to the A30L protein of Vaccinia virus. The vaccinia virus A30L protein is required for the association of electron-dense, granular, proteinaceous material with the concave surfaces of crescent membranes, an early step in viral morphogenesis. A30L is known to interact with the G7L protein and it has been shown that the stability of each is dependent on its association with the other [ ].
Protein Domain
Name: Clc protein-like
Type: Family
Description: Clc proteins are a nine-member gene family of chloride channels that have diverse roles in the plasma membrane and in intracellular organelles, especially membrane excitability and the maintenance of osmotic balance [ , ]. These proteins have been associated with a variety of human diseases ranging from degeneration of the retina to lung cancer and epilepsy [, ]. This protein family includes Clc-like protein 2/5 from Caenorhabditis elegans and similar proteins from animals.
Protein Domain
Name: Microtubule associated protein, tubulin-binding repeat
Type: Repeat
Description: Microtubules consist of tubulins as well as a group of additional proteins collectively known as the Microtubule Associated Proteins (MAPs). MAPs have been classified into two classes: high molecular weight MAPs and Tau protein. The Tau proteins promote microtubule assembly and stabilise microtubules. The C-terminal region of these proteins contains three or four tandem repeats of about thirty amino acid residues, which are implicated in tubulin binding and which seem to have a stiffening effect on microtubules.
Protein Domain
Name: Hydrophobin
Type: Family
Description: The surface of many fungal spores is covered by a hydrophobic sheath, the rodlet layer, whose main component is a protein known as the rodlet protein [, ]. The rodlet proteins of Neurospora crassa (gene eas) and Emericella nidulans (gene rodA) are evolutionarily related to proteins found in the cell wall of fruiting bodies of the mushroom Schizophyllum commune (Bracket fungus) [ ]. Collectively, these low-molecular-weight, cysteine-rich (eight conserved cysteines), hydrophobic proteins are known as hydrophobins.
Protein Domain
Name: Probable [Fe-S]-dependent transcriptional repressor
Type: Family
Description: Bacteria commonly utilise a unique type of transporter, called Feo, to specifically acquire the ferrous (Fe2+) form of iron from their environment. Enterobacterial Feo systems are composed of three proteins: FeoA, a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor of the feoABC operon []. This entry represents the FeoC protein.
Protein Domain
Name: Phage tail protein-like superfamily
Type: Homologous_superfamily
Description: This entry represents bacteriophage lambda GpU, a minor tail protein. GpU plays an essential role in tail assembly by capping the rapidly polymerizing tail once it has reached its requisite length and serving as the interaction surface for the completion protein [ ]. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This entry also includes some uncharacterised proteins from bacteria, such as Gp37 and putative cytoplasmic protein STM4215.
Protein Domain
Name: Tom37, C-terminal domain
Type: Domain
Description: The TOM37 protein is one of the outer membrane proteins that make up the TOM complex for guiding cytosolic mitochondrial β-barrel proteins from the cytosol across the outer mitochondrial membrane into the intermembrane space. In conjunction with TOM70 it guides peptides without an MTS into TOM40, the protein that forms the passage through the outer membrane [ ]. It has homology with Metaxin-1, also part of the outer mitochondrial membrane β-barrel protein transport complex [].
Protein Domain
Name: IMS import disulfide relay-system, CHCH-CHCH-like Cx9C
Type: Domain
Description: Cx9C is the first half of a twin Cx9C motif in eukaryotic proteins. The function of this motif is to import nuclear-encoded mitochondrial intermembrane space (IMS) proteins into the IMS, as these proteins lack a mitochondrial targeting sequence. The Cx9C proteins have a disulfide-bonded alpha-hairpin conformation. Cx9C-containing proteins are thus putative substrates for the Mia40-dependent thiol-disulfide exchange mechanism that carries out an oxidative folding process resulting in the proteins being trapped in the IMS [ ].
Protein Domain
Name: PdxT/SNO family, conserved site
Type: Conserved_site
Description: The term vitamin B6 is used to refer collectively to the compound pyridoxine and its vitameric forms, pyridoxal, pyridoxamine, and their phosphorylated derivatives. Vitamin B6 is required by all organisms and plays an essential role as a co-factor for enzymatic reactions. Plants, fungi, bacteria, archaebacteria, and protists synthesise vitamin B6. Animals and some highly specialised obligate pathogens obtain it nutritionally. Vitamin B6 has two distinct biosynthetic pathways, which do not coexist in any organism. The pdxA/pdxJ pathway, which has been extensively characterised in Escherichia coli, is found in the gamma subdivision of the proteobacteria. A second pathway of vitamin B6 synthesis involving the pdxS/SNZ and pdxT/SNO protein families, which are completely unrelated in sequence to the pdxA/pdxJ proteins, is found in plants, fungi, protists, archaebacteria, and most bacteria. PdxS/SNZ and pdxT/SNO proteins form a complex which serves as a glutamine amidotransferase to supply ammonia as a source of the ring nitrogen of vitamin B6 [ ]. PdxT/SNO and pdxS/SNZ appear to encode respectively the glutaminase subunit, which produces ammonia from glutamine, and the synthase subunit, which combines ammonia with five- and three-carbon phosphosugars to form vitamin B6 [ ]. The pdxT/SNO family belongs to the triad glutamine amidotransferase fold, characterised by a conserved Cys-His-Glu active site [ ].
Two regions are highly conserved across all taxa, the PGGEST motif and the FHPE(LT) motif [ ]. PdxT/SNO proteins are an alpha/beta three-layer sandwich containing a seven-stranded twisted mixed parallel β-sheet flanked by six α-helices on the N-terminal stretch of the sheet, four on one side and two on the other [ ]. Proteins belonging to the pdxT/SNO family include: Bacillus subtilis glutamine amidotransferase subunit pdxT; Haemophilus influenzae glutamine amidotransferase subunit pdxT; Methanococcus jannaschii glutamine amidotransferase subunit pdxT; yeast probable glutamine amidotransferase SNO1; yeast probable glutamine amidotransferase SNO2; yeast probable glutamine amidotransferase SNO3. These are hydrophilic proteins of about 19 to 25 kDa. This entry represents a conserved site containing the PGGEST motif.
Protein Domain
Name: Photosystem I Ycf4, assembly
Type: Family
Description: Photosystem I (PSI) is a large protein complex embedded within the photosynthetic thylakoid membrane formed by a core complex, peripheral light-harvesting complexes (LHCIs) and cofactors. It consists of 15 core and 4 LHCI subunits, ~150 chlorophyll (a and b) molecules, 2 phylloquinones, and 3 Fe4S4-clusters [ ]. The three dimensional structure of the PSI complex has been resolved at 2.5 Å [], which allows the precise localisation of each cofactor. PSI together with photosystem II (PSII) catalyses the light-induced steps in oxygenic photosynthesis - a process found in cyanobacteria, eukaryotic algae (e.g. red algae, green algae) and higher plants. To date, three thylakoid proteins involved in the stable accumulation of PSI have been identified: BtpA ( ) [ ], Ycf3 [, ], and Ycf4 []. Because translation of the psaA and psaB mRNAs encoding the two reaction centre polypeptides of PSI is not affected in mutant strains lacking functional ycf3 and ycf4, the products of these two genes appear to act at a post-translational step of PSI biosynthesis. The BtpA protein appears to act at the level of PSI stabilisation []. It is an extrinsic membrane protein located on the cytoplasmic side of the thylakoid membrane [, ]. Homologs of BtpA are found in the crenarchaeota and euryarchaeota, where their function remains unknown. The Ycf4 protein is firmly associated with the thylakoid membrane, presumably through a transmembrane N-terminal domain [, ]. Together with Ycf3, it forms a core PSI assembly apparatus as an auxiliary factor. Ycf3 is a TPR-containing protein loosely associated with the thylakoid membrane and, with its interacting partner Y3IP1, facilitates the assembly of reaction centre subunits. Ycf4 facilitates the integration of peripheral PSI subunits and LHCIs into the PSI reaction centre subcomplex [ ].
Protein Domain
Name: Photosystem I Ycf3, assembly
Type: Family
Description: Photosystem I (PSI) is a large protein complex embedded within the photosynthetic thylakoid membrane formed by a core complex, peripheral light-harvesting complexes (LHCIs) and cofactors. It consists of 15 core and 4 LHCI subunits, ~150 chlorophyll (a and b) molecules, 2 phylloquinones, and 3 Fe4S4-clusters [ ]. The three dimensional structure of the PSI complex has been resolved at 2.5 Å [], which allows the precise localisation of each cofactor. PSI together with photosystem II (PSII) catalyses the light-induced steps in oxygenic photosynthesis - a process found in cyanobacteria, eukaryotic algae (e.g. red algae, green algae) and higher plants. To date, three thylakoid proteins involved in the stable accumulation of PSI have been identified: BtpA ( ) [ ], Ycf3 [, ], and Ycf4 []. Because translation of the psaA and psaB mRNAs encoding the two reaction centre polypeptides of PSI is not affected in mutant strains lacking functional ycf3 and ycf4, the products of these two genes appear to act at a post-translational step of PSI biosynthesis. The BtpA protein appears to act at the level of PSI stabilisation []. It is an extrinsic membrane protein located on the cytoplasmic side of the thylakoid membrane [, ]. Homologs of BtpA are found in the crenarchaeota and euryarchaeota, where their function remains unknown. The Ycf4 protein is firmly associated with the thylakoid membrane, presumably through a transmembrane N-terminal domain [, ]. Together with Ycf3, it forms a core PSI assembly apparatus as an auxiliary factor. Ycf3 is a TPR-containing protein loosely associated with the thylakoid membrane and, with its interacting partner Y3IP1, facilitates the assembly of reaction centre subunits. Ycf4 facilitates the integration of peripheral PSI subunits and LHCIs into the PSI reaction centre subcomplex [].
Protein Domain
Name: Chaperonin Cpn60/GroEL
Type: Family
Description: The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers [ ]. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins []. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade []), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions []. This entry represents the 60kDa chaperonin (Cpn60), its bacterial homologue groEL and RuBisCO subunit-binding protein [], which are mainly present in bacteria and eukaryotes. The 60kDa form of chaperonin is the immunodominant antigen of patients with Legionnaire's disease [ ], and is thought to play a role in the protection of the Legionella spp. bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species [], and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70kDa heat-shock proteins []. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å [].
Protein Domain
Name: Translation elongation factor, IF5A, hypusine site
Type: PTM
Description: Translation initiation factor 5A (IF-5A) was previously reported to be involved in the first step of peptide bond formation in translation; however, more recent work implicates it as a universally conserved translation elongation factor. eIF5A is a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A is the sole protein in eukaryotes and archaea to contain the unusual amino acid hypusine (Nε-(4-amino-2-hydroxybutyl)lysine), which is an absolute functional requirement. The first step in the post-translational modification of lysine to hypusine is catalysed by the enzyme deoxyhypusine synthase, the structure of which has been reported. Hypusine is derived from lysine by the post-translational addition of a butylamino group (from spermidine) to the ε-amino group of lysine. The hypusine group is essential to the function of eIF-5A. A hypusine-containing protein has been found in archaebacteria such as Sulfolobus acidocaldarius or Methanocaldococcus jannaschii (Methanococcus jannaschii); this protein is highly similar to eIF-5A and could play a similar role in protein biosynthesis. The signature for eIF-5A is centred on the hypusine residue. The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined to 1.75 Å. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of Escherichia coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.
Protein Domain
Name: Peptidase S49
Type: Domain
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This group of serine peptidases belongs to MEROPS peptidase family S49 (protease IV family, clan S-). The predicted active site serine for members of this family occurs in a transmembrane domain. The domain defines sequences in viruses, archaea, bacteria and plants.
These sequences are variously annotated in the different taxonomic groups; examples are:
  • Viruses: capsid protein
  • Archaea: proteinase IV homologue
  • Bacteria: proteinase IV, sohB, SppA, pfaP, putative protease
  • Plants: SppA, protease IV
This group also contains proteins classified as non-peptidase homologues that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of peptidases. Related proteins, non-peptidase homologues and unclassified S49 members are also to be found in .
Protein Domain
Name: Zinc finger, C5HC2-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji, and may have a DNA binding function. The mouse jumonji protein is required for neural tube formation, and is essential for normal heart development. It also plays a role in the down-regulation of cell proliferation signalling.
Protein Domain
Name: Zinc finger, FYVE-related
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc-coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue.
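The R+HHC+XCG motif described for the FYVE finger can be written as a regular expression. The sketch below is illustrative only: it assumes that "charged residue" means D, E, K or R (histidine is sometimes counted as well), the function name find_fyve_motifs is hypothetical, and the test sequence is a made-up toy string, not a real protein.

```python
import re

# Assumption: "+" (charged residue) is taken to mean Asp, Glu, Lys or Arg.
CHARGED = "DEKR"

# R + H H C + X C G, with "." standing in for X (any residue).
FYVE_MOTIF = re.compile(rf"R[{CHARGED}]HHC[{CHARGED}].CG")


def find_fyve_motifs(sequence):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in FYVE_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MAAARKHHCRQCGLL"  # invented sequence for illustration
    print(find_fyve_motifs(toy_sequence))
```

A hit from a pattern like this only flags a candidate; it does not establish that a sequence folds as a FYVE finger or binds zinc.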
Protein Domain
Name: NHL repeat
Type: Repeat
Description: The NHL repeat, named after NCL-1, HT2A and Lin-41, is a conserved structural motif present in a large number of eukaryotic and prokaryotic proteins. For example, the repeat is found in a variety of enzymes of the copper type II, ascorbate-dependent monooxygenase family, which catalyse the C-terminal alpha-amidation of biological peptides, and in a large family of growth regulators. Many NHL-containing proteins have additional domains such as a RING finger, a B-box zinc finger or a coiled-coil motif. In many, it occurs in tandem arrays, for example in the RING finger, B-box, coiled-coil (RBCC) eukaryotic growth regulators or the 'Brain Tumor' protein (Brat). The NHL repeats are also found in serine/threonine protein kinases (STPKs) in a diverse range of pathogenic bacteria. These STPKs are transmembrane receptors with an intracellular N-terminal kinase domain and an extracellular C-terminal sensor domain. In the STPK PknD from Mycobacterium tuberculosis, the sensor domain forms a rigid, six-bladed β-propeller composed of NHL repeats with a flexible tether to the transmembrane domain. The NHL domain is a six-bladed β-propeller, with the blades arrayed in a radial fashion around a central axis, and each blade composed of a highly twisted four-stranded antiparallel β-sheet. The innermost strand of each blade is labelled 'a' and the outermost strand, 'd'. As in other β-propellers, the sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue NHL repeat spans strands 'b-d' of one propeller blade and strand 'a' of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as the innermost strand 'a' of the blade that harbours strands 'b-d' from the first sequence repeat. According to structural model analysis, the NHL domain could be involved in protein-protein interaction.
Protein Domain
Name: BAG domain
Type: Domain
Description: BAG domains are present in Bcl-2-associated athanogene 1 and silencer of death domains. The BAG proteins are modulators of chaperone activity; they bind to HSP70/HSC70 proteins and promote substrate release. The proteins have anti-apoptotic activity and increase the anti-cell death function of BCL-2 induced by various stimuli. BAG-1 binds to the serine/threonine kinase Raf-1 or Hsc70/Hsp70 in a mutually exclusive interaction. BAG-1 promotes cell growth by binding to and stimulating Raf-1 activity. The binding of Hsp70 to BAG-1 diminishes Raf-1 signalling and inhibits subsequent events, such as DNA synthesis, as well as arresting the cell cycle. BAG-1 has been suggested to function as a molecular switch that encourages cells to proliferate in normal conditions but become quiescent under a stressful environment. BAG-family proteins contain a single BAG domain, except for human BAG-5, which has four BAG repeats. The BAG domain is a conserved region located at the C terminus of the BAG-family proteins that binds the ATPase domain of Hsc70/Hsp70. The BAG domain is evolutionarily conserved, and BAG domain-containing proteins have been described and/or proven in a variety of organisms including Mus musculus (Mouse), Xenopus spp., Drosophila spp., Bombyx mori (Silk moth), Caenorhabditis elegans, Saccharomyces cerevisiae (Baker's yeast), Schizosaccharomyces pombe (Fission yeast), and Arabidopsis thaliana (Mouse-ear cress). The BAG domain has 110-124 amino acids and comprises three anti-parallel α-helices, each approximately 30-40 amino acids in length. The first and second helices interact with the serine/threonine kinase Raf-1, and the second and third helices are the sites of the BAG domain interaction with the ATPase domain of Hsc70/Hsp70. Binding of the BAG domain to the ATPase domain is mediated by both electrostatic and hydrophobic interactions in BAG-1 and requires energy.
Protein Domain
Name: Chaperonin Cpn60/GroEL/TCP-1 family
Type: Family
Description: The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins, which include 10kDa and 60kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature-sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. The 10kDa chaperonin (Cpn10) and its bacterial homologue groES exist as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60kDa chaperonin (Cpn60) and its bacterial homologue groEL form a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The Cpn10 and Cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part. The binding of Cpn10 to Cpn60 inhibits the weak ATPase activity of Cpn60. TCP-1 (t-complex polypeptide 1) is a subunit of the hetero-oligomeric complex CCT (chaperonin containing TCP-1) present in the eukaryotic cytosol. It is a member of the chaperonin family which includes GroEL, 60kDa heat shock protein (Hsp60), Rubisco subunit binding protein (RBP) and thermophilic factor 55 (TF55).
This entry represents GroEL, Cpn60, TCP-1 and similar proteins found in bacteria, eukaryotes and archaea.
Protein Domain
Name: Glutaredoxin, GrxC
Type: Family
Description: Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced form or an oxidized form in which the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox-active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulphide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulphide substrates, for which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defence against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. This subfamily of bacterial glutaredoxins (GRX) includes Escherichia coli Grx1 (GrxC1) and Grx3 (GrxC).
GrxC appears to have a secondary role in reducing ribonucleotide reductase (in the absence of GrxA), possibly indicating a role in the reduction of other protein disulphides.
Protein Domain
Name: CULT domain
Type: Domain
Description: The cereblon protein, originally identified in a screen for mutations causing mild mental retardation, is a major target of thalidomide and its derivatives, and is responsible for the teratogenic effects of the drug. Cereblon owes its name to its involvement in brain development and to its Lon N-terminal domain. Cereblon proteins occur throughout eukaryotes, but not in fungi. Cereblon is a cofactor of damaged DNA-binding protein 1 (DDB1), which acts as the central component of an E3 ubiquitin ligase complex and regulates the selective degradation of key proteins in DNA repair, replication and transcription. Binding of thalidomide to a C-terminal region in cereblon alters the E3 ubiquitin ligase activity of the complex, which may in turn cause its teratogenic effects. The thalidomide-binding region of cereblon is a conserved domain, CULT (for Cereblon domain of Unknown activity, binding cellular Ligands and Thalidomide), carrying several invariant cysteine and tryptophan residues. The CULT domain is also found as the sole domain in a family of secreted proteins from animals and in a family of bacterial proteins occurring primarily in gamma-proteobacteria. Given the invariant nature of the CULT domain between animals and bacteria, a natural ligand universal to all domains of life seems plausible. The nature of the binding pocket, an aromatic cage of three tryptophan residues, suggests a role in the recognition of cationic ligands. The CULT domain is a member of the beta-tent fold, which consists of two four-stranded, antiparallel β-sheets that are oriented at an approximately right angle and pinned together at the top via a structural zinc ion. The thalidomide binding site is formed within the larger, C-terminal β-sheet. A third of the domain, including the thalidomide binding pocket, only folds upon ligand binding.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom