G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), the metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Neuropeptide receptors are present in very small quantities in the cell and are embedded tightly in the plasma membrane. The neuropeptides exhibit a high degree of functional diversity through both regulation of peptide production and peptide-receptor interaction. The mammalian tachykinin system consists of 3 distinct peptides: substance P, substance K and neuromedin K. All possess a common spectrum of biological activities, including sensory transmission in the nervous system and contraction/relaxation of peripheral smooth muscles, and each interacts with a specific receptor type.
In the brain, high concentrations of the NK1 receptor are found in striatum, olfactory bulb, dentate gyrus, locus coeruleus and spinal cord. In peripheral tissues NK1 receptors are found in smooth muscle (e.g., ileum and bladder), enteric neurons, secretory glands (e.g., parotid), cells of the immune system and vascular endothelium. NK1 receptors activate the phosphoinositide pathway through a pertussis-toxin-insensitive G-protein.
Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems. PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists.
IP receptors induce relaxation in a range of smooth muscles, including blood vessels, and potently inhibit platelet activation. The receptors activate adenylyl cyclase through G-proteins.
EP3 receptors mediate contraction in a wide range of smooth muscles, including gastrointestinal and uterine. They also inhibit neurotransmitter release in central and autonomic nerves through a presynaptic action, and inhibit secretion in glandular tissues (e.g., acid secretion from gastric mucosa, and sodium and water reabsorption in the kidney). mRNA is found in high levels in the kidney and uterus, and in lower levels in the brain, thymus, lung, heart, stomach and spleen. The receptors inhibit adenylate cyclase via an uncharacterised G-protein, probably of the Gi/Go class.
Sequence analysis shows the EP3 receptors to fall into distinct classes, based on their N- and C-terminal and loop signatures. For convenience, these classes have been designated types 1 to 3.
Neuromedin U is a neuropeptide, first isolated from porcine spinal cord and expressed widely in the gastrointestinal, genitourinary and central nervous systems. Neuromedin U has potent contractile activity on smooth muscle, and this activity is believed to reside within the C-terminal portion of the peptide, which is highly conserved between species. Other roles for the peptide include regulation of blood flow and ion transport in the intestine, regulation of adrenocortical function and increased blood pressure. The roles of neuromedin U in the central nervous system are poorly understood, but may include regulation of food intake, neuroendocrine control, modulation of dopamine actions and involvement in neuropsychiatric disorders. Two G protein-coupled receptor subtypes, with differing expression patterns, have been identified and shown to bind neuromedin U.
This entry represents the alpha subunit of 2-oxoacid:acceptor oxidoreductases, which catalyse CoA-dependent oxidative decarboxylation of 2-oxoacids. Proteins in this family contain a thiamine diphosphate (TPP) binding domain typical of flavodoxin/ferredoxin oxidoreductases, and an N-terminal domain similar to the gamma subunit of the same group of oxidoreductases. The genes encoding the proteins are always found in association with a neighbouring gene for a beta subunit, which also occurs in a four-subunit (alpha/beta/gamma/ferredoxin) version of the system. This pair of genes is not consistently observed in proximity to any electron acceptor genes, but is found next to putative ferredoxins or ferredoxin-domain proteins in Aromatoleum aromaticum EbN1, Bradyrhizobium japonicum USDA 110, Frankia sp. CcI3, Rhodoferax ferrireducens T118, Rhodopseudomonas palustris BisB5, Os, Sphingomonas wittichii RW1 and Streptomyces clavuligerus. Other potential acceptors are also sporadically observed in close proximity, including ferritin-like proteins, rubrerythrin, peroxiredoxin and a variety of other flavin and iron-sulfur cluster-containing proteins.
The phylogenetic distribution of this family encompasses archaea, a number of deeply-branching bacterial clades and only a small number of firmicutes and proteobacteria. The enzyme from Sulfolobus has been characterised with respect to its substrate specificity, which is described as wide, encompassing various 2-oxoacids such as 2-oxoglutarate, 2-oxobutyrate and pyruvate. Halobacterium salinarum (and most halophilic archaea) contains two 2-oxoacid:ferredoxin oxidoreductases. One (Kor) functions as an oxoglutarate-ferredoxin oxidoreductase, whereas the other (Por) has been characterised as a pyruvate-ferredoxin oxidoreductase.
The enzyme from Hydrogenobacter thermophilus has been shown to have a high specificity towards 2-oxoglutarate and is one of the key enzymes in the reverse TCA cycle in this organism. Furthermore, considering its binding of coenzyme A (CoA), it can be reasonably inferred that the product of the reaction is succinyl-CoA. The genes for this enzyme in Prevotella intermedia 17, Persephonella marina EX-H1 and Picrophilus torridus DSM 9790 are in close proximity to a variety of TCA cycle genes. Persephonella marina and Picrophilus torridus are believed to encode complete TCA cycles, and none of these contains the lipoate-based 2-oxoglutarate dehydrogenase (E1/E2/E3) system. That system is presumed to be replaced by this one. In fact, the lipoate system is absent in most organisms possessing a member of this family, providing additional circumstantial evidence that many of these enzymes are capable of acting as 2-oxoglutarate dehydrogenases and supporting flux through TCA cycles in either the forward or reverse directions.
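The inference about the reaction product can be made explicit with an overall equation. This is only a sketch: it is written with a generic electron acceptor A (oxidised and reduced forms), because the physiological acceptor is not established for every member of the family, and proton and electron stoichiometry is omitted.

```latex
\[
\text{2-oxoglutarate} + \text{CoA} + \mathrm{A_{ox}}
\;\longrightarrow\;
\text{succinyl-CoA} + \mathrm{CO_2} + \mathrm{A_{red}}
\]
```

The same scheme applied to pyruvate gives acetyl-CoA as the acyl-CoA product.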
Mediator of RNA polymerase II transcription subunit 15
Type:
Family
Description:
The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.
The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.
The proteins in this entry represent subunit Med15 of the Mediator complex. They contain a single copy of the approximately 70 residue ARC105 domain. The ARC105 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, ARC105 is a critical transducer of gene activation signals that control early metazoan development.
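The module composition given in this description can be captured as a small lookup table. The Python sketch below is illustrative only: the names `MEDIATOR_MODULES` and `module_of` are hypothetical, and alias prefixes such as GAL11/ are dropped so that only the MED-style names from the text are kept.

```python
# Module membership as listed in the description (MED-style names only).
MEDIATOR_MODULES = {
    "head": ["MED6", "MED8", "MED11", "MED17", "MED18", "MED19", "MED20", "MED22"],
    "middle": ["MED1", "MED4", "MED5", "MED7", "MED9", "MED10", "MED21", "MED31"],
    "tail": ["MED2", "MED3", "MED14", "MED15", "MED16"],
    "CDK8": ["MED12", "MED13", "CCNC", "CDK8"],
}


def module_of(subunit):
    """Return the module a subunit is assigned to in the text, or None."""
    name = subunit.upper()
    for module, members in MEDIATOR_MODULES.items():
        if name in members:
            return module
    # Some subunits (e.g. MED23) are named in the description
    # without being assigned to a module there.
    return None


print(module_of("MED15"))
```

With this table, the subunit described by this entry (MED15) resolves to the tail module, the module that the description says interacts with gene-specific regulatory proteins.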
Vasopressin and oxytocin are members of the neurohypophyseal hormone family found in all mammalian species. They are present in high levels in the posterior pituitary. Vasopressin has an essential role in the control of the water content of the body, acting in the kidney to increase water and sodium absorption. In higher concentrations, vasopressin stimulates contraction of vascular smooth muscle, stimulates glycogen breakdown in the liver, induces platelet activation, and evokes release of corticotrophin from the anterior pituitary. Vasopressin and its analogues are used clinically to treat diabetes insipidus.
In the periphery, the V1A receptor is found in high levels in vascular smooth muscle, myometrium and the bladder, where it mediates contraction. In the CNS, V1 sites are distributed widely and are found in lateral septal nucleus, hippocampus, superior colliculus, substantia nigra and central grey matter. The receptors activate phosphoinositide metabolism through a pertussis-toxin-insensitive G-protein, probably of the Gq/G11 class.
This entry includes a group of RhoGEFs, including Kalirin and Triple functional domain protein (TRIO) from mammals. Kalirin and TRIO are encoded by separate genes in mammals and by a single one in invertebrates. Kalirin and TRIO share the same complex multidomain structure and display several splice variants. They are implicated in secretory granule (SG) maturation and exocytosis. The longest Kalirin and TRIO proteins have a Sec14 domain, a stretch of spectrin repeats, a RhoGEF(DH)/PH cassette (also called GEF1), an SH3 domain, a second RhoGEF(DH)/PH cassette (also called GEF2), a second SH3 domain, Ig/FNIII domains, and a kinase domain. The first RhoGEF(DH)/PH cassette catalyses exchange on Rac1 and RhoG, while the second RhoGEF(DH)/PH cassette is specific for RhoA. Kalirin and TRIO are closely related to p63RhoGEF and have PH domains of similar function. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner.
TRIO contains a protein kinase domain and two guanine nucleotide exchange factor (GEF) domains. These functional domains suggest that it may play a role in signalling pathways controlling cell proliferation. TRIO may form a complex with LAR transmembrane protein tyrosine phosphatase (PTPase), which localises to the ends of focal adhesions and plays an important part in coordinating cell-matrix and cytoskeletal rearrangements necessary for cell migration. Its expression is associated with invasive tumour growth and rapid tumour cell proliferation in urinary bladder cancer.
Kalirin promotes the exchange of GDP for GTP and stimulates the activity of specific Rho GTPases. There are several Kalirin isoforms in humans and mice. Each Kalirin isoform is composed of a unique collection of domains and may have different functions. In rat, isoforms 1 and 7 are necessary for neuronal development and axonal outgrowth, while isoform 6 is required for dendritic spine formation. In humans, the major isoform of Kalirin in the adult brain is Kalirin-7, which plays a critical role in spine formation/synaptic plasticity. Kalirin-7 has been linked to neuropsychiatric and neurological diseases such as Alzheimer's, Huntington's, ischemic stroke, schizophrenia, depression, and cocaine addiction.
This entry represents the second SH3 domain present in Kalirin and TRIO.
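The domain order described for the longest Kalirin and TRIO proteins can be written down as an ordered list, which makes phrases such as "second SH3 domain" unambiguous. The Python sketch below is illustrative: the names `DOMAINS` and `nth_domain_index` are hypothetical, and it assumes that within each RhoGEF(DH)/PH cassette the DH domain precedes the PH domain.

```python
# N- to C-terminal domain order of the longest isoforms, following the text.
# Assumption: in each RhoGEF(DH)/PH cassette, DH comes before PH.
DOMAINS = [
    "Sec14",
    "spectrin repeats",
    "DH", "PH",        # GEF1 cassette (Rac1 and RhoG)
    "SH3",
    "DH", "PH",        # GEF2 cassette (RhoA)
    "SH3",
    "Ig/FNIII",
    "kinase",
]


def nth_domain_index(kind, n):
    """Return the 0-based position of the n-th (1-based) domain of a kind, or None."""
    seen = 0
    for index, domain in enumerate(DOMAINS):
        if domain == kind:
            seen += 1
            if seen == n:
                return index
    return None


print(nth_domain_index("SH3", 2))  # position of the second SH3 domain
print(nth_domain_index("PH", 1))   # position of the first PH domain
```

Under these assumptions the second SH3 domain sits directly after the GEF2 cassette, and the first PH domain is the one inside GEF1.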
This entry represents the first pleckstrin homology (PH) domain present in Kalirin and TRIO.
Melanin-concentrating hormone (MCH) is a cyclic peptide originally identified in teleost fish. In fish, MCH is released from the pituitary and causes lightening of skin pigment cells through pigment aggregation. In mammals, MCH is predominantly expressed in the hypothalamus, and functions as a neurotransmitter in the control of a range of functions. A major role of MCH is thought to be in the regulation of feeding: injection of MCH into rat brains stimulates feeding; expression of MCH is upregulated in the hypothalamus of obese and fasting mice; and mice lacking MCH are lean and eat less. MCH and alpha melanocyte-stimulating hormone (alpha-MSH) have antagonistic effects on a number of physiological functions. Alpha-MSH darkens pigmentation in fish and reduces feeding in mammals, whereas MCH increases feeding.
Two G protein-coupled receptors, MCH1 and MCH2, have recently been identified as receptors for the melanin-concentrating hormone.
Adenylyl cyclase class-4/guanylyl cyclase, conserved site
Type:
Conserved_site
Description:
Guanylate cyclases catalyze the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fraction of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties.
Most currently known plasma membrane-bound forms are receptors for small polypeptides. The topology of such proteins is the following: they have an N-terminal extracellular domain which acts as the ligand binding region, then a transmembrane domain, followed by a large cytoplasmic C-terminal region that can be subdivided into two domains: a protein kinase-like domain that appears important for proper signalling and a cyclase catalytic domain. This topology is schematically represented below.
+-----------------------xxxxx----------------------+------------+
| Ligand-binding        XXXXX Protein Kinase like  | Cyclase    |
+-----------------------xxxxx----------------------+------------+
      Extracellular  Transmembrane          Cytoplasmic
The known guanylate cyclase receptors are:
- The sea urchin receptors for speract and resact, which are small peptides that stimulate sperm motility and metabolism.
- The receptors for natriuretic peptides (ANF). Two forms of ANF receptors with guanylate cyclase activity are currently known: GC-A (or ANP-A), which seems specific to atrial natriuretic peptide (ANP), and GC-B (or ANP-B), which seems to be stimulated more effectively by brain natriuretic peptide (BNP) than by ANP.
- The receptor for Escherichia coli heat-stable enterotoxin (GC-C). The endogenous ligand for this intestinal receptor seems to be a small peptide called guanylin.
- Retinal guanylate cyclase (retGC), which probably plays a specific functional role in the rods and/or cones of photoreceptors. It is not known if this protein acts as a receptor, but its structure is similar to that of the other plasma membrane-bound GCs.
The soluble forms of guanylate cyclase are cytoplasmic heterodimers. The two subunits, alpha and beta, are highly related proteins of 70 to 82 kDa. Two forms of beta subunits are currently known: beta-1, which seems to be expressed in lung and brain, and beta-2, which is more abundant in kidney and liver.
The membrane and cytoplasmic forms of guanylate cyclase share a conserved domain which is probably important for the catalytic activity of the enzyme. Such a domain is also found twice in the different forms of membrane-bound adenylate cyclases (also known as class-III) from mammals, slime mold or Drosophila.
The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesized as a precursor devoid of the metalloenzyme active site. This precursor then undergoes a complex post-translational maturation process that requires a number of accessory proteins [,
,
]. Members of the HypF family are accessory proteins involved in hydrogenase maturation. They contain the following domains: acylphosphatase, zinc fingers (2 repeats), a YrdC-like domain, and a C-terminal domain with a putative O-carbamoyltransferase motif. The presence of CO and CN- ligands of the active site iron atoms is essential for [NiFe]-hydrogenase enzyme activity []. Both ligands have been suggested to originate from carbamoylphosphate [], which is required for maturation of [NiFe]-hydrogenases [
]. Escherichia coli HypF interacts with carbamoylphosphate as a substrate and releases inorganic phosphate []. In addition, HypF also cleaves ATP into AMP and pyrophosphate in the presence of carbamoylphosphate. This, and the fact that HypF catalyzes a carbamoylphosphate-dependent pyrophosphate-ATP exchange reaction, suggest that the protein catalyzes the activation of carbamoylphosphate []. The mechanism of action of HypF, as well as of its individual domains, is not yet clear. Mutations in any of the three major signature motifs (the acylphosphatase, the zinc fingers, and the O-carbamoyltransferase motif) can block carbamoylphosphate phosphatase activity. This indicates an integrated cooperativity between these domains in the cleavage reaction []. The N-terminal acylphosphatase (ACP) domain is thought to support the conversion of carbamoylphosphate into CO and CN- [
,
]. Biochemical results demonstrating its ACP activity are not available [,
]. ACPs are small enzymes that specifically catalyze the hydrolysis of carboxyl-phosphate bonds in acylphosphates, including carbamoylphosphate []. Zinc fingers have been implicated in bivalent cation binding or as part of a chaperone domain interacting with the large subunit precursor, but experimental studies on such a function are lacking thus far. The YrdC-like domain is present in protein families with regulatory functions and has been implicated in RNA binding []. It is not clear what function it may have in members of the HypF family. The C-terminal domain is distantly related to peptidase M22, but contains a conserved O-carbamoyltransferase motif required for the carbamoylphosphate phosphatase activity []. The function of this domain is not clear. Nomenclature note: the following names are used as synonyms of HypF: HupY in Azotobacter chroococcum, HupN in Rhizobium leguminosarum, HydA in E. coli. In other organisms, these names are used to designate various "hydrogenase cluster" proteins unrelated to the members of this family.
Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers [
,
]. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia [,
,
,
]. Human NHE is also involved in heart disease, cell growth and in cell differentiation []. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9) [,
]. These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N terminus and a large cytoplasmic region at the C terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved and are therefore thought to be involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport []. This entry represents Sodium/hydrogen exchanger 2/4 (NHE-2/4) and similar proteins from vertebrates. NHE-2 is found preferentially in the gastrointestinal tract
and the kidney and is also much less sensitive to the inhibitory diuretic amiloride than the more ubiquitous NHE1. The targeting of NHE2 in polarised
epithelial cells is controversial, with some studies reporting basolateral, and others reporting apical localisation. When transfected into mutagenised
cells devoid of endogenous NHE activity, NHE2 is capable of regulating pH, cellular volume, and proliferation, in a manner similar to NHE1. In humans, NHE-2 has been related to ulcerative colitis, colon cancer and cerebral edema formation [
,
,
]. NHE-4 may play a specialized role in the kidney in rectifying cell volume in response to extreme fluctuations of hyperosmolar-stimulated cell shrinkage [
]. It is relatively insensitive to amiloride and ethylisopropylamiloride (EIPA) []. Variants of this protein have been related to eczema [].
Ca2+ ions are unique in that they not only carry charge but are also the most widely used diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism, are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit, which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins []. The activity of this pore is modulated by four tightly coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein, including gating voltage-dependence, G protein modulation and kinase susceptibility, can be influenced by these subunits. Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blocks, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium channel classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels. The voltage-dependent calcium channel gamma (VDCCG) subunit family consists
of at least 8 members, which share a number of common structural features [
]. Each member is predicted to possess 4 transmembrane domains, with intracellular N- and C-termini. The first extracellular loop contains a highly conserved N-glycosylation site and a pair of conserved cysteine residues. The C-terminal 7 residues of VDCCG-2, -3, -4 and -8 are also conserved and contain a consensus site for phosphorylation by cAMP and cGMP-dependent
protein kinases, and a target site for binding by PDZ domain proteins []. The VDCCG-1 subunit is a 25 kDa protein expressed exclusively in skeletal muscle cells, where it functions as a dihydropyridine-sensitive, L-type
calcium channel subunit []. The modulatory properties of VDCCG-1 subunits have been investigated using heterologous expression systems. Coexpression of VDCCG-1 subunits with L-type or P/Q-type channels induces moderate
changes in activation and inactivation properties, and modification of the peak current amplitude of these channels [
,
].
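The grouping of channel types by activation threshold described above can be sketched as a simple lookup. This is only an illustrative sketch: the function name `threshold_group` and the set names are hypothetical, and the mapping encodes nothing beyond the T versus L, N, P, Q and R split stated in the text.

```python
# Channel types grouped by activation threshold, as stated in the text.
# The names below are illustrative assumptions, not part of any database schema.
LOW_THRESHOLD_TYPES = {"T"}
HIGH_THRESHOLD_TYPES = {"L", "N", "P", "Q", "R"}


def threshold_group(channel_type: str) -> str:
    """Return 'low' or 'high' for a voltage-gated calcium channel type letter."""
    letter = channel_type.strip().upper()
    if letter in LOW_THRESHOLD_TYPES:
        return "low"
    if letter in HIGH_THRESHOLD_TYPES:
        return "high"
    raise ValueError(f"unknown channel type: {channel_type!r}")


if __name__ == "__main__":
    for letter in "TLNPQR":
        print(letter, threshold_group(letter))
```

The lookup is deliberately strict: any letter outside the six types named in the text raises an error rather than being assigned to a group.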
This entry represents a structural domain with a beta-Grasp fold that is found in molybdopterin synthase subunit MoaD [
], as well as in the thiamin biosynthesis sulphur carrier protein ThiS []. ThiS (thiaminS) is a 66 aa protein involved in sulphur transfer. ThiS is encoded in the thiCEFSGH operon in Escherichia coli. ThiS proteins have two conserved glycine residues at the COOH terminus. A thiocarboxylate is formed at the last glycine in the activation process. Sulphur is transferred from ThiI to ThiS in a reaction catalysed by IscS [
]. MoaD, a protein involved in sulphur transfer during molybdopterin synthesis, is about the same length and shows limited sequence similarity to ThiS. Both have the conserved GG at the COOH end.
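The conserved C-terminal Gly-Gly noted above lends itself to a trivial sequence check. The following is a minimal sketch, not a classifier: the function name `has_c_terminal_gg` is hypothetical, and the example strings are made-up placeholders rather than real ThiS or MoaD sequences. A C-terminal GG is a feature of these proteins but is not sufficient on its own to assign a sequence to this entry.

```python
def has_c_terminal_gg(sequence: str) -> bool:
    """Return True if a one-letter amino acid sequence ends in Gly-Gly."""
    cleaned = sequence.strip().upper()
    return cleaned.endswith("GG")


if __name__ == "__main__":
    # Placeholder sequences for illustration only (not real proteins).
    examples = {"toy_with_gg": "MKITVNGG", "toy_without_gg": "MKITVNGA"}
    for name, seq in examples.items():
        print(name, has_c_terminal_gg(seq))
```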
This entry represents ferritin and structurally related proteins. Ferritin is a major non-haem iron storage protein in animals, plants and microorganisms [
]. Iron is required by most organisms, but is potentially toxic due to its reactivity, which is counteracted by sequestering it into ferritin. Ferritin consists of a 4-helical bundle core, and contains a bimetal-ion centre in the middle of the bundle. Other proteins with this structure include: haem-containing bacterioferritins; rubrerythrin, which appears to have a role in an anaerobic detoxification pathway for reactive oxygen species []; Dps (DNA-binding proteins from starved cells) used in bacteria for iron storage-detoxification; and CRD1 (AcsF), which is required for the maintenance of photosystem I [].
This domain consists of several eukaryotic suppressor of forked (Suf) like proteins. The Drosophila melanogaster suppressor of forked [Su(f)] protein shares homology with the Saccharomyces cerevisiae RNA14 protein and the 77kDa subunit of Homo sapiens cleavage stimulation factor, which are proteins involved in mRNA 3' end formation. This suggests a role for Su(f) in mRNA 3' end formation in Drosophila. The su(f) gene produces three transcripts; two of them are polyadenylated at the end of the transcription unit, and one is a truncated transcript, polyadenylated in intron 4. It is thought that su(f) plays a role in the regulation of poly(A) site utilisation and the GU-rich sequence is important for this regulation to occur [].
The PRC-barrel is an all-beta barrel domain found in photosystem reaction centre subunit H of the purple bacteria and RNA metabolism proteins of the RimM group. PRC-barrels are approximately 80 residues long, and are widely represented in bacteria, archaea and plants. This domain is also present at the carboxyl terminus of the pan-bacterial protein RimM, which is involved in ribosomal maturation and processing of 16S rRNA. A family of small proteins conserved in all known euryarchaea is composed entirely of a single stand-alone copy of the domain [
]. This superfamily identifies the PRC-barrel domain and related domains, including the CKK domain, which is found at the C terminus of calmodulin-regulated spectrin-associated (or CAMSAP) proteins.
The sterol-sensing domain (SSD) is a cluster of five membrane-spanning segments
approximately 180 residues long. The SSD is conserved across phyla and confers sensitivity to regulation by sterol. Although the SSD appears
to function as a regulatory domain involved in linking vesicle trafficking and protein localization with such varied processes as cholesterol homeostasis,
cell signalling and cytokinesis, its exact mode of action is not clear. It is not known whether it interacts with sterols, such as cholesterol, or whether
it interacts with another sterol-regulated protein. Alternatively, the SSD may interact with lipids other than cholesterol [
,
,
]. In addition to the proteins above, the SSD is also found in a number of
bacterial drug resistance proteins.
This entry includes a group of proteins involved in chromatin remodelling, including Vps72 (vacuolar protein sorting-associated protein 72) from budding yeasts. Vps72 is a Htz1-binding component of the SWR1 complex, which is required for the incorporation of the histone variant H2AZ into chromatin [
]. It is also required for vacuolar protein sorting in budding yeasts []. The Vps72 homologue from animals, YL-1, is a deposition-and-exchange histone chaperone specific for H2AZ1: it chaperones H2AZ1 and deposits it into nucleosomes. It is a component of the SRCAP and Tip60 complexes, and mediates the ATP-dependent exchange of histone H2AZ1/H2B dimers for nucleosomal H2A/H2B, leading to transcriptional regulation of selected genes by chromatin remodeling [
,
].
This entry represents the N-terminal domain of SS18 and related proteins.
SSXT (also known as SS18) appears to function synergistically with RBM14 as a transcriptional coactivator [
]. The SSXT protein is involved in synovial sarcoma in humans. A SYT-SSX fusion gene resulting from the chromosomal translocation t(X;18) (p11;q11) is characteristic of synovial sarcomas. This translocation fuses the SSXT (SYT) gene from chromosome 18 to either of two homologous genes at Xp11, SSX1 or SSX2 [
]. The SS18 family also includes SS18-like proteins 1 and 2, and GRF1-interacting factors from plants [
]. SS18-like protein 1 is a transcriptional activator which is required for calcium-dependent dendritic growth and branching in cortical neurons [,
].
The following proteins share a conserved region called the MENTAL (MLN64 N-terminal) domain, composed of four transmembrane helices with three short
intervening loops [,
,
]:
Animal MLN64 (metastatic lymph node 64), a late endosomal membrane protein containing a carboxyl-terminal cholesterol binding START domain (). It is probably involved in intracellular cholesterol transport.
Mammalian MENTHO (MLN64 N-terminal domain homologue), a late endosomal protein containing only the MENTAL domain. It is probably involved in cellular cholesterol homoeostasis.
The ~170-amino acid MENTAL domain mediates MLN64 and MENTHO homo- and hetero-interactions, targets both proteins to late endosomes and binds cholesterol.
The MENTAL domain might serve to maintain cholesterol at the membrane of late endosomes prior to its shuttle to cytoplasmic acceptor(s) through the START
domain.
This entry represents the N-terminal conserved site of the CAP protein. Structurally, CAP is a protein of 474 to 551 residues, which consists of two domains separated by a proline-rich hinge. In budding and fission yeasts the CAP protein is a bifunctional protein whose N-terminal domain binds to adenylyl cyclase, thereby enabling that enzyme to be activated by upstream regulatory signals, such as Ras. The N terminus also catalyses cofilin-mediated severing of actin filaments [
]. The C-terminal domain plays a role in recycling cofilin-bound, ADP-actin monomers []. CAP is conserved in higher eukaryotic organisms. Although the role in Ras signalling does not extend beyond yeasts, the actin regulation function is conserved in all eukaryotes [
].
This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affect synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase [] and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC []. This domain adopts a 3-layer alpha/beta/alpha fold with mixed β-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping-pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.
This family consists of several eukaryotic Aph-1 proteins. Aph-1 is an essential subunit of the gamma-secretase complex, an endoprotease complex that catalyses the intramembrane proteolysis of Notch, beta-amyloid precursor protein, and other substrates as part of a new signalling paradigm and as a key step in the pathogenesis of Alzheimer's disease [
]. It is thought that the presenilin heterodimer comprises the catalytic site and that a highly glycosylated form of nicastrin associates with it. Aph-1 and Pen-2, two membrane proteins genetically linked to gamma-secretase, associate directly with presenilin and nicastrin in the active protease complex. Co-expression of all four proteins leads to marked increases in presenilin heterodimers, full glycosylation of nicastrin, and enhanced gamma-secretase activity [].
Budding yeast Gdt1 is a Golgi-localized calcium transporter required for stress-induced calcium signalling and protein glycosylation [
]. Its human homologue, TMEM165, may be a Golgi Ca(2+)/H(+) antiporter []. Defects in the human protein TMEM165 cause a subtype of Congenital Disorders of Glycosylation []. In Arabidopsis, this protein is variously known as CCHA1 (a chloroplast-localized potential Ca(2+)/H(+) antiporter), chloroplastic PAM71 (photosynthesis affected mutant 71), and GDT1-like protein 1, chloroplastic. It has been reported to be a putative chloroplast-localized Ca(2+)/H(+) antiporter with critical functions in the regulation of PSII and in chloroplast Ca(2+) and pH homeostasis []. It has also been suggested that it may function in Mn(2+) uptake into thylakoids, ensuring optimal PSII performance [].
The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of proteins involved in RNA metabolism. It belongs to the OB-fold family. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site [
,
]. The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein [
]. This entry does not include translation initiation factor IF-1 S1 domains.
Remorins are plant-specific plasma membrane-associated proteins. In tobacco, remorin co-purifies with lipid rafts. Most remorins have a variable, proline-rich N-half and a more conserved C-half that is predicted to form coiled coils. Consistent with this, circular dichroism studies have demonstrated that much of the protein is α-helical. Remorins exist in plasma membrane preparations as oligomeric structures and form filaments in vitro. The proteins can bind polyanions including the extracellular matrix component oligogalacturonic acid (OGA). In vitro, remorin in plasma membrane preparations is phosphorylated (principally on threonine residues) in the presence of OGA and thus co-purifies with one or more protein kinases. The biological functions of remorins are unknown but roles as components of the membrane/cytoskeleton are possible [
].
The K homology domain is a common RNA-binding motif present in one or multiple copies in both prokaryotic and eukaryotic regulatory proteins. The KH motifs may act cooperatively to bind RNA in the case of multiple motifs, or independently in the case of single KH motif proteins. Prokaryotic (pKH) and eukaryotic (eKH) KH domains share a KH-motif, but have different topologies. The pKH domain has been found in a number of proteins, including the N-terminal domain of the S3 ribosomal protein [
], the C-terminal domain of Era GTPase [] and the two C-terminal domains of the NusA transcription factor []. The structure of the pKH domain consists of a two-layer α/β fold in the arrangement α/β(2)/α/β.
Histidine-tRNA ligase (also known as histidyl-tRNA synthetase) (
) is an alpha2 dimer that belongs to class IIa. Every completed genome includes a histidine-tRNA ligase. Apparent second copies from Bacillus subtilis, Synechocystis sp. (strain PCC 6803), and Aquifex aeolicus are slightly shorter, more closely related to each other than to other hisS proteins, and not demonstrated to act as histidine-tRNA ligases (see
). The
regulatory protein kinase GCN2 of Saccharomyces cerevisiae (YDR283c), and related proteins from other species designated eIF-2 alpha kinase, have a domain closely related to histidine-tRNA ligase that may serve to detect and respond to uncharged tRNA(his), an indicator of amino acid starvation, but these regulatory proteins are not orthologous.
This family contains the Syd protein that has been implicated in the Sec-dependent transport of polypeptides across the inner membrane in bacteria. Syd has been shown to bind the SecY subunit of membrane-embedded SecYEG heterotrimer (also known as core translocon or SecY complex) which is a conserved protein-conducting channel essential for the biogenesis of most of the secretory and integral membrane proteins [
]. The SecY-binding site of Syd is a conserved concave and electronegative groove that forms interactions with the electropositive loops of the SecY subunit []. Syd is also known to verify the proper assembly of the SecY complex in the membrane by interfering with protein translocation only when the channel displays abnormal SecY-SecE associations [].
Methylguanine DNA methyltransferase, ribonuclease-like domain
Type:
Domain
Description:
Synonym(s): 6-O-methylguanine-DNA methyltransferase, O-6-methylguanine-DNA-alkyltransferase
The repair of DNA containing O6-alkylated
guanine is carried out by DNA-[protein]-cysteine S-methyltransferase (
). The major mutagenic and carcinogenic effect of methylating agents in DNA is the formation of O6-alkylguanine. The
alkyl group at the O-6 position is transferred to a cysteine residue in the enzyme [
]. This is a suicide reaction since the enzyme is irreversibly inactivated and the methylated protein accumulates as a dead-end product. Most, but not
all, of the methyltransferases are also able to repair O-4-methylthymine. DNA-[protein]-cysteine S-methyltransferases are widely distributed and are found in various prokaryotic and eukaryotic sources [
]. This group of proteins is characterised by having an N-terminal ribonuclease-like domain associated with 6-O-methylguanine DNA methyltransferase activity (
).
Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. Upon cell stimulation, diacylglycerol kinase (DGK) converts DAG into phosphatidate, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity [
,
]. Most vertebrate species contain 10 different DGK isozymes. The catalytic domain constitutes the single largest sequence element within the DGK proteins that is commonly and uniquely shared by all family members []. DGK proteins can be classified into five classes based on the presence or absence of specific functional domains. DGKs play an important role in controlling diverse cellular processes including development, cell division and proliferation, neuronal and immune responses, vascular traffic, apoptosis and cytoskeletal reorganization [].
This entry represents the catalytic domain found in TOPK, which belongs to a superfamily that contains other protein kinases, such as RIO kinases, aminoglycoside phosphotransferase, choline kinase and phosphoinositide 3-kinase. Lymphokine-activated killer T-cell-originated protein kinase (TOPK), also called PDZ-binding kinase (PBK), is activated at the early stage of mitosis and plays a critical role in cytokinesis [
]. It partly functions as a mitogen-activated protein kinase (MAPK) kinase and is capable of phosphorylating p38, JNK1, and ERK2. TOPK also plays a role in DNA damage sensing and repair through its phosphorylation of histone H2AX [,
]. It contributes to cancer development and progression by downregulating the function of tumour suppressor p53 and reducing cell-cycle regulatory proteins [,
].
This entry represents the transposon-transfer assisting protein (TTRAP). TTRAP are small bacterial proteins largely from Clostridium difficile. From comparative and other structural studies of the PDB:
(
), it has been suggested that this family is required for interacting with other proteins in order to facilitate the transfer of the transposon CTn4 between different bacterial species. The
structure comprises an α-helical fold of four α-helices leading to the production of two clefts, the larger of which displays two highly conserved residues in close proximity, Glu-8 and Lys-48. The gene concerned is part of an operon within transposon CTn4, and is expressed alongside a putative DNA primase, a DNA topoisomerase and conjugal transfer proteins [
].
This superfamily contains the Syd protein that has been implicated in the Sec-dependent transport of polypeptides across the inner membrane in bacteria. Syd has been shown to bind the SecY subunit of membrane-embedded SecYEG heterotrimer (also known as core translocon or SecY complex), which is a conserved protein-conducting channel essential for the biogenesis of most of the secretory and integral membrane proteins []. The SecY-binding site of Syd is a conserved concave and electronegative groove that forms interactions with the electropositive loops of the SecY subunit []. Syd is also known to verify the proper assembly of the SecY complex in the membrane by interfering with protein translocation only when the channel displays abnormal SecY-SecE associations [].
This entry represents the second of five domains on the Kluyveromyces lactis Ndc10 protein [
]. Each subunit of the Ndc10 dimer binds a separate fragment of DNA, suggesting that Ndc10 stabilises a DNA loop at the centromere. Proteins containing this domain also include Cbf2 from S. cerevisiae. S. cerevisiae Ndc10 (also known as Cbf2) is a component of the centromere DNA-binding protein complex CBF3, which is essential for chromosome segregation and movement of centromeres along microtubules [,
,
]. The structure of the protein shows that it can be divided into two α-helical lobes, the N- and C-lobes. The N-lobe consists of an antiparallel four-helix bundle with a central β-sheet which is involved in DNA binding [].
Multidrug and toxic compound extrusion family, eukaryotic
Type:
Family
Description:
The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria [
,
]. This subfamily, which is restricted to eukaryotes, contains vertebrate solute transporters responsible for secretion of cationic drugs across the brush border membranes, yeast proteins located in the vacuole membrane, and plant proteins involved in disease resistance and iron homeostasis under osmotic stress [
,
,
].
This family represents DUF34/metal-binding proteins (previously known as GTP cyclohydrolase 1 type 2) from bacteria. This entry includes the DUF34/metal-binding protein/NIF3 proteins, which are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2 [
] but, more recently, a comprehensive literature review and integrative bioinformatic analyses revealed that annotations for these members are misleading, as they were based on a single set of in vitro results examining the NIF3 homolog of Helicobacter pylori []. In fact, they have varied phenotypes, with a unifying functional role as metal-binding proteins []. NIF3 interacts with the yeast transcriptional coactivator Ngg1p, which is part of the ADA complex; the exact function of this interaction is unknown [
,
].
Serine/threonine-protein phosphatase with EF-hands
Type:
Family
Description:
This well-conserved family of animal EF-hand-containing serine/threonine protein phosphatases functions in diverse sensory neurons. The protein comprises an IQ calmodulin-binding region, a metallo-phosphoesterase domain (
) of the serine/threonine-specific protein phosphatase and bis(5-nucleosyl)-tetraphosphatase type, and a C-terminal extension with three calcium-binding EF-hand motifs.
The prototype protein is the product of the fruit fly rdgC gene, which is required to prevent light-induced retinal degeneration [
]. There are two homologous mammalian genes, PPEF1 and PPEF2. Although no essential role has been identified, PPEF2 is expressed in retinal rod photoreceptors, where it assumes several splice forms, and in the pineal gland. PPEF1 is expressed in photosensory neurons and inner ear cells in the developing mouse [].
This entry represents the RNA recognition motif 1 (RRM1) of SRSF1 (also known as ASF/SF2). SRSF1 is a member of the SR (serine/arginine) protein family of splicing regulators. Besides mRNA splicing, it is also involved in regulating mRNA transcription, stability and nuclear export, NMD, and translation, as well as protein sumoylation. It has been identified as a proto-oncogene. SRSF1 contains two RRMs, which are required for efficient RNA binding and splicing [
]. The SR protein family is a group of mRNA metabolism regulators. Its members have a modular domain with one or two RNA-recognition motifs (RRMs) and a C-terminal RS domain comprising multiple Arg-Ser dipeptide repeats. There are 12 human SR proteins [
].
Knottins are small proteins characterised by a cystine-knot [
]. They constitute a large family of structurally related peptides with diverse biological functions, including inhibitors, anti-microbial peptides and toxins []. Their structure is composed of a disulfide-bound fold and contains a β-hairpin with two adjacent disulfides. The scorpion toxin-like domain is found in a subgroup of metazoan knottins mainly from the arthropoda, which include the antibacterial defensins [
] and the scorpion alpha-neurotoxins [,
]. The plant sequences include members of the gamma-thionin family, which are plant defensins that have no antifungal activity. Other members are insect alpha-amylase inhibitors, cysteine-rich antifungal proteins and proteins annotated as proteinase inhibitors; those that are characterised belong to MEROPS inhibitor family I18, clan I.
Proteins of the surface presentation of antigen (SpoA) group are involved in a secretory pathway responsible for the surface presentation of invasion plasmid antigen needed for the entry of Salmonella and other species into mammalian cells [
,
]. They could play a role in preserving the translocation competence of the IPA antigens and are required for secretion of the three IPA proteins []. The SpoA structure is composed of a segment-swapped dimer forming two identical conjoint barrels and is topologically similar to the FMN-binding split barrel. This entry represents the C-terminal region of flagellar motor switch proteins FliN and FliM and similar proteins mainly found in bacteria. This domain seems to play a key role in flagellation [
].
The tsx gene of Escherichia coli encodes an outer membrane protein, Tsx, which constitutes the receptor for colicin K and Bacteriophage T6, and functions as a substrate-specific channel for nucleosides and deoxy-nucleosides [
]. The protein contains 294 amino acids, the first 22 of which are characteristic of a bacterial signal sequence peptide. The putative mature form of Tsx contains 272 residues with a calculated Mr of 31418. The Tsx sequence shows an even distribution of charged residues and lacks extensive hydrophobic stretches []. Tsx shows no significant similarities to the channel-forming proteins OmpC, OmpF, PhoE and LamB from the E. coli outer membrane. This entry also contains related proteins of unknown function.
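The relationship between the Tsx precursor, its signal peptide and the mature form is simple arithmetic, sketched below. The constants come from the text above; the helper `split_precursor` and the poly-alanine placeholder are hypothetical illustrations, and the real Tsx sequence is not reproduced here.

```python
# Lengths quoted in the text for E. coli Tsx.
PRECURSOR_LENGTH = 294
SIGNAL_PEPTIDE_LENGTH = 22


def split_precursor(sequence: str, signal_length: int = SIGNAL_PEPTIDE_LENGTH):
    """Split a precursor sequence into (signal peptide, mature protein)."""
    if signal_length < 0 or signal_length > len(sequence):
        raise ValueError("signal_length must lie within the sequence")
    return sequence[:signal_length], sequence[signal_length:]


# Mature length follows directly from the two quoted figures.
mature_length = PRECURSOR_LENGTH - SIGNAL_PEPTIDE_LENGTH

if __name__ == "__main__":
    # Placeholder of the right length, standing in for the real sequence.
    placeholder = "A" * PRECURSOR_LENGTH
    signal, mature = split_precursor(placeholder)
    print(len(signal), len(mature), mature_length)
```

The computed mature length agrees with the 272 residues stated for the putative mature form.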
This RNA binding domain is found at the N terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram-positive and Gram-negative bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer [
] to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin []. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template [].
This entry represents the SH2 domain of SH2B2. SH2B adapter protein 2 (SH2B2, also known as APS) belongs to the SH2B family of adapter proteins [
]. SH2B2 is involved in multiple signalling pathways, including cytokine signalling and insulin receptor signalling [,
]. It also binds to Vav3, a member of the vav proto-oncogene family, and increases its activity []. The SH2B family of adaptor proteins contains three members: SH2B1, SH2B2 and SH2B3 [
]. Typical SH2B proteins contain an SH2 (Src homology 2) and a PH (pleckstrin homology) domain. They serve as adaptors involved in signalling by the receptors for growth factors, such as insulin-like growth factor 1, platelet-derived growth factor and nerve growth factor [].
The flagellar motor switch in Escherichia coli and Salmonella typhimurium regulates the direction of flagellar rotation and hence controls swimming behaviour. The switch is a complex apparatus that responds to signals transduced by the chemotaxis sensory signalling system during chemotactic behaviour []. The switch complex comprises at least three proteins: FliG, FliM and FliN. It has been shown that FliG interacts with FliM, FliM interacts with itself, and FliM interacts with FliN []. The proteins are not particularly hydrophobic and may be peripheral to the membrane, possibly mounted on the basal body M ring [,]. This entry represents the flagellar motor switch protein FliN. Longer proteins in which this region is a C-terminal domain are typically designated FliY.
The structure of TusA (also known as YhhP and SirA) consists of an α/β sandwich with a β-α-β-α-β(2) fold, comprising a mixed four-stranded β-sheet stacked against two α-helices, both of which are nearly parallel to the strands of the β-sheet [
]. Several uncharacterised bacterial proteins (73 to 81 amino-acid residues in length) that contain a well-conserved region in their N-terminal region show structural similarity to the TusA protein, including the E. coli protein YedF and other members of the UPF0033 family.
NOTE: TusA was previously known as SirA, but should not be confused with the sporulation inhibitor of replication protein SirA (
) or with the LuxR/UhpA family response regulator
, also known as SirA.
Halocyanins are blue (type I) copper redox proteins found in halophilic archaea such as Natronomonas pharaonis (Natronobacterium pharaonis). Halocyanin from N. pharaonis has been characterised and shown to be a small blue copper protein with a molecular mass of about 15.5 kDa [
,
]. This protein, which was named halocyanin, contains one Cu2+, with a copper-binding site containing two His, one Met, and one Cys as probable ligands. It is probable that halocyanin is a peripheral membrane protein, which serves as a mobile electron carrier. This entry represents the copper-binding domain of halocyanins. This domain is present only once in some halocyanins and is duplicated in others. It is not found in plastocyanins or certain divergent paralogs of halocyanin.
S-bacillithiolation is the formation of mixed disulfide bonds between protein thiols and the general thiol reductant bacillithiol (BSH) under oxidative stress. BSH is an equivalent of glutathione (GSH) in Firmicutes [
,
,
]. This protein family includes the bacilliredoxins BrxA and BrxB (previously known as YphP and YqiW, respectively) and uncharacterised proteins from bacteria. BrxA and BrxB debacillithiolate (remove BSH from) the S-bacillithiolated OhrR (OhrR-SSB) and the S-bacillithiolated MetE (MetE-SSB) generated in vivo by NaOCl. Both proteins are involved in maintaining redox homeostasis in response to disulfide stress conditions [
]. The crystal structure of BrxA shows similarity with a thioredoxin fold. It has a CXC motif that plays a key role in catalytic activity [].
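As an illustration of what a CXC motif means at the sequence level, the sketch below scans a one-letter protein sequence for a cysteine, any single residue, then a second cysteine. The function name and the toy sequence are hypothetical and are not taken from BrxA; a pattern hit alone says nothing about catalytic activity.

```python
import re


def find_cxc_motifs(sequence):
    """Return (0-based start, matched text) for every Cys-X-Cys occurrence.

    A lookahead is used so that overlapping hits (e.g. in "CACAC")
    are all reported.
    """
    pattern = re.compile(r"(?=(C[A-Z]C))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    toy_sequence = "MKLVCGCPAGTTWECACKD"
    for start, motif in find_cxc_motifs(toy_sequence):
        print(start, motif)
```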
This entry represents the CS p23-like domain found in dyslexia susceptibility 1 (DYX1) candidate 1 (C1) protein, DYX1C1, also known as dynein axonemal assembly factor 4 (DNAAF4). The human gene encoding this protein is a positional candidate gene for developmental dyslexia (DD); it is located on 15q21.3, close to the DYX1 DD susceptibility locus (15q15-21) [
]. Independent association studies have reported conflicting results [,
,
]. However, association of short-term memory, which plays a role in DD, with a variant within the DYX1C1 gene has been reported []. Mutations in DYX1C1 have been found in patients with primary ciliary dyskinesia []. Most proteins belonging to this group contain a C-terminal tetratricopeptide repeat (TPR) protein binding region [].
This is the Kringle-like domain (KLD) found downstream of the PKD domain in Melanocyte protein PMEL, Transmembrane glycoprotein NMB (GPNMB) and Transmembrane protein 130 (TMEM130). It contains six highly conserved cysteine residues that form the disulphide bonds of mature PMEL dimers, and it promotes PMEL functional amyloid formation [
,
]. This domain is perfectly conserved in PMEL and GPNMB, which suggests that GPNMB also forms dimers. PMEL and GPNMB, together with TMEM130 (their most ancient paralogue), have a conserved domain architecture and have recently been described as the PKAT (PKD- and KLD-Associated Transmembrane) protein family []. PMEL and GPNMB share overlapping phenotypes and disease associations, such as melanin-based pigmentation, cancer, neurodegenerative disease and glaucoma.
This entry represents a group of plant cyclin-dependent kinase inhibitors (CKIs), including KRP1-7 (Kip-related proteins 1-7) from Arabidopsis [
]. Plant CKIs fall into 2 distinct families, KIP-RELATED PROTEINS (KRPs) and SIAMESE-RELATED proteins (SMRs). KRPs share very little sequence similarity with mammalian KIP proteins outside the C-terminal conserved region. Compared to mammalian CKIs, plant KRP1-KRP7 bind to active cyclin D2 (CYCD2)/CDKA and CYCD2/CDKB complexes to a similar extent; however, they inhibit kinase activity to a different extent [
]. KRPs function as dose-dependent cell cycle inhibitors and have an important function in cell proliferation as well as in cell cycle exit and in turning from a mitotic to an endoreplicating cell cycle mode [].
This entry represents a family of TonB-dependent outer membrane receptor/transporters acting on iron-containing proteins such as haemoglobin, transferrin and lactoferrin. It contains the haem/haemoglobin receptor family and the transferrin/lactoferrin receptor family. Nearly all of the species that contain sequences in this family have access to haemoglobin, transferrin or lactoferrin or related proteins in their biological niche, and the proteins are therefore most likely to be haemoglobin transporters. Proteins in this entry include TbpA from Neisseria meningitidis serogroup B. It can form the transferrin complex that acquires iron by extracting it from serum transferrin (TF) in its human host. It contains the classic fold with a 22-strand transmembrane β-barrel encompassing a plug domain [
].
This domain can be found in Disks large homologue 1 (DLG1 or SAP97), a membrane-associated guanylate kinase protein (MAGUK) that serves as an important determinant of localization and organisation of ion channels into specific plasma membrane domains [
]. The residues upstream of this domain are the probable palmitoylation sites, particularly two cysteines. The domain has a putative PEST site at the very start that seems to be responsible for poly-ubiquitination [
]. PEST domains are polypeptide sequences enriched in proline (P), glutamic acid (E), serine (S) and threonine (T) that target proteins for rapid destruction. The whole domain, in conjunction with a C-terminal domain of the longer protein, is necessary for dimerisation of the whole protein [].
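To make the notion of P/E/S/T enrichment concrete, the following minimal sketch scores the fraction of those four residues in a sliding window. It measures composition only; established PEST predictors apply additional criteria, so this should not be read as a PEST-site predictor. The function names, the window size and the toy sequence are assumptions made for illustration.

```python
def pest_fraction(segment):
    """Fraction of residues in `segment` that are P, E, S or T."""
    if not segment:
        return 0.0
    return sum(residue in "PEST" for residue in segment.upper()) / len(segment)


def richest_window(sequence, window=12):
    """Return (start, fraction) of the window richest in P/E/S/T.

    Ties are resolved in favour of the earliest window. Sequences
    shorter than `window` are scored as a single segment.
    """
    best_start, best_fraction = 0, -1.0
    for start in range(max(1, len(sequence) - window + 1)):
        fraction = pest_fraction(sequence[start:start + window])
        if fraction > best_fraction:
            best_start, best_fraction = start, fraction
    return best_start, best_fraction


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from DLG1.
    print(richest_window("AAAAPESTPESTAAAA", window=8))
```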
This entry represents CbsB, a hydrophobic protein which is found in Sulfolobus acidocaldarius and several other members of the Sulfolobales, a branch of the Crenarchaeota [
,
]. Encoded at the same locus as CbsB are: CbsA, a high potential cytochrome b558/566, SoxL, a Rieske iron-sulphur protein, SoxN, a predicted membrane-bound b-type cytochrome b, and OdsN, a protein of unknown function. Transcriptional studies suggest that these proteins may form a complex analogous to the cytochrome bc1 complex. The redox-active subunits of this complex would consist of CbsA, SoxL and SoxN, while CbsB and OdsN would be additional non-redox-active subunits. A possible function for CbsB, based on its hydrophobicity, would be to anchor CbsA to the membrane.
This superfamily represents the catenin binding domain. Proteins containing this domain include cadherins, desmocollins, desmoglein and transcription factor 7-like proteins [
,
,
,
]. Cadherins and catenins play important roles in the adherens junctions in animal cells. The cadherin-catenin complex is responsible for coupling Ca(2+)-dependent intercellular junctions with actin dynamics and signalling pathways. Desmosomal cadherins are TM protein components of desmosomes (for review, see [
,
,
]), whose extracellular cadherin repeats are responsible for adhesion and whose intracellular regions interact with intermediate filaments via the desmosomal plaque proteins plakoglobin, plakophilin and desmoplakin. They are believed to play a wider role in regulation of epithelial differentiation []. Two sub-families of desmosomal cadherin have been identified, desmocollin (DSC) and desmoglein (DSG).
This domain is found at the N terminus of the Apolipoprotein B mRNA editing enzyme. Apobec-1 catalyzes C to U editing of apolipoprotein B (apoB) mRNA in the mammalian intestine. The N-terminal domain of APOBEC-1 like proteins is the catalytic domain, while the C-terminal domain is a pseudocatalytic domain. More specifically, the catalytic domain is a zinc-dependent deaminase domain and is essential for cytidine deamination. APOBEC-3 like members contain two copies of this domain. This family also includes the functionally homologous activation induced deaminase, which is essential for the development of antibody diversity in B lymphocytes. RNA editing by APOBEC-1 requires homodimerisation and this complex interacts with RNA binding proteins to form the editosome [] (and references therein).
Abi1, also called e3B1, is a central regulator of actin cytoskeletal reorganization through interactions with many protein complexes. It is part of WAVE, a nucleation-promoting factor complex that links Rac1 activation to actin polymerization, causing lamellipodia protrusion at the plasma membrane [
,
]. Abi1 is a target of alpha4 integrin, regulating membrane protrusions at sites of integrin engagement []. Abi proteins are adaptor proteins serving as binding partners and substrates of Abl tyrosine kinases. They are involved in regulating actin cytoskeletal reorganization and play important roles in membrane-ruffling, endocytosis, cell motility, and cell migration []. Abi proteins contain a homeobox homology domain, a proline-rich region, and an SH3 domain. This entry represents the SH3 domain of Abi1.
SNX13, also called RGS-PX1 or sorting nexin-13, contains an N-terminal PXA domain, a regulator of G protein signaling (RGS) domain, a PX domain, and a C-terminal PXC domain [
]. It specifically binds to the stimulatory subunit of the heterotrimeric G protein G(alpha)s, serving as its GTPase activating protein, through the RGS domain []. It is involved in development and regulation of endocytosis dynamics []. It influences the lysosomal targeting of the epidermal growth factor receptor, suggesting a role both in cell signaling and receptor trafficking []. This entry represents the PX domain of SNX13. The PX domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions [
,
].
This entry represents the SPRY domain found in Ssh4 (suppressor of SHR3 null mutation protein 4) and similar proteins. Ssh4 is a component of the endosome-vacuole trafficking pathway that regulates nutrient transport and may be involved in processes determining whether plasma membrane proteins are degraded or routed to the plasma membrane [
]. The SPRY domain in Ssh4 may be involved in cargo recognition, either directly or by combination with other adaptors, possibly leading to a higher selectivity. In yeast, Ssh4 and the homologous protein Ear1 (endosomal adapter of RSP5) recruit Rsp5p, an essential ubiquitin ligase of the Nedd4 family, and assist it in its function at multivesicular bodies by directing the ubiquitylation of specific cargoes [].
This family contains several seemingly unrelated proteins, including human esterase D
; mycobacterial antigen 85 [
], which is responsible for the high affinity of mycobacteria for fibronectin; Corynebacterium glutamicum major secreted protein PS1; and a number of proteins from Escherichia coli, yeast, mycobacteria and Haemophilus influenzae. Proteins in this entry share the typical AB hydrolase fold and include esterases with broad specificity. Among them are S-formylglutathione hydrolases, such as YeiG and FrmB from bacteria, and the three related serine hydrolases Iron(III) enterobactin esterase fes (
), Iron(III) salmochelin esterase iroD (
) and iroE (
) from E. coli which are esterases for the apo and Fe3-bound forms of enterobactin iroB [
,
].
DNA modification/repair radical SAM protein, putative
Type:
Family
Description:
This entry represents a family of uncharacterised proteins of about 400 amino acids in length that contain a radical SAM domain in the N-terminal half. Members of this family are present in about twenty percent of prokaryotic genomes, always paired with a member of the conserved hypothetical protein
. Roughly forty percent of the members of that family exist as fusions with a uracil-DNA glycosylase-like region,
. In DNA, uracil results from deamination of cytosine, forming U/G mismatches that lead to mutation, and so uracil-DNA glycosylase is a DNA repair enzyme. This indirect connection, and the recurring role of radical SAM proteins in modification chemistries, suggest that this protein may act in DNA modification, repair, or both.
This entry represents a repeat of about 40 amino acids found in a variety of archaea and bacteria which can be present in up to 14 copies per protein. The archaeal species Methanosarcina mazei (Methanosarcina frisia) contains several predicted surface layer proteins (SLPs) containing tandem copies of this repeat [
]. The crystal structure of one of these proteins (), containing seven tandem copies of this repeat, was examined. Individually, these repeats form four-stranded beta blades, while the seven copies together form a seven-bladed beta propeller domain. This repeat shows sequence similarity to the WD-40 repeat (
) and may play a similar role, serving as a rigid scaffold for protein interactions.
Synembryns are proteins that positively regulate synaptic transmission. They are actively required to maintain proper activity of the Go to Gq G-protein signalling network, which regulates neurotransmitter secretion in Caenorhabditis elegans by controlling the production and consumption of diacylglycerol []. In addition to its role in the adult nervous system, synembryn is required to regulate a subset of centrosome movements in the early C. elegans embryo []. The protein appears to be concentrated in the cytoplasm of neurons. However, unlike other protein components of the Go-Gq signalling network, synembryn appears to be more concentrated in the cell soma than in axonal processes throughout the nervous system []. C. elegans synembryn reduction-of-function mutants exhibit partial embryonic lethality [].
This entry represents a domain found in a group of divergent 3' exoribonucleases. The proteins constitute a typical RNase fold, where the active site residues form a magnesium catalytic centre. The protein of the solved structure readily cleaves 3' overhangs in a time-dependent manner. It is similar to DEDD-type RNases and is an unusual ATP-binding protein that binds ATP and dATP. It forms a dimer in solution and both protomers in the asymmetric unit bind a magnesium ion through Asp-6 [
]. Proteins containing this domain also include 3'-5' exonuclease dexA from bacteriophage T4. It may play a role in the final step of host DNA degradation, by scavenging DNA into mononucleotides [
,
].
This entry represents a domain found in a group of putative suppressors of RNA silencing proteins, P20-P25, from ssRNA positive-strand viruses such as those of the Closterovirus, Potyvirus and Cucumovirus families. RNA silencing is one of the major mechanisms of defence against viruses, and, in response, some viruses have evolved or acquired functions for suppression of RNA silencing. These counter-defensive viral proteins with RNA silencing suppressor (RSS) activity were originally discovered in the members of plant virus genera Potyvirus and Cucumovirus. Each of the conserved blocks of amino acids found in P21-like proteins corresponds to a computer-predicted α-helix, with the most C-terminal element being 42 residues long. This suggests conservation of the predominantly α-helical secondary structure in the P21-like proteins [
].
PPAK is a repeated protein motif found in the PEVK (Pro-Glu-Val-Lys) domain of the titin protein and in a number of other proteins. Titin (
) is a giant elastic protein found in striated muscle that is a key component in the assembly and functioning of sarcomeres [
]. PPAK motifs (PPAK refers to the four amino acids found at the beginning of the motif) occur 60 times in human soleus titin []. PPAK motifs occur in groups of 2-12 that are separated by regions rich in glutamic acid (approximately 45%) and termed polyE segments. The charge fluctuation between the PPAK and polyE regions suggests ionic interactions between these segments and their involvement in the elastic function of titin.
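The alternation of PPAK motifs with glutamate-rich segments described above can be illustrated with a short sketch that locates each PPAK occurrence and reports the glutamic acid content of an intervening segment. The function names and the toy sequence are hypothetical; the sequence is not derived from titin.

```python
def motif_starts(sequence, motif="PPAK"):
    """0-based start positions of every occurrence of `motif`."""
    starts = []
    position = sequence.find(motif)
    while position != -1:
        starts.append(position)
        position = sequence.find(motif, position + 1)
    return starts


def glutamate_fraction(segment):
    """Fraction of residues in `segment` that are glutamic acid (E)."""
    return segment.count("E") / len(segment) if segment else 0.0


if __name__ == "__main__":
    # Hypothetical toy sequence: three PPAK motifs and one E-rich spacer.
    toy_sequence = "PPAKVEVPPAKEEEEKEEAEPPAK"
    starts = motif_starts(toy_sequence)
    print(starts)
    # Segment between the end of the second motif and the start of the third.
    spacer = toy_sequence[starts[1] + 4:starts[2]]
    print(spacer, glutamate_fraction(spacer))
```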
The L27_2 domain is a protein-protein interaction domain capable of organising scaffold proteins into supramolecular assemblies by formation of heteromeric L27_2 domain complexes. L27_2 domain-mediated protein assemblies have been shown to play essential roles in cellular processes including asymmetric cell division, establishment and maintenance of cell polarity, and clustering of receptors and ion channels. Members of this family form specific heterotetrameric complexes, in which each domain contains three α-helices. The two N-terminal helices of each L27_2 domain pack together to form a tight, four-helix bundle in the heterodimer, whilst the third helix of each L27_2 domain forms another four-helix bundle that assembles the two units of the heterodimer into a tetramer [
].
This domain, found in group 3 of Schlafen proteins from mammals, including Schlafen 5, 8, 9, 11 and 13, represents the helicase domain. Schlafen proteins are involved in the control of cell proliferation, induction of immune responses, and in the regulation of viral replication [
,
,
,
]. These proteins inhibit DNA replication and promote cell death in response to DNA damage. They play a role in genome surveillance to kill cells with defective replication []. This domain is also found in various prokaryotic proteins fused to a DNA helicase, GIY-YIG or PD-(D/E)XK catalytic domain or HsdR-N(terminal) domain, which are similar to AAA DNA helicase, Type III restriction enzyme ATPase, RecD and RuvB helicase [].
This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram-positive and Gram-negative bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. It binds as a dimer [
] to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin []. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template [].
This entry represents a domain found in some transcriptional regulatory proteins of the two-component systems (which usually consist of a histidine protein kinase and a response regulator protein) from bacteria. The family is found in association with
. Proteins containing this domain include DpiA from E. coli, CitT from Bacillus subtilis and CitB from Klebsiella pneumoniae. DpiA is a member of the two-component regulatory system DpiA/DpiB, which is essential for expression of citrate-specific fermentation genes and genes involved in plasmid inheritance [
]. CitT is a member of the two-component regulatory system CitT/CitS [], while CitB is a member of the two-component regulatory system CitA/CitB essential for expression of citrate-specific fermentation genes [].
This entry represents the second of five domains on the Kluyveromyces lactis Ndc10 protein [
]. Each subunit of the Ndc10 dimer binds a separate fragment of DNA, suggesting that Ndc10 stabilises a DNA loop at the centromere. Proteins containing this domain also include Cbf2 from S. cerevisiae. S. cerevisiae Ndc10 (also known as Cbf2) is a component of the centromere DNA-binding protein complex CBF3, which is essential for chromosome segregation and movement of centromeres along microtubules [,
,
]. The structure of the protein shows that it can be divided into two α-helical lobes, the N-lobe and the C-lobe. The N-lobe consists of an antiparallel four-helix bundle with a central β-sheet which is involved in DNA binding [].
Plasmodium falciparum cysteine-rich protective antigen (PfCyRPA) is a 42.8kDa protein of 362 residues with a predicted N-terminal secretion signal. It is part of a multi-protein complex including the PfRH5-interacting protein PfRipr and the reticulocyte binding-like homologous protein PfRH5, which binds to the erythrocyte receptor basigin. PfRH5, PfCyRPA, and PfRipr colocalize during parasite invasion at the junction between merozoites and erythrocytes. The complex seems to be required both for triggering Ca2+ release and establishment of tight junctions. PfCyRPA adopts a 6-bladed β-propeller structure with similarity to the classic sialidase fold, but it has no sialidase activity and fulfills a purely non-enzymatic function. Each blade of the propeller is constructed by a four-stranded anti-parallel β-sheet [
].
In mammals, SAP25 is involved in the transcriptional repression mediated by the mSIN3 complex, which consists of at least SAP30, SAP45/Sds3, SAP130, SAP180/BCAA, RBP1, HDAC1 (histone deacetylase1), HDAC2, RbAp46, and RbAp48 [
]. The mSIN3 complex can be recruited by sequence-specific DNA binding transcription factors and chromatin-binding proteins to specific regions of the genome and regulate their transcription []. SAP25 binds to the PAH1 domain of mSin3A and changes the conformation of the complex, which affects its protein-protein interactions []. SAP25 can be actively exported from the nucleus to the cytoplasm by a CRM1-dependent nuclear export pathway. Its localisation is regulated by promyelocytic leukemia protein (PML), which induces a nuclear accumulation of SAP25 [].
This entry represents repeat 2 of Tail fiber protein from Bacteriophage lambda (Stf or gp27) and similar proteins found in the tailed bacteriophages Caudovirales and in bacterial prophages. The repeats are about 40 residues long. The strain of Bacteriophage lambda used in most laboratories in the early 1990s carried some mutations with respect to the wild type. Stf is the product of one of the affected genes; it allows the virus to bind to an additional outer membrane receptor and accelerates the rate of adsorption onto the host cell surface, but with a higher frequency of failed infection [
,
]. This repeat is also found in Tail fiber protein H (K) [
] and Tail fiber protein S (S) [].
The IncW plasmid pSa contains the gene ard, which encodes an antirestriction protein that is specific for type I restriction and modification systems [
]. The protein has no significant similarities with other known Ard proteins (ArdA and ArdB types) except for the "antirestriction" motif (14 amino acid residues in length). This is conserved in all known Ard proteins. ArdC does, however, have a high degree of similarity (about 38% identity) to the N-terminal region of RP4 TraC1 primase, which includes about 300 amino acid residues and seems essential for binding to single-stranded DNA, as well as TraC1 transport to the recipient cells during the transfer of plasmid DNA. ArdC also binds to single-stranded DNA [
].
This entry represents the MI domain (after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids [
,
,
]. It appears in several translation factors and is found as:
One copy in plant and animal eIF4G 1 and 2 (DAP-5/NAT1/p97).
Two copies in the animal programmed cell death protein 4 (PDCD4) or MA-3, which is induced during programmed cell death and inhibits neoplastic transformation.
Four tandem-repeated copies in a group of uncharacterised plant proteins.
The MI domain consists of seven α-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the α-helical axes [
]. The MI domain has also been named MA3 domain.
This domain identifies a group of proteins that are described as General vesicular transport factor, Transcytosis-associated protein (TAP) and Vesicle docking protein. This myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region [
]. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the Golgi stack. This domain is found in the acidic C-terminal region, which binds to the golgins giantin and GM130. p115 is thought to juxtapose two membranes by binding giantin with one acidic region, and GM130 with another [].
The downstream neighbour of Son (DONSON) protein, also known as protein humpty dumpty (Hd) in Drosophila, is essential for DNA amplification in the ovary, and is required for cell proliferation during development (the humpty dumpty name arises from the thin-eggshell phenotype of hd mutants) [
]. Depletion of the Hd protein has been shown to cause severe defects in genome replication and to result in DNA damage. Hd is largely found in nuclear foci; some may traverse the nuclear envelope. Its expression peaks during late G1 and S phase, and it responds to transcription factor E2F1/Dp []. The Hd protein sequence is conserved from plants to humans, and may constitute a new gene family required for cell proliferation in multicellular eukaryotes [].
Photosynthesis system II assembly factor Ycf48/Hcf136
Type:
Family
Description:
This entry represents a family of proteins predominantly found in the thylakoid membrane of plant chloroplasts and cyanobacteria, including Photosystem II assembly lipoprotein Ycf48 from Synechocystis sp. [
,
] and Photosystem II stability/assembly factor HCF136, chloroplastic from Arabidopsis thaliana []. The photosynthesis system II (PSII) is a multi-subunit pigment-protein complex responsible for water oxidation during oxygenic photosynthesis [
]. Its assembly is regulated by PSII assembly factors including Ycf48/Hcf136, Psb27 and Psb28. Ycf48/Hcf136 contains the BNR repeats and can be found predominantly in the thylakoid membrane. Ycf48/Hcf136 is required for assembly and repair of an early intermediate in PSII assembly that includes D2 (psbD) and cytochrome b559 [,
]. This entry also includes some Ycf48-like proteins that may not affect PSII activity.
The COG complex comprises eight proteins (COG1-8) and plays critical roles in Golgi structure and function. It is necessary for retrograde trafficking in the Golgi apparatus and for protein glycosylation [
,
]. The COG complex, together with the exocyst, Golgi-associated retrograde protein (GARP) and Dsl1 complexes, is a member of the CATCHR (complexes associated with tethering containing helical rods) family, whose members share an evolutionary origin and structural features []. This domain is found in the C-terminal region of COG complex subunit 2 proteins and consists of a conserved α-helical bundle. In Arabidopsis, COG2 forms a complex with FPP3/VETH1 and FPP2/VETH2 and ensures the correct secondary cell wall (SCW) deposition pattern by recruiting exocyst components to cortical microtubules in xylem cells during secondary cell wall deposition [
].
This superfamily represents domains related by a common ancestor that have a Rossmann-like, 3-layer, alpha/beta/alpha sandwich fold. Protein families in which the domain is found include:
Nucleotidylyl transferases, such as cytidylyltransferases [] and adenylyltransferases [].
Class I aminoacyl-tRNA synthetases (catalytic domain), such as tyrosyl-tRNA synthetase and glutaminyl-tRNA synthetase [].
Pantothenate synthetases [].
ATP sulphurylase (central domain) [].
N-type ATP pyrophosphatases, such as beta-lactam synthetase and GMP synthase [].
PP-loop ATPases, such as the cell cycle protein MesJ (N-terminal domain) [].
Phosphoadenylyl sulphate (PAPS) reductase [].
Electron transfer flavoprotein (ETFP) subunits, such as the N-terminal domains of the alpha and beta subunits [].
Universal stress protein A (UspA) [].
Cryptochrome and DNA photolyase [].
Proteins in this entry are members of the glucose-methanol-choline oxidoreductase family of flavoenzymes [
]. These enzymes catalyse diverse reactions and include glucose dehydrogenase, alcohol oxidase (
), glucose oxidase (
), choline dehydrogenase (
), and cyclase atC from Aspergillus terreus which oxidizes terremutin to terreic acid, a quinone epoxide inhibitor of a tyrosine kinase [
]. Structural studies indicate that these proteins are composed of an N-terminal FAD-binding domain, and a C-terminal substrate-binding domain [
,
,
]. The FAD-binding domain forms the α-β fold typical of dinucleotide binding proteins, while the substrate-binding domain consists of a β-sheet surrounded by α-helices. The general topology of these proteins is conserved, though inserted structural elements occur in both choline dehydrogenase and alcohol dehydrogenase [].
Cyclophilins exhibit peptidyl-prolyl cis-trans isomerase (PPIase) activity, accelerating protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. They also have protein chaperone-like functions and are the major high-affinity binding proteins for the immunosuppressive drug cyclosporin A (CSA) in vertebrates. Cyclophilins are found in all prokaryotes and eukaryotes, and have been structurally conserved throughout evolution, implying their importance in cellular function. They share a common 109 amino acid cyclophilin-like domain (CLD) and additional domains unique to each member of the family. The CLD contains the PPIase activity, while the unique domains are important for selection of protein substrates and subcellular compartmentalisation. This entry represents the core β-barrel cyclophilin-like domain.
Cyclophilin-type peptidyl-prolyl cis-trans isomerase, conserved site
Type:
Conserved_site
Description:
Cyclophilins exhibit peptidyl-prolyl cis-trans isomerase (PPIase) activity, accelerating protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. They also have protein chaperone-like functions and are the major high-affinity binding proteins for the immunosuppressive drug cyclosporin A (CSA) in vertebrates. Cyclophilins are found in all prokaryotes and eukaryotes, and have been structurally conserved throughout evolution, implying their importance in cellular function. They share a common 109 amino acid cyclophilin-like domain (CLD) and additional domains unique to each member of the family. The CLD contains the PPIase activity, while the unique domains are important for selection of protein substrates and subcellular compartmentalisation. This entry represents a conserved site in the central part of these enzymes.
This entry represents the N-terminal substrate-binding domain of the Lon protease. This ATP-dependent enzyme, a serine peptidase belonging to the MEROPS peptidase family S16, is conserved in archaeal, bacterial and eukaryotic organisms and catalyses rapid turnover of short-lived regulatory proteins and many damaged or denatured proteins. In eukaryotes, most of these proteins are located in the mitochondrial matrix. In yeast, Pim1 is located in the mitochondrial matrix and is required for mitochondrial function. It is constitutively expressed, but its expression increases after thermal stress, suggesting that Pim1 may play a role in the heat shock response. The structure of this domain has been determined, and it represents a general protein and polypeptide interaction domain.
In contrast to other RNA-binding domains, the dsRBD domain, which is about 65 amino acids long, has been found in a number of proteins that specifically recognise double-stranded RNAs. The dsRBD domain is also known as DSRM (Double-Stranded RNA-binding Motif). dsRBD proteins are mainly involved in posttranscriptional gene regulation, for example by preventing the expression of proteins or by mediating RNA localization. This domain is also found in RNA editing proteins. Interaction of the dsRBD with RNA is unlikely to involve the recognition of specific sequences. Nevertheless, multiple dsRBDs may be able to act in combination to recognise the secondary structure of specific RNAs (e.g. in Staufen). NMR analysis of the third dsRBD of Drosophila Staufen has revealed an α-β-β-β-α structure.
These repeats were first identified in many cyanobacterial proteins, but they are also found in other bacterial proteins as well as in plant proteins. The repeats were first identified in hglK. Pentapeptide repeat proteins (PRPs) are characterised by the repetition of the pentapeptide repeat motif [S,T,A,V][D,N][L,F][S,T,R][G], which allows them to adopt a right-handed β-helical conformation. The function of these repeats is unknown, but it has been shown that members of this family share the ability to interact with DNA-binding proteins, such as DNA gyrase. For example, McbG (from Escherichia coli) protects DNA gyrase from microcin B17 toxicity, while MfpAMt (from Mycobacterium tuberculosis) and Qnr (from Klebsiella pneumoniae and other enterobacteria) are involved in resistance to fluoroquinolones.
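The pentapeptide repeat motif quoted above maps directly onto a regular expression, since each bracketed group lists the residues allowed at one position. The following is a minimal sketch under that assumption; the function names and the toy sequence are hypothetical and chosen only for illustration, and a plain regex scan is a simplification that will not catch degenerate repeats the way a profile-based search would.

```python
import re

# Motif from the text: [S,T,A,V][D,N][L,F][S,T,R][G], one residue class per position.
PENTAPEPTIDE = re.compile(r"[STAV][DN][LF][STR]G")

# One or more motif copies placed back to back (tandem repeats).
TANDEM_RUN = re.compile(r"(?:[STAV][DN][LF][STR]G)+")


def find_repeats(sequence):
    """Return (0-based start, matched pentapeptide) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in PENTAPEPTIDE.finditer(sequence.upper())]


def longest_tandem_run(sequence):
    """Return the number of repeats in the longest uninterrupted tandem run."""
    runs = TANDEM_RUN.finditer(sequence.upper())
    return max((len(m.group()) // 5 for m in runs), default=0)


# Hypothetical toy sequence, not taken from any real protein.
toy = "MKADLSGANLSGTDFRGQQ"
print(find_repeats(toy))        # [(2, 'ADLSG'), (7, 'ANLSG'), (12, 'TDFRG')]
print(longest_tandem_run(toy))  # 3
```

Counting tandem runs rather than isolated hits is a deliberate choice here: a single five-residue match is likely to occur by chance in many proteins, whereas consecutive repeats are what the text associates with the β-helical conformation.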
This family of peroxiredoxins includes osmotically inducible protein C (OsmC), a stress-induced protein found in Escherichia coli. This family also contains organic hydroperoxide resistance protein (Ohr), which has a novel pattern of oxidative stress regulation. The transcription of the osmC gene of E. coli is regulated as a function of the phase of growth and is induced during the late exponential phase, when the growth rate slows before entry into stationary phase. Transcription is initiated by two overlapping promoters, osmCp1 and osmCp2. Ohr from Xanthomonas campestris pv. phaseoli is highly induced by organic hydroperoxides, weakly induced by H2O2, and not induced at all by a superoxide generator. Ohr may be a new type of organic hydroperoxide detoxification protein.
The bacterial DnaA protein plays an important role in initiating and regulating chromosomal replication. DnaA is an ATP- and DNA-binding protein. It binds specifically to 9 bp nucleotide repeats known as dnaA boxes, which are found in the chromosome origin of replication (oriC). DnaA contains two conserved regions: the first is located in the N-terminal half and corresponds to the ATP-binding domain; the second is located in the C-terminal half and could be involved in DNA binding. The protein may also bind the RNA polymerase beta subunit, the dnaB and dnaZ proteins, and the groE gene products (chaperonins). This entry represents the chromosomal replication control initiator DnaA, as well as the DnaA homologue Hda, which is also involved in chromosomal replication control.
This entry represents the KH domain found in proteins homologous to the Bacillus subtilis RNA-binding protein KhpB, also known as Jag and EloR, which is associated with SpoIIIJ and is necessary for the third stage of sporulation. It forms a complex with KhpA which binds to cellular RNA and controls its expression. It also plays a role in peptidoglycan (PG) homeostasis and cell length regulation. The KH domain is a β-α-α-β-β unit that folds into an α/β structure with a three-stranded β-sheet interrupted by two contiguous helices. In general, the KH domain binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA.
This entry represents the IPT domain of the recombination signal Jkappa binding protein (RBP-Jkappa, also known as RBP-J). RBP-Jkappa was initially considered to be involved in V(D)J recombination because of its DNA binding specificity and structural similarity to site-specific recombinases known as the integrase family. Further studies indicated that RBP-Jkappa functions as a repressor of transcription, via destabilization of the general transcription factor IID and recruitment of histone deacetylase complexes. Later, it was found to be a transcriptional regulator downstream of Notch receptors. IPT/TIG (immunoglobulin, plexin, transcription factor-like/transcription factor immunoglobulin) domains adopt an immunoglobulin-like fold and have been found in proteins involved in transcriptional regulation and signal transduction; they can also participate in protein-protein interactions.
This entry represents a group of helicases, including Dna2 and Nam7. Proteins in this family contain a conserved domain with a P-loop motif that is characteristic of the AAA superfamily. They are DEAD-like helicases belonging to superfamily (SF)1, a diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. Like SF2 helicases, and unlike SF3-6 helicases, SF1 helicases do not form toroidal structures. Their helicase core consists of two similar protein domains that resemble the fold of the recombination protein RecA. Dna2 is a DNA replication factor with single-stranded DNA-dependent ATPase, ATP-dependent nuclease (5'-flap endonuclease) and helicase activities. Nam7 (also known as Upf1) is an ATP-dependent RNA helicase involved in the nonsense-mediated mRNA decay (NMD) pathway.
NifL from Azotobacter vinelandii senses both the redox and fixed nitrogen status to regulate nitrogen fixation. NifL acts by modulating the activity of the nitrogen fixation positive regulator protein NifA; NifL inhibits NifA in response to oxygen and high levels of fixed nitrogen. NifA and NifL are encoded by adjacent genes. NifL is FAD-containing, and has a domain architecture similar to that of the cytoplasmic histidine protein kinases, containing two N-terminal PAS domains and a C-terminal transmitter region containing a conserved histidine residue (H domain) and a nucleotide-binding GHKL domain corresponding to the catalytic core of the histidine kinases. However, NifL does not exhibit kinase activity and regulates its partner NifA by direct protein-protein interactions rather than phosphorylation.
Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor that binds Prp8, with which it forms a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8. Aar2p binds directly to the RNaseH-like domain in the C-terminal region of Prp8p. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth. This entry consists of the N-terminal domain of eukaryotic Aar2 and Aar2-like proteins.
Doublesex- and mab-3-related transcription factor 1-like
Type:
Family
Description:
Members of this family are found in eukaryotes, are typically between 61 and 73 amino acids in length, and are suggested to be transcription factors. In Xenopus laevis, doublesex- and mab-3-related transcription factor 1a (dmrt1-a) is a transcription factor that plays a key role in male sex determination and differentiation by controlling testis development and germ cell proliferation. It acts both as a transcription repressor and activator. However, the other family member in Xenopus laevis, doublesex- and mab-3-related transcription factor DM-W (dm-w), is a transcription factor that plays a key role in female sex determination and primary ovary development. It acts as a sex-determining protein by antagonizing the transcriptional activity of the male-determination protein dmrt1-a, acting as a dominant-negative type protein.