Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1, which associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 and E3 causes a change in the viral surface. Together the E1, E2, and sometimes E3 glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. This entry represents the alphaviral E3 glycoprotein. Most alphaviruses lose the peripheral protein E3, but in Semliki Forest virus it remains associated with the viral surface.
Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin ligase complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins. This entry represents the cullin-homology domain superfamily. This domain is composed of three subdomains: a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.
This entry represents the oligomerisation domain of the breakpoint cluster region oncoprotein Bcr, and of the Bcr/Abl (Abelson-leukemia-virus) fusion protein created by a reciprocal t(9;22) translocation. Bcr displays serine/threonine protein kinase activity and acts as a GTPase-activating protein for RAC1 and CDC42. Bcr also promotes the exchange of RAC- or CDC42-bound GDP for GTP, thereby activating them. The Bcr/Abl fusion protein loses some of the regulatory function of Bcr with regard to small Rho-like GTPases, with negative consequences for cell motility, in particular for the capacity to adhere to endothelial cells. The Bcr, Bcr/Abl oncoprotein oligomerisation domain consists of a short N-terminal helix (alpha-1), a flexible loop and a long C-terminal helix (alpha-2). Together these form an N-shaped structure, with the loop allowing the two helices to assume a parallel orientation. The monomeric domains associate into a dimer through the formation of an antiparallel coiled coil between the alpha-2 helices and domain swapping of two alpha-1 helices, where one alpha-1 helix swings back and packs against the alpha-2 helix from the second monomer. Two dimers then associate into a tetramer. The oligomerisation domain is essential for the oncogenicity of the Bcr-Abl protein.
Herpesviruses are enveloped by a lipid bilayer that contains at least a dozen glycoproteins. The virion surface glycoproteins mediate recognition of susceptible cells and promote fusion of the viral envelope with the cell membrane, leading to virus entry. No single glycoprotein associated with the virion membrane has been identified as the fusogen. Glycoprotein L (gL) forms a non-covalently linked heterodimer with glycoprotein H (gH). This heterodimer is essential for virus-cell and cell-cell fusion, since the association of gH and gL is necessary for correct localisation of gH to the virion or cell surface. gH anchors the heterodimer to the plasma membrane through its transmembrane domain. gL lacks a transmembrane domain and is secreted from cells when expressed in the absence of gH. This entry represents Herpesvirus glycoprotein L (gL), which is a virion-associated envelope glycoprotein. Heterodimer formation between gH and gL has been demonstrated in both virions and infected cells. Heterodimer formation between gL and gH is important for the proper folding of gH and its insertion into the membrane, because the anti-gH conformation-dependent monoclonal antibodies (mAbs) 53S and LP11 bind gH only when gL is present.
Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1, which associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 and E3 causes a change in the viral surface. Together the E1, E2, and sometimes E3 glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. This entry represents the alphaviral E2 glycoprotein. The E2 glycoprotein interacts with the nucleocapsid through its cytoplasmic domain, while its ectodomain is responsible for binding a cellular receptor.
This superfamily represents SMAD (Mothers against decapentaplegic (MAD) homolog) domains (also called MH2, for MAD homology 2) as well as their structural homologues, such as the transactivation domain of interferon regulatory factor 3 (IRF3), both of which have a β-sandwich structural fold. SMAD domains are found at the carboxy terminus of MAD-related proteins such as Smads. SMAD domain proteins are found in a range of species from nematodes to humans. These highly conserved proteins contain an N-terminal MH1 domain that contacts DNA, and is separated by a short linker region from the C-terminal MH2 domain, the latter showing a striking similarity to FHA domains. SMAD proteins mediate signalling by the TGF-beta/activin/BMP-2/4 cytokines from receptor Ser/Thr protein kinases at the cell surface to the nucleus. SMAD proteins fall into three functional classes: the receptor-regulated SMADs (R-SMADs), including SMAD1, -2, -3, -5, and -8, each of which is involved in a ligand-specific signalling pathway; the co-mediator SMADs (co-SMADs), including SMAD4, which interact with R-SMADs to participate in signalling; and the inhibitory SMADs (I-SMADs), including SMAD6 and -7, which block the activation of R-SMADs and co-SMADs, thereby negatively regulating signalling pathways.
Herpesviruses are enveloped by a lipid bilayer that contains at least a dozen glycoproteins. The virion surface glycoproteins mediate recognition of susceptible cells and promote fusion of the viral envelope with the cell membrane, leading to virus entry. No single glycoprotein associated with the virion membrane has been identified as the fusogen. Glycoprotein L (gL) forms a non-covalently linked heterodimer with glycoprotein H (gH). This heterodimer is essential for virus-cell and cell-cell fusion, since the association of gH and gL is necessary for correct localisation of gH to the virion or cell surface. gH anchors the heterodimer to the plasma membrane through its transmembrane domain. gL lacks a transmembrane domain and is secreted from cells when expressed in the absence of gH. This entry represents Herpesvirus glycoprotein H (gH), which is a virion-associated envelope glycoprotein. Heterodimer formation between gH and gL has been demonstrated in both virions and infected cells. Heterodimer formation between gL and gH is important for the proper folding of gH and its insertion into the membrane, because the anti-gH conformation-dependent monoclonal antibodies (mAbs) 53S and LP11 bind gH only when gL is present.
This domain superfamily is found in the yeast cell wall assembly regulator Smi1 (also known as Knr4). Saccharomyces cerevisiae Knr4 has a regulatory role in chitin deposition and in cell wall assembly. It is believed to connect the PKC1-SLT2 MAPK pathway with cell proliferation. It has been shown to form a complex with Bck2, the product of a gene involved in cell cycle progression in S. cerevisiae, allowing PKC1 to coordinate the cell cycle (cell proliferation) with cell wall integrity. Knr4 also interacts with the tyrosine-tRNA synthetase protein encoded by Tys1 and is involved in the sporulation process. Proteins containing this domain also include the animal F-box only protein 3 (FBXO3). In humans, FBXO3 is a substrate recognition component of the SCF (SKP1-CUL1-F-box protein)-type E3 ubiquitin ligase complex. Interestingly, Smi1/Knr4 homologues from bacteria are potential immunity proteins in a subset of contact-dependent inhibitory toxin systems. Note: previously reported evidence that Knr4 may interact with nuclear matrix-association regions may be due to an artefact. This domain superfamily is also found in the Syd protein, which interacts with the SecY protein.
The outer and inner segments of vertebrate rod photoreceptor cells contain phosducin, a soluble phosphoprotein that complexes with the beta/gamma-subunits of the GTP-binding protein, transducin. Light-induced changes in cyclic nucleotide levels modulate the phosphorylation of phosducin by protein kinase A. The protein is thought to participate in the regulation of visual phototransduction or in the integration of photoreceptor metabolism. Similar proteins have been isolated from the pineal gland and it is believed that the functional role of the protein is the same in both retina and pineal gland. This superfamily represents the N-terminal domain of phosducin. Together with the C-terminal domain, it covers one side and the top of the seven-bladed beta propeller of Gt beta gamma. The binding of phosducin induces a distinct structural change in the beta propeller of Gt beta gamma, such that a small cavity opens up between blades 6 and 7. Binding of phosducin results in sequestration of beta gamma from the membrane to the cytosol and turns off the signal-transduction cascade. Regulation of this membrane association/dissociation switch of Gt beta gamma by phosducin may be a general mechanism for attenuation of G protein-coupled signal transduction cascades.
RNF4 is a SUMO-targeted E3 ubiquitin-protein ligase with a pivotal function in the DNA damage response (DDR) through interacting with the deubiquitinating enzyme ubiquitin-specific protease 11 (USP11), a known DDR component, and further facilitating DNA repair. It plays a role in preventing the loss of intact chromosomes and ensures the maintenance of chromosome integrity. Moreover, RNF4 is responsible for the UbcH5A-catalyzed formation of K48 chains that target SUMO-modified promyelocytic leukemia (PML) protein for proteasomal degradation in response to arsenic treatment. It also interacts with telomeric repeat binding factor 2 (TRF2) in a small ubiquitin-like modifier (SUMO)-dependent manner and preferentially targets SUMO-conjugated TRF2 for ubiquitination through SUMO-interacting motifs (SIMs). Furthermore, RNF4 can form a complex with a Ubc13-ubiquitin conjugate and Ube2V2. It catalyzes K63-linked polyubiquitination by the Ube2V2-Ubc13 (ubiquitin-loaded) complex. RNF4 also negatively regulates nuclear factor kappa B (NF-kappaB) signaling by down-regulating transforming growth factor beta (TGF-beta)-activated kinase 1 (TAK1)-TAK1-binding protein 2 (TAB2). This protein family also includes RNF4 orthologues such as E3 ubiquitin-protein ligase complex SLX5-SLX8 subunit SLX8 from Saccharomyces cerevisiae and E3 ubiquitin-protein ligase complex slx8-rfp subunit rfp2 from Schizosaccharomyces pombe, and similar proteins predominantly found in animals and fungi.
The BSD domain is a domain of about 60 residues named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primitive protozoa to humans. It can be found associated with other domains, such as the BTB domain or the U-box, in multidomain proteins. The function of the BSD domain is unknown. Secondary structure prediction indicates the presence of three alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain. Proteins known to contain one or two BSD domains include: mammalian TFIIH basal transcription factor complex p62 subunit (GTF2H1); yeast RNA polymerase II transcription factor B 73 kDa subunit (TFB1), the homologue of BTF2; yeast DOS2 protein, which is involved in single-copy DNA replication and ubiquitination; Drosophila synapse-associated protein SAP47; mammalian SYAP1; and Arabidopsis thaliana (mouse-ear cress) TFB1-1 (TFB1A) and TFB1-3 (TFB1C).
This entry includes the heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1), a nuclear factor that binds to Pol II transcripts. The family of hnRNP proteins is involved in numerous RNA-related activities. hnRNPA1 is involved in the packaging of pre-mRNA into hnRNP particles, transport of poly(A) mRNA from the nucleus to the cytoplasm and modulation of splice site selection. hnRNPA1 plays a role in the splicing of pyruvate kinase PKM by binding repressively to sequences flanking PKM exon 9, inhibiting exon 9 inclusion and resulting in exon 10 inclusion and production of the PKM M2 isoform. This protein binds to the IRES and inhibits the translation of the apoptosis protease activating factor APAF1. During Enterovirus 71 (EV71) infection, APAF1 expression is essential for virus-induced apoptosis and viral particle release. hnRNPA1 consists of two RNA recognition motifs (RRM1 and RRM2) followed by an unstructured low complexity (LC) C-terminal domain, represented in this entry. This domain, shared by hnRNPA1 and hnRNPA2/B1, can mediate protein-protein and protein-RNA interactions and is involved in amyloid aggregation, leading to neurodegenerative diseases including amyotrophic lateral sclerosis (ALS) and multisystem proteinopathy (MSP).
Stomatin is also known as erythrocyte membrane protein band 7.2b. It is a 31 kDa membrane protein, and was named after the rare human disease haemolytic anaemia hereditary stomatocytosis. The protein contains a single hydrophobic domain, close to the N terminus, and is phosphorylated. Stomatin is believed to be involved in regulating monovalent cation transport through lipid membranes. Absence of the protein in hereditary stomatocytosis is believed to be the reason for the leakage of Na+ and K+ ions into and from erythrocytes. A second function of stomatin is to act as a cytoskeletal anchor. One possible example of this is its interaction with some anti-malarial drugs. Current opinion speculates that such drugs bind to high density lipoproteins in serum. The lipoproteins are delivered to erythrocytes, where it is believed they interact with stomatin as a means of transfer to the intracellular parasite, via a pathway used for the uptake of exogenous phospholipid. Stomatin-like proteins have been identified in various organisms, including Caenorhabditis elegans and Mus musculus. This domain covers a small conserved region located about 110 residues after the transmembrane domain.
A Disintegrin and Metalloproteinase (ADAM) is a family of proteolytic enzymes that regulate shedding of membrane-bound proteins, growth factors, cytokines, ligands and receptors. This group of proteins is fundamental to many control processes in development and homeostasis, and they are linked to pathological states. ADAM is a transmembrane protein that contains a disintegrin and metalloprotease domain (MEROPS peptidase family M12B). All members of the ADAM family display a common domain organisation: a pro-domain, followed by the metalloprotease, disintegrin, cysteine-rich, epidermal growth factor-like and transmembrane domains, and a C-terminal cytoplasmic tail. They possess four potential functions: proteolysis, cell adhesion, cell fusion, and cell signalling.
ADAMs are membrane-anchored proteases that proteolytically modify the cell surface and extracellular matrix (ECM) in order to alter cell behaviour. They are responsible for the proteolytic cleavage of transmembrane proteins and release of their extracellular domain. The ADAM cysteine-rich domain is not found in plant, archaeal, bacterial or viral proteins. The cysteine-rich domain complements the binding capacity of the disintegrin domain, and perhaps imparts specificity to disintegrin domain-mediated interactions. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity.
The SidE family includes four large proteins, SidE, SdeA, SdeB and SdeC, which are required for efficient intracellular bacterial replication and catalyse ubiquitination in an E1/E2-independent manner. These proteins contain four domains: a DUB domain, a phosphodiesterase (PDE) domain, a mono-ADP-ribosyltransferase (mART) domain, and a coiled-coil (CC) domain. Ubiquitination is a post-translational modification that regulates many cellular processes; the conventional ubiquitination cascade culminates in a covalent linkage between the C terminus of ubiquitin (Ub) and a target protein. SidE family proteins can catalyse the non-canonical ubiquitination of several different substrate proteins, including Rab small GTPases, Reticulon-4 (Rtn4), and Rag small GTPases, as well as SidE proteins themselves. This specificity resides in the specific ubiquitin-binding surfaces of mART and the unique features of the PDE domain. The DUB activity of SdeA is important for regulating the dynamics of ubiquitin association with the bacterial phagosome, but is not necessary for its role in intracellular bacterial replication. This entry represents the mono-ADP-ribosyltransferase (mART) domain that mediates the mono-ADP-ribosylation of ubiquitin, which is then transferred to serine residues of host proteins.
This entry represents the PX domain found in Sorting nexin-6 (SNX6). SNX6 was found to interact with members of the transforming growth factor-beta family of receptor serine/threonine kinases. Strong heteromeric interactions were also seen among SNX1, -2, -4, and -6, suggesting the formation in vivo of oligomeric complexes. SNX6 is localized in the cytoplasm, where it is thought to target proteins to the trans-Golgi network. In addition, SNX6 was found to be translocated from the cytoplasm to the nucleus by Pim-1, an oncogene product with serine/threonine kinase activity. This translocation is not affected by Pim-1-dependent phosphorylation, but the functional significance is unknown. The Phox Homology (PX) domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain-containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and in the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.
The HEPN (higher eukaryotes and prokaryotes nucleotide-binding) domain is a region of 110 residues found in the C terminus of sacsin, a chaperonin implicated in an early-onset neurodegenerative disease in humans, and in many bacterial and archaebacterial proteins. Proteins with a HEPN domain fall into three classes. The first comprises single-domain HEPN proteins found in many bacteria. The second comprises two-domain proteins with N-terminal nucleotidyltransferase (NT) and C-terminal HEPN domains; this N-terminal NT domain belongs to a large family of NTs, which includes several classes of enzymes that are responsible for some types of bacterial resistance to aminoglycosides. These enzymes deactivate various antibiotics by transferring a nucleotidyl group to the drug. The third is a multidomain sacsin protein in genomes of fish and mammals, in which the HEPN domain is located at the C terminus of the protein, directly after the DnaJ domain.
The crystal structure of the HEPN domain from the TM0613 protein of Thermotoga maritima indicates that it is structurally similar to the C-terminal all-α-helical domain of kanamycin nucleotidyltransferases (KNTases). It is composed of five alpha helices, three of which form an up-and-down helical bundle, with a pair of short helices on the side. The distant structural similarity suggests that the HEPN domain might be involved in nucleotide binding.
This superfamily is represented by the bacteriophage T4 Gp32 single-stranded DNA-binding protein. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Single-stranded DNA-binding protein (also known as Gp32 or SSB) is essential for bacteriophage T4 DNA replication, recombination and repair, acting to stimulate replisome processivity and accuracy through its binding to ssDNA as the replication fork advances. The crystal structure of Gp32 shows an ssDNA binding cleft comprised of regions from three structural subdomains, through which ssDNA can slide freely. The structure of Gp32 is similar to other phage ssDNA-binding proteins such as Gp2.5 from bacteriophage T7, and gene V protein, both of which have a nucleic acid-binding OB-type fold. However, Gp32 contains a zinc-finger subdomain at residues 63-111 that is not found in the other two phage proteins. This protein stimulates the activities of viral DNA polymerase and DnaB-like SF4 replicative helicase, probably via its interaction with the helicase assembly factor, and together with DnaB-like SF4 replicative helicase and the helicase assembly factor, promotes pairing of two homologous DNA molecules containing complementary single-stranded regions and mediates homologous DNA strand exchange.
This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example is the human ubiquitin C-terminal hydrolase UCH-L1.
Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinating proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases. The deubiquitinating proteases can be split into two size ranges: 20-30 kDa (this entry) and 100-200 kDa. The 20-30 kDa group includes the yeast yuh1, which is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.
A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family.
Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases.
Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.
Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.
Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad.
Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
MADS genes in plants encode key developmental regulators of vegetative and reproductive development. The majority of the plant MADS proteins share a stereotypical MIKC structure. It comprises (from N- to C-terminal) an N-terminal domain, which is, however, present only in a minority of proteins; a MADS domain (see
,
), which is the major determinant of DNA-binding but which also performs dimerisation and accessory factor binding functions; a weakly conserved intervening (I) domain, which constitutes a key molecular determinant for the selective formation of DNA-binding dimers; a keratin-like (K-box) domain, which promotes protein dimerisation; and a C-terminal (C) domain, which is involved in transcriptional activation or in the formation of ternary or quaternary protein complexes. The 80-amino acid K-box domain was originally identified as a region with low but significant similarity to a region of keratin, which is part of the coiled-coil sequence constituting the central rod-shaped domain of keratin [
,
,
]. The K-box protein-protein interaction domain, which mediates heterodimerization of MIKC-type MADS proteins, contains several heptad repeats in which the first and the fourth positions are occupied by hydrophobic amino acids, suggesting that the K-box domain forms three amphipathic α-helices referred to as K1, K2, and K3 [
].
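The heptad-repeat pattern described above (hydrophobic residues at the first and fourth positions of each seven-residue repeat) can be illustrated with a minimal sketch. The hydrophobic residue set, the function names and the toy sequence are assumptions made for illustration; they are not taken from any K-box analysis.

```python
# Assumed set of hydrophobic one-letter amino acid codes (illustrative choice).
HYDROPHOBIC = set("AILMFVWY")


def heptad_score(seq, offset=0):
    """Fraction of first/fourth heptad positions that are hydrophobic.

    The sequence is read from `offset`, and positions 0 and 3 of every
    block of seven residues are treated as the first and fourth positions.
    """
    hits = 0
    total = 0
    for i, aa in enumerate(seq[offset:]):
        if i % 7 in (0, 3):
            total += 1
            if aa in HYDROPHOBIC:
                hits += 1
    return hits / total if total else 0.0


def best_register(seq):
    """Return the offset (0-6) whose heptad register scores highest."""
    return max(range(7), key=lambda off: heptad_score(seq, off))


# Hypothetical toy sequence: three perfect heptads with Leu at positions 1 and 4.
toy = "LEELKKK" * 3
print(heptad_score(toy), best_register(toy))
```

A high score in one register and low scores in the others is what would be expected of a segment able to form an amphipathic helix; real K-box sequences would need a proper coiled-coil predictor.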
The homeobox domain or homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well-conserved in many other animals, including vertebrates [
,
]. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies []. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two α-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure.
The motif is very similar in sequence and structure in a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeotic proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereo-chemical requirement for glycine in the turn which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily [
] that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins [
,
,
,
,
], and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance. The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains: the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity. This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.
This entry represents both eukaryotic mitochondrial porins and Tom40 proteins. Eukaryotic mitochondrial porins are voltage-dependent anion-selective channels (VDAC) that behave as general diffusion pores for small hydrophilic molecules [
,
,
,
]. The channels adopt an open conformation at low or zero membrane potential and a closed conformation at potentials above 30-40 mV. The eukaryotic mitochondrial porins are β-barrel proteins, composed of between 12 and 16 β-strands that span the mitochondrial outer membrane. Yeast contains two members of this family (genes POR1 and POR2); vertebrates have at least three members (genes VDAC1, VDAC2 and VDAC3) []. They are related to the mitochondrial import receptor subunit Tom40 proteins, sharing a common evolutionary origin and structure []. Tom40 is a mitochondrial outer membrane protein and a component of the TOM (translocator of the outer mitochondrial membrane) complex, which is essential for import of protein precursors into mitochondria [
]. In Saccharomyces cerevisiae, the TOM complex is composed of the subunits Tom70, Tom40, Tom22, Tom20, Tom7, Tom6, and Tom5 [,
]. Tom40 is an integral membrane protein and the main structural component of the protein-conducting channel formed by the TOM complex []. It is stabilised by other components, such as Tom5, Tom6, and Tom7 [].
This entry represents the PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain found in 60S ribosome subunit biogenesis protein NIP7, some UPF0113 family members, such as KD93, and similar proteins found in eukaryotes and some archaeal species [
,
,
]. PUA domains are predicted to bind RNA molecules with complex folded structures []. NIP7 is required for efficient 60S ribosome subunit biogenesis and has been shown to interact with another essential nucleolar protein, Nop8p, and the exosome subunit Rrp43p. These proteins are required for 60S subunit synthesis and may be part of a dynamic complex involved in this process. Nip7 orthologues share a two-domain architecture with the C-terminal PUA domain mediating interaction with RNA, suggesting that Nip7 is an adaptor protein with the C-terminal domain interacting with RNA targets and the N-terminal domain mediating interaction with protein targets. Structural analyses of the RNA-interacting surfaces of the orthologues from Saccharomyces cerevisiae and Pyrococcus abyssi Nip7 indicate that, in the archaeal PUA domain, C-terminal positively charged residues (arginines and lysines) are involved in RNA interaction while equivalent positions in eukaryotic orthologues are occupied by mostly hydrophobic residues. Both proteins can bind specifically to polyuridine, and RNA interaction requires specific residues of the PUA domain as determined by site-directed mutagenesis [
,
,
].
DNA protection during starvation protein, gammaproteobacteria
Type:
Family
Description:
This group belongs to the ferritin domain superfamily and has the ferritin-like structural fold. Ferritins constitute a broad superfamily of iron storage proteins, widespread in all domains of life [
,
]. Ferritins and bacterioferritins have essentially the same architecture, assembling in a 24mer cluster to form a hollow, roughly spherical, construction. This consists of a mineral core of hydrated ferric oxide and a multi-subunit protein shell, which encloses the former and assures its solubility in an aqueous environment. Due to the absence of the C-terminal fifth helix of 24mer ferritins, members of the Dps group assemble only to dodecameric protein shells []. Members of this entry were originally discovered as stress proteins, which protect DNA against oxidative stress during nutrient starvation [
], hence the name Dps (DNA protection during starvation protein). Several members of the group, such as Dps from Escherichia coli, exhibit a DNA-binding activity that is at least partially linked with iron complexation []. DNA binding by these proteins was shown to suffice for protection against oxidative DNA damage and might be mediated by magnesium ions, which bridge the protein surfaces with the polyanionic DNA [,
]. Dps also contributes to defense against copper stress in growing cells of E. coli [].
This entry represents a family of proteins that are involved in enzyme assembly and/or maturation:
The TorD protein is involved in the maturation of the trimethylamine N-oxide reductase TorA (a DMSO reductase family member) in Escherichia coli []. TorA is a molybdenum-containing enzyme which requires the insertion of a bis(molybdopterin guanine dinucleotide) molybdenum (bis(MGD)Mo) cofactor in its catalytic site to be active and translocated to the periplasm. TorD acts as a chaperone, binding to apoTorA and promoting efficient incorporation of the cofactor into the protein.
Nitrate reductase delta subunit (NarJ). This subunit is not part of the nitrate reductase enzyme but is a chaperone required for proper molybdenum cofactor insertion and final assembly of the nitrate reductase [
,
,
]. NarJ exhibits sequence homology to chaperones involved in maturation and cofactor insertion of E. coli redox enzymes that are mediated by twin-arginine translocase (Tat) dependent translocation []. The archetypal Tat proofreading chaperones belong to the TorD family [].
Twin-arginine leader-binding protein DmsD, which could be required for the biogenesis of DMSO reductase rather than for the targeting of DmsA to the inner membrane [
,
,
].
Dimethyl sulphide dehydrogenase protein DdhD. This protein is thought to function as a chaperone protein in the assembly of an active dimethyl sulphide dehydrogenase DdhABC [
].
A number of serum transport proteins are known to be evolutionarily related, including albumin, alpha-fetoprotein, vitamin D-binding protein and afamin [
,
,
]. Albumin is the main protein of plasma; it binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs. Its main function is to regulate the colloidal osmotic pressure of blood. Alpha-fetoprotein (alpha-fetoglobulin) is a foetal plasma protein that binds various cations, fatty acids and bilirubin. Vitamin D-binding protein binds to vitamin D and its metabolites, as well as to fatty acids. The biological role of afamin (alpha-albumin) has not yet been characterised. The 3D structure of human serum albumin has been determined by X-ray crystallography to a resolution of 2.8 Å [
]. It comprises three homologous domains that assemble to form a heart-shaped molecule []. Each domain is a product of two subdomains that possess common structural motifs []. The principal regions of ligand binding to human serum albumin are located in hydrophobic cavities in subdomains IIA and IIIA, which exhibit similar chemistry. Structurally, the serum albumins are similar, each domain containing five or six internal disulphide bonds, as shown schematically below:
                   +---+          +----+                        +-----+
                   |   |          |    |                        |     |
xxCxxxxxxxxxxxxxxxxCCxxCxxxxCxxxxxCCxxxCxxxxxxxxxCxxxxxxxxxxxxxxCCxxxxCxxxx
  |                 |       |     |              |               |
  +-----------------+       +-----+              +---------------+
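As a quick consistency check on the schematic above, the cysteine positions in the domain pattern can be counted programmatically. This is a minimal sketch: the variable and function names are illustrative, and the only assumption is that every `C` in the pattern takes part in one disulphide bond.

```python
# Domain pattern copied from the schematic; 'C' marks a conserved cysteine,
# 'x' any other residue.
SCHEMATIC = (
    "xxCxxxxxxxxxxxxxxxxCCxxCxxxxCxxxxxCCxxxCxxxxxxxxx"
    "CxxxxxxxxxxxxxxCCxxxxCxxxx"
)


def cysteine_positions(pattern):
    """Return the 0-based indices of every cysteine in the pattern."""
    return [i for i, residue in enumerate(pattern) if residue == "C"]


def max_disulphides(pattern):
    """Upper bound on disulphide bonds, assuming each cysteine pairs once."""
    return len(cysteine_positions(pattern)) // 2


print(len(cysteine_positions(SCHEMATIC)), "cysteines,",
      max_disulphides(SCHEMATIC), "possible disulphide bonds")
```

Twelve cysteines allow at most six disulphide bonds per domain, which agrees with the "five or six" stated in the text; the three adjacent `CC` pairs are the positions where neighbouring bridges meet.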
Stathmin [
] (from the Greek 'stathmos', which means relay) is a ubiquitous intracellular protein, present in a variety of phosphorylated forms. It is involved in the regulation of the microtubule (MT) filament system by destabilising microtubules. It prevents assembly and promotes disassembly of microtubules []. However, when phosphorylated, its destabilisation ability is significantly reduced []. The stathmin family also includes:
Stathmin 2 (Protein SCG10); a neuron-specific, membrane-associated protein that accumulates in the growth cones of developing neurons. It is highly similar in its sequence to stathmin, but differs in that it contains an additional N-terminal hydrophobic segment of 32 residues which is probably responsible for its interaction with membranes.
Stathmin 3 (SCG10-like protein; SCLIP) [
]; a protein specifically expressed in neurons.
Stathmin 4 (Stathmin-like protein B3); which contains an additional N-terminal hydrophobic domain [].
These proteins possess a stathmin-like domain (SLD) with various N-terminal extensions. SLD is a highly conserved domain of 149 amino acid residues. Structurally, it consists of an N-terminal domain of about 45 residues followed by a 78 residue α-helical domain consisting of a heptad repeat coiled coil structure and a C-terminal domain of 25 residues [
,
]. The SLD binds two tubulins arranged longitudinally, head-to-tail, in protofilament-like complexes.
ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)). It is likely that the peptidase domain is involved in the cleavage of the polyprotein [
]. The nucleotide sequence for the RNA of PLrV has been determined [
,
]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].
The HORMA domain (for HOP1, REV7 and MAD2) is a region of about 180-240 amino acids containing several conserved motifs. Whereas the MAD2 and the REV7 proteins are almost entirely made up of HORMA domains, HOP1 contains a HORMA domain in its N-terminal region and a Zn-finger domain, whose general arrangement of metal-chelating residues is similar to that of the PHD finger, in the C-terminal region. The HORMA domain is found in proteins showing a direct association with chromatin of all crown group eukaryotes. It has been suggested that the HORMA domain recognises chromatin states that result from DNA adducts, double-stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins involved in repair [
]. Secondary structure prediction suggests that the HORMA domain is globular and could potentially form a complex β-sheet(s) with associated α-helices [
]. Some proteins known to contain a HORMA domain are listed below:
Eukaryotic HOP1, a conserved protein that is involved in meiotic-synaptonemal-complex assembly.
Eukaryotic mitotic-arrest-deficient 2 protein (MAD2), a key component of the mitotic-spindle-assembly checkpoint [
].
Eukaryotic REV7, a subunit of the DNA polymerase zeta that is involved in translesion, template-independent DNA synthesis.
Fungal Atg13, adaptor protein for the Atg1 kinase complex [
,
].
Fragile X messenger ribonucleoprotein 1 (FMRP/FMR1), and its paralogues, FXR1 and FXR2 (RNA-binding protein FXR1 and 2, respectively), comprise a family of RNA-binding proteins [
]. FMR1 protein is a multifunctional polyribosome-associated protein that plays a central role in neuronal development and synaptic plasticity. It regulates alternative mRNA splicing, mRNA stability, mRNA dendritic transport and postsynaptic local protein synthesis of a subset of mRNAs [
,
,
]. FMR1 is thought to bind target mRNA in the nucleus to form a ribonucleoprotein complex which is transported to dendrites and spines []. It is also required for ovary development and function []. FMR1 has also been shown to interact with components of the miRNA pathway []. A large expansion of the CGG trinucleotide repeat in the 5'-untranslated region of the FMR1 gene causes Fragile X syndrome (FXS), an inherited developmental disorder that causes a broad range of intellectual and physical challenges []. FXR1 and FXR2 interact with FMR1 and seem to have a related role; therefore they are likely to play important roles in the function of FMR1 and in the pathogenesis of FXS [
,
]. FXR1 is highly expressed in vertebrate muscle cells and is required for proper development of this tissue [,
].
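The CGG trinucleotide repeat mentioned above can be measured in a sequence with a short regular-expression scan. This is only an illustrative sketch: the function name and the toy sequence are hypothetical, and no clinical repeat-length thresholds are implied.

```python
import re


def longest_cgg_run(seq):
    """Return the number of CGG units in the longest uninterrupted CGG tract."""
    # Each match is a maximal run of back-to-back CGG triplets.
    runs = re.findall(r"(?:CGG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical toy 5'-UTR fragment: a tract of five repeats and one of two.
toy_utr = "AT" + "CGG" * 5 + "A" + "CGG" * 2
print(longest_cgg_run(toy_utr))
```

The scan only counts perfect, uninterrupted repeats; interrupted tracts would need a more tolerant approach.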
The PAS (Per, Arnt, Sim) domain [
,
] is an approximately 300 amino-acid segment of sequence similarity which is conserved between the Drosophila protein period clock (PER), the Ah receptor nuclear translocator (ARNT) and the Drosophila single-minded (SIM). It is composed of two or more imperfect repeats (PAS-1, PAS-2). In addition, some proteins have another similar region of 40-45 amino acids situated carboxy-terminal to any PAS repeat and which contributes to the PAS structural domain: the PAC motif. The PAS family can be divided into two groups: the proteins that have the PAS motif followed by a PAC motif, and those that do not. It appears that these domains are directly linked, and that together they form the conserved 3D PAS fold. The division between the PAS and PAC domains is caused by major differences in sequences in the region connecting these two motifs []. Within the bHLH/PAS proteins, the PAS domain is involved in protein dimerization with another protein of the family. It has also been associated with light reception, light regulation and circadian rhythm regulators (clock). In bacteria, the PAS domain is usually associated with the input domain of a histidine kinase, or a sensor protein that regulates a histidine kinase.
Nonsense-mediated mRNA decay (NMD) is a surveillance mechanism by which eukaryotic cells detect and degrade transcripts containing premature termination codons. Three 'up-frameshift' proteins, UPF1, UPF2 and UPF3, are essential for this process in organisms ranging from yeast to humans and plants [
]. Exon junction complexes (EJCs) are deposited ~24 nucleotides upstream of exon-exon junctions after splicing. Translation causes displacement of the EJCs; however, premature translation termination upstream of one or more EJCs triggers the recruitment of UPF1, UPF2 and UPF3 and activates the NMD pathway [
,
]. This entry contains UPF3. The crystal structure of the complex between human UPF2 and UPF3b, which are, respectively, a MIF4G (middle portion of eIF4G) domain and an RNP domain (ribonucleoprotein-type RNA-binding domain) has been determined to 1.95 Å. The protein-protein interface is mediated by highly conserved charged residues in UPF2 and UPF3b and involves the β-sheet surface of the UPF3b ribonucleoprotein (RNP) domain, which is generally used by these domains to bind nucleic acids. In UPF3b the RNP domain does not bind RNA, whereas the UPF2 construct and the complex do. It is clear that some RNP domains have evolved for specific protein-protein interactions rather than as nucleic acid binding modules [
].
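The EJC-based trigger described above can be expressed as a small positional check. This is a simplified sketch under stated assumptions: coordinates are 0-based positions on the spliced mRNA, the 24-nucleotide offset is taken from the text, and the function names and example coordinates are hypothetical.

```python
def ejc_positions(junctions, offset=24):
    """Approximate EJC sites: `offset` nucleotides upstream of each junction."""
    return [j - offset for j in junctions if j - offset >= 0]


def predicts_nmd(stop_pos, junctions, offset=24):
    """True if translation stops upstream of at least one EJC.

    A ribosome terminating before an EJC leaves that complex on the mRNA,
    which is the situation the text describes as activating the NMD pathway.
    """
    return any(stop_pos < ejc for ejc in ejc_positions(junctions, offset))


# Hypothetical transcript with exon-exon junctions at 300, 600 and 900.
junctions = [300, 600, 900]
print(predicts_nmd(400, junctions))  # stop codon before the last two EJCs
print(predicts_nmd(950, junctions))  # stop codon downstream of every EJC
```

The check deliberately ignores everything except relative position; it is meant to make the rule in the paragraph concrete, not to predict NMD for real transcripts.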
ORF2 of Potato leafroll virus (PLrV) encodes a polyprotein which is translated following a -1 frameshift. The polyprotein has a putative linear arrangement of membrane anchor-VPg-peptidase-polymerase domains. The serine peptidase domain which is found in this group of sequences belongs to MEROPS peptidase family S39 (clan PA(S)), subfamily S39B. It is likely that the peptidase domain is involved in the cleavage of the polyprotein [
]. The nucleotide sequence for the RNA of PLrV has been determined [
,
]. The sequence contains six large open reading frames (ORFs). The 5' coding region encodes two polypeptides of 28K and 70K, which overlap in different reading frames; it is suggested that the third ORF in the 5' block is translated by frameshift readthrough near the end of the 70K protein, yielding a 118K polypeptide []. Segments of the predicted amino acid sequences of these ORFs resemble those of known viral RNA polymerases, ATP-binding proteins and viral genome-linked proteins. The nucleotide sequence of the genomic RNA of Beet western yellows virus (BWYV) has been determined []. The sequence contains six long ORFs. A cluster of three of these ORFs, including the coat protein cistron, display extensive amino acid sequence similarity to corresponding ORFs of a second luteovirus: Barley yellow dwarf virus [].
This entry is limited to Chordata proteins. Somatomedin B (SMB), a serum factor of unknown function, is a small cysteine-rich peptide, derived proteolytically from the N terminus of the cell-substrate adhesion protein vitronectin [
]. Cys-rich somatomedin B-like domains are found in a number of proteins [], including ectonucleotide pyrophosphatase/phosphodiesterase family member proteins (previously known as plasma-cell membrane glycoprotein) [] and placental protein 11 (also known as Poly(U)-specific endoribonuclease), which appears to possess amidolytic activity. The SMB domain of vitronectin has been demonstrated to interact with both the urokinase receptor and the plasminogen activator inhibitor-1 (PAI-1) and the conserved cysteines of the NPP1 somatomedin B-like domain have been shown to mediate homodimerisation [
]. The SMB domain contains eight Cys residues, arranged into four disulphide bonds. It has been suggested that the active SMB domain may be permitted considerable disulphide bond heterogeneity or variability, provided that the Cys25-Cys31 disulphide bond is preserved. The three dimensional structure of the SMB domain is extremely compact and the disulphide bonds are packed in the centre of the domain forming a covalently bonded core [
]. The structure of the SMB domain presents a new protein fold, with the only ordered secondary structure being a single-turn α-helix and a single-turn 3(10)-helix [].
This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22A, the type example being presenilin 1 from Homo sapiens (Human). Presenilins are polytopic transmembrane (TM) proteins, mutations in which are associated with the occurrence of early-onset familial Alzheimer's disease, a rare form of the disease that results from a single-gene mutation [,
]. The physiological functions of presenilins are unknown, but they may be related to developmental signalling, apoptotic signal transduction, or processing of selected proteins, such as the beta-amyloid precursor protein (beta-APP). There are a number of subtypes which belong to this presenilin family. That presenilin homologues have been identified in species that do not have an Alzheimer's disease correlate suggests that they may have functions unrelated to the disease, homologues having been identified in Mus musculus (Mouse), Drosophila melanogaster, Caenorhabditis elegans [] and other members of the eukarya including plants. In humans, there are two presenilin genes (PS1 and PS2) that share 67% amino acid identity, the greatest divergence between the two falling in the N terminus and in the large hydrophilic loop towards the C terminus of each molecule. Six to nine TM domains are predicted for each, and biochemical analysis has demonstrated that their C-termini are cytoplasmic; but the orientation of their N-termini and large hydrophilic loops remains to be resolved. They are expressed in almost all tissues, including the brain and, at a cellular level, they have been localised to the nuclear envelope, endoplasmic reticulum and Golgi apparatus. Presenilin 1 has been shown to be phosphorylated by protein kinase C, and is endogenously cleaved into 28 kDa N-terminal and 19 kDa C-terminal fragments. Consequently, little of the uncleaved peptide is detectable in vivo. PS1 gene mutations are thought to account for the majority of early-onset familial Alzheimer's disease cases. To date, 45 different mutations have been identified in PS1, all but one of which result in a single amino acid change in the presenilin 1 molecule. Affected residues always occur in regions of the sequence that are conserved between presenilins 1 and 2, and the C. elegans homologue, sel-12 []. The mutations are thought to be responsible for ~50% of cases of early-onset familial Alzheimer's disease; in contrast, less than 1% result from mutations in PS2. How the mutations trigger disease is unknown, but one biochemical effect consistently associated with them is an alteration in the proteolytic cleavage of beta-APP such that there is overproduction of long-tailed beta-amyloid peptide derivatives.
Presenilin-1 (MEROPS identifier A22.001) has been identified as the gamma-secretase that performs one of the cleavages that leads to the release of the amyloid-beta peptide from its precursor protein. Processing of the amyloid beta precursor requires an initial cleavage by beta-secretase (now identified as BACE-1), which releases the extracellular domain, followed by cleavages by presenilin-1 to release the amyloid-beta peptide and an intracellular domain [
]. Released amyloid-beta peptide can form the plaques that are deposited in the brains of patients suffering from Alzheimer's disease []. Presenilin-1 is a transmembrane peptidase in which the active site is buried in the membrane [,
]. Native presenilin-1 is processed in the large loop between the sixth and seventh transmembrane regions to form a two-chain protein, with a larger N-terminal domain [,
]. The heterodimer forms a complex with three other membrane proteins, nicastrin, Aph-1 and Pen-2 []. In addition to the amyloid beta protein, presenilin-1 also cleaves Notch, various cadherins, Notch ligands Delta and Jagged, and other proteins [,
].
Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms [
]. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors []. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:
Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates []. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors [].
Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle [
].
Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia [
]. Neuroglobin belongs to a branch of the globin family that diverged early in evolution.
Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin [
].
Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers [
].
Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation.
Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants [
].
Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin [
].
Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression [
,
].
Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors [
].
Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 α-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features [
].
This entry represents erythrocruorins (Ec), giant extracellular haemoglobins freely dissolved in the blood of annelids and arthropods, rather than packaged in cells [
]. Ec proteins are assembled from up to 200 haemoglobin subunits, some of which are disulphide-bonded, as well as non-haemoglobin linker subunits. For example, an Ec from Lumbricus terrestris (Common earthworm) consists of 144 oxygen-binding haemoglobin subunits and 36 non-haemoglobin linker subunits, where the haemoglobin subunits are arranged in dodecameric substructures []. The 3D structures of a number of Ec proteins are known. The protein is largely α-helical, eight conserved helices (A to H) providing the scaffold for a well-defined haem-binding pocket. The imidazole ring of the 'proximal' His residue provides the fifth haem iron ligand; the other axial haem iron position remains essentially free for oxygen coordination. Many Ec proteins lack the 'distal' His and Val residues that are conserved in vertebrate globins.
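The subunit counts quoted above for the Lumbricus terrestris Ec can be checked with a few lines of arithmetic. This is only a minimal sketch: the function name and the returned keys are illustrative, and the only inputs are the figures given in the text (144 haemoglobin subunits, 36 linkers, dodecameric grouping).

```python
# Figures taken from the description of the Lumbricus terrestris erythrocruorin.
HAEMOGLOBIN_SUBUNITS = 144
LINKER_SUBUNITS = 36
DODECAMER_SIZE = 12  # "dodecameric" means 12 haemoglobin subunits per substructure


def ec_composition(globins=HAEMOGLOBIN_SUBUNITS,
                   linkers=LINKER_SUBUNITS,
                   group=DODECAMER_SIZE):
    """Summarise the assembly (hypothetical helper, not a database API)."""
    if globins % group:
        raise ValueError("haemoglobin subunits do not divide into whole groups")
    dodecamers = globins // group
    return {
        "total_subunits": globins + linkers,
        "dodecamers": dodecamers,
        "linkers_per_dodecamer": linkers / dodecamers,
    }


if __name__ == "__main__":
    print(ec_composition())
```

The total of 180 subunits agrees with the "as many as 180 subunits" quoted for erythrocruorins in the list of globin types.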
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O2 and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. In cyanobacteria there are five extrinsic proteins in the OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants [
]. This family represents the PSII OEC protein PsbO, which appears to be the most important extrinsic protein for oxygen evolution. PsbO lies closest to the Mn cluster where water oxidation occurs, and has a stabilising effect on the Mn cluster. As a result, PsbO is often referred to as the Mn-stabilising protein (MSP), although none of its amino acids are likely ligands for Mn. Calcium ions were found to modify the conformation of PsbO in solution [
].
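The water-splitting step catalysed by the OEC, described above as yielding O2 and 4H+, can be written as a balanced half-reaction. The four electrons are not spelled out in the text; they are added here only because charge balance requires them.

```latex
\[
2\,\mathrm{H_2O} \;\longrightarrow\; \mathrm{O_2} + 4\,\mathrm{H^+} + 4\,e^-
\]
```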
The Macro or A1pp domain is a module of about 180 amino acids which can bind ADP-ribose (an NAD metabolite) or related ligands. Binding to ADP-ribose could be either covalent or non-covalent [
]: in certain cases it is believed to bind non-covalently []; while in other cases (such as Aprataxin) it appears to bind both non-covalently through a zinc finger motif, and covalently through a separate region of the protein []. The domain was described originally in association with ADP-ribose 1''-phosphate (Appr-1''-P) processing activity (A1pp) of the yeast YBR022W protein []. The domain is also called Macro domain as it is the C-terminal domain of mammalian core histone macro-H2A [,
]. Macro domain proteins can be found in eukaryotes, in (mostly pathogenic) bacteria, in archaea and in ssRNA viruses, such as coronaviruses [,
], Rubella and Hepatitis E viruses. In vertebrates the domain occurs e.g. in histone macroH2A, in predicted poly-ADP-ribose polymerases (PARPs) and in B aggressive lymphoma (BAL) protein. The macro domain can be associated with catalytic domains, such as PARP, or sirtuin. The Macro domain can recognise ADP-ribose or in some cases poly-ADP-ribose, which can be involved in ADP-ribosylation reactions that occur in important processes, such as chromatin biology, DNA repair and transcription regulation []. The human macroH2A1.1 Macro domain binds an NAD metabolite O-acetyl-ADP-ribose []. The Macro domain has been suggested to play a regulatory role in ADP-ribosylation, which is involved in inter- and intracellular signaling, transcriptional regulation, DNA repair pathways and maintenance of genomic stability, telomere dynamics, cell differentiation and proliferation, and necrosis and apoptosis. The 3D structure of the SARS-CoV Macro domain has a mixed α/β fold consisting of a central seven-stranded twisted mixed β-sheet sandwiched between two α-helices on one face, and three on the other. The final α-helix, located on the edge of the central β-sheet, forms the C terminus of the protein [
]. The crystal structure of AF1521 (a Macro domain-only protein from Archaeoglobus fulgidus) has also been reported and compared with other Macro domain containing proteins. Several Macro domain only proteins are shorter than AF1521, and appear to lack either the first strand of the β-sheet or the C-terminal helix 5. Well conserved residues form a hydrophobic cleft and cluster around the AF1521-ADP-ribose binding site [,
,
,
].
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. A number of eukaryotic and viral proteins contain a conserved cysteine-rich domain of 40 to 60 residues (called C3HC4 zinc-finger or 'RING' finger) [
] that binds two atoms of zinc. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as "RING-H2 finger". The 3D structure [
] of the zinc ligation system is referred to as the "cross-brace" motif. This atypical conformation is also shared by the FYVE and PHD domains. Many proteins containing a RING finger play a key role in the ubiquitination pathway. The ubiquitination pathway generally involves three types of enzyme, known as E1, E2 and E3. E1 is a ubiquitin-activating enzyme and E2 is a ubiquitin-conjugating enzyme; E1 acts first and passes ubiquitin to E2. E3 enzymes are ubiquitin protein ligases, responsible for substrate recognition. It has been shown [
,
] that several RING fingers act as E3 enzymes in the ubiquitination process.
Tubby, an autosomal recessive mutation, mapping to mouse chromosome 7, was recently found to be the result of a splicing defect in a novel gene with unknown function. This mutation maps to the tub gene [
,
]. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alström and Bardet-Biedl. Although these phenotypes indicate a vital role for tubby proteins, no biochemical function has yet been ascribed to any family member []; however, it has been suggested that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development []. Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed β-barrel that surrounds a central α-helix (at the extreme carboxyl terminus of the protein) that forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors [
]. This entry represents conserved sites found in the C-terminal domain. The site closest to the C terminus contains a penultimate cysteine residue that could be critical to the normal functioning of these proteins.
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two zinc ions [
]. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. FYVE-type domains are divided into two known classes: FYVE domains that specifically bind to phosphatidylinositol 3-phosphate in lipid bilayers and FYVE-related domains of undetermined function []. Those that bind to phosphatidylinositol 3-phosphate are often found in proteins targeted to lipid membranes that are involved in regulating membrane traffic [,
,
]. Most FYVE domains target proteins to endosomes by binding specifically to phosphatidylinositol-3-phosphate at the membrane surface. By contrast, the CARP2 FYVE-like domain is not optimized to bind to phosphoinositides or insert into lipid bilayers. FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif, a basic R(R/K)HHCR patch, and a C-terminal RVC motif.
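The three FYVE signature sequences named above (WxxD, R(R/K)HHCR and RVC) translate directly into regular expressions. The sketch below is a rough illustration rather than a validated domain detector: the function name is hypothetical, the test sequence is invented, and real FYVE annotation relies on profile-based methods rather than on three short motifs.

```python
import re

# Signature motifs as stated in the text; 'x' is written as '.' (any residue).
FYVE_SIGNATURES = (
    ("WxxD", re.compile(r"W..D")),
    ("R(R/K)HHCR", re.compile(r"R[RK]HHCR")),
    ("RVC", re.compile(r"RVC")),
)


def find_fyve_signatures(sequence):
    """Return {motif: start index} if all three motifs occur in N-to-C order.

    Returns None when any motif is missing or out of order.
    """
    seq = sequence.upper()
    hits = {}
    position = 0
    for name, pattern in FYVE_SIGNATURES:
        match = pattern.search(seq, position)
        if match is None:
            return None
        hits[name] = match.start()
        position = match.end()  # the next motif must lie further C-terminal
    return hits


if __name__ == "__main__":
    # Invented toy sequence, not a real protein.
    print(find_fyve_signatures("AAWAEDGGGRKHHCRAAAARVCGG"))
```

Searching each motif only downstream of the previous hit enforces the N-terminal, central and C-terminal ordering described in the text.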
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type (which is C-terminal to the CHY-type zinc finger) and a RING finger. The CHY-type zinc finger has no known function, but the region encompassing the CTCHY-type zinc finger is required for binding to p53 in mammals [
]. The CTCHY-type zinc finger has so far only been found in Pirh2 proteins. It binds 3 zinc atoms as shown in the following schematic representation:
The CTCHY-type zinc finger:
               +--+------------+------+
               |  |            |      |
CxxCxxxxxxxxxxHCxxCxxCxxxxxxxxxHCxxCxxCxxxxxxxxHxC
|  |          |      |          |  |           | |
+--+----------+------+          +--+-----------+-+
'C': conserved cysteine involved in the binding of one zinc atom.
'H': conserved histidine involved in the binding of one zinc atom.
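The schematic above can be turned into a searchable pattern mechanically. The sketch below is a minimal illustration, with hypothetical function names: it reads the schematic string itself, converts each run of 'x' into a fixed-length wildcard and keeps the C and H ligands literal. It assumes the spacing in the schematic is exact, which may not hold for every Pirh2 homologue.

```python
import re

# Schematic string copied from the entry text; 'x' stands for any residue.
SCHEMATIC = "CxxCxxxxxxxxxxHCxxCxxCxxxxxxxxxHCxxCxxCxxxxxxxxHxC"


def schematic_to_regex(schematic):
    """Compile the schematic into a regex with fixed-length wildcard gaps."""
    parts = []
    for token in re.findall(r"x+|[A-Z]", schematic):
        if token[0] == "x":
            parts.append(".{%d}" % len(token))  # run of x -> exactly n residues
        else:
            parts.append(token)  # conserved C or H ligand
    return re.compile("".join(parts))


def ligand_positions(schematic):
    """List (1-based position, residue) for every conserved ligand."""
    return [(i + 1, ch) for i, ch in enumerate(schematic) if ch != "x"]


if __name__ == "__main__":
    pattern = schematic_to_regex(SCHEMATIC)
    toy = SCHEMATIC.replace("x", "A")  # invented sequence, for illustration only
    print(pattern.pattern)
    print(bool(pattern.fullmatch(toy)))
    print(ligand_positions(SCHEMATIC))
```

The twelve conserved ligands (nine cysteines and three histidines) correspond to four ligands for each of the three zinc atoms.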
The lipocalin family can be subdivided into kernel and outlier sets. The kernel lipocalins form the largest self-consistent group, comprising the subfamily of alpha-1-microglobulins. The outlier lipocalins form several smaller distinct subgroups: the OBPs, the von Ebner's gland proteins, alpha-1-acid glycoproteins, tick histamine binding proteins and the nitrophorins. Alpha-1-microglobulin (A1M), also known as protein HC (for Heterogeneous Charge), is a low molecular weight protein component of plasma first discovered in pathological human urine. It is a member of the lipocalin superfamily. Although much is now known of its structure and properties, the function and physiological role of A1M remain unclear; evidence suggests that it functions in the regulation of the immune system. A1M is known to exist both in a free form and complexed to other macromolecules: immunoglobulin A (IgA) in humans and alpha-1-inhibitor-3 in the rat. Free A1M is a monomeric protein composed of one 188-residue polypeptide and contains three cysteines, two of which (residues 75 and 173) form a conserved intra-molecular disulphide link []. A1M is glycosylated by three separate carbohydrate chains: two complex carbohydrates are N-linked to asparagines at residues 17 and 96, and a third, simple carbohydrate is O-linked to threonine at position 5. Carbohydrate accounts for 22% of the total molecular mass of the protein. Free A1M is extremely heterogeneous in charge, and is found tightly associated with a chromophore. This chromophoric group is covalently bound to the free cysteine residue at position 34. It also binds retinol as a major ligand, but this is probably distinct from its covalent chromophore. The glycosylation is different between species. The principal sites of A1M synthesis are the liver and kidney. Half of all human plasma A1M (about 0.03 mg/ml) forms a 1:1 complex with about 5% of plasma immunoglobulin A. The resulting macromolecular complex has a molecular weight of 200000 and a plasma concentration of 0.3 mg/ml. It can both exhibit antibody activity and affect many of the biological actions of free A1M [
]. A1M has many effects on the immune system. It inhibits stimulation of cultured lymphocytes by protein antigens; it can induce cell division of lymphocytes, a mitogenic effect that can either be enhanced or inhibited by the action of other plasma components; it inhibits neutrophil granulocyte migration in vitro; and it inhibits chemotaxis.
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques [
,
,
]. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products [
,
]. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, Kunitz domain, E2 domain, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility []. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death [,
]. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters. APP can be processed by different sets of enzymes:
In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling).
In the amyloidogenic pathway (plaque-forming), APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact.
This superfamily represents a copper-binding domain found within the extracellular domain, which is at the N terminus of amyloidogenic glycoproteins such as amyloid-beta precursor protein (APP, or A4). The copper-binding domain has a dodecin-like fold consisting of a 2-layer alpha/beta topology [
].
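The two APP processing pathways described above can be laid out as a small data structure, which makes the shared gamma-secretase step and the differing products easier to compare. This is only an illustrative sketch: the class and function names are hypothetical, and the enzyme and fragment names are taken directly from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cleavage:
    enzyme: str
    substrate: str
    soluble: str          # fragment described as soluble in the text
    membrane_bound: str   # fragment described as membrane-bound in the text


PATHWAYS = {
    "non-amyloidogenic": (
        Cleavage("alpha-secretase", "APP", "sAPP-alpha", "CTF-alpha"),
        Cleavage("gamma-secretase", "CTF-alpha", "p3", "AICD"),
    ),
    "amyloidogenic": (
        Cleavage("beta-secretase", "APP", "sAPP-beta", "CTF-beta"),
        Cleavage("gamma-secretase", "CTF-beta", "amyloid-beta", "AICD"),
    ),
}


def final_products(pathway):
    """Return every soluble fragment plus the last membrane-bound fragment."""
    steps = PATHWAYS[pathway]
    # Each step must act on the membrane-bound fragment left by the previous one.
    for previous, current in zip(steps, steps[1:]):
        if current.substrate != previous.membrane_bound:
            raise ValueError("cleavage steps are not chained correctly")
    return [step.soluble for step in steps] + [steps[-1].membrane_bound]


if __name__ == "__main__":
    for name in PATHWAYS:
        print(name, "->", final_products(name))
```

Both pathways end in AICD; only the amyloidogenic route produces amyloid-beta.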
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [
]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [
,
,
]. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [
]. This entry represents the Csf4 family of DEAD/DEAH-box helicases. Members of this Cas family are found near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. (strain EbN1), and Rhodoferax ferrireducens (strain DSM 15236/ATCC BAA-621/T118). In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of cas gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf4 (CRISPR/cas Subtype as in A. ferrooxidans protein 4), as it lies farthest (fourth closest) from the repeats in the A. ferrooxidans genome.
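The three stages of the defense reaction described above (adaptation, expression, interference) can be mimicked with a toy model. Everything in this sketch is an assumption made for illustration: the class name, the fixed spacer length and the use of exact substring matching in place of base-pairing are simplifications, and no Cas protein activity is modelled.

```python
class ToyCrisprLocus:
    """Toy model of a CRISPR locus; not a biological simulation."""

    def __init__(self, spacer_length=8):
        self.spacer_length = spacer_length
        self.spacers = []

    def adapt(self, invader_dna, start=0):
        """Adaptation: store a piece of invader DNA as a new spacer."""
        spacer = invader_dna[start:start + self.spacer_length]
        if len(spacer) < self.spacer_length:
            raise ValueError("invader sequence too short to supply a spacer")
        if spacer not in self.spacers:
            self.spacers.append(spacer)
        return spacer

    def express(self):
        """Expression: produce one mature crRNA per spacer (T written as U)."""
        return [spacer.replace("T", "U") for spacer in self.spacers]

    def interfere(self, incoming_dna):
        """Interference: report whether any crRNA recognises the incoming DNA.

        Recognition is simplified to an exact match after writing T as U.
        """
        rna_view = incoming_dna.replace("T", "U")
        return any(crrna in rna_view for crrna in self.express())


if __name__ == "__main__":
    locus = ToyCrisprLocus()
    locus.adapt("ATGCCGTAAGCTTAGC", start=2)
    print(locus.express())
    print(locus.interfere("TTATGCCGTAAGCTT"))
    print(locus.interfere("AAAAAAAAAAAA"))
```

The stored spacer acts as the "identity tag" mentioned in the text: a later encounter with DNA containing the same stretch is recognised, while unrelated DNA is not.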
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques [
,
,
]. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products [
,
]. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, Kunitz domain, E2 domain, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility []. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death [,
]. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters. APP can be processed by different sets of enzymes:
In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling).
In the amyloidogenic pathway (plaque-forming), APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact.
This entry represents the amyloid-beta peptide (A-beta) superfamily, which originates as a breakdown product from the cleavage of amyloid-beta precursor protein (APP, or A4), an integral, glycosylated membrane brain protein.
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type (which is C-terminal to the CHY-type zinc finger) and a RING finger. The CHY-type zinc finger has no known function, but the region encompassing the CTCHY-type zinc finger is required for binding to p53 in mammals [
]. The CTCHY-type zinc finger has so far only been found in Pirh2 proteins. It binds 3 zinc atoms as shown in the following schematic representation:
The CTCHY-type zinc finger:
               +--+------------+------+
               |  |            |      |
CxxCxxxxxxxxxxHCxxCxxCxxxxxxxxxHCxxCxxCxxxxxxxxHxC
|  |          |      |          |  |           | |
+--+----------+------+          +--+-----------+-+
'C': conserved cysteine involved in the binding of one zinc atom.
'H': conserved histidine involved in the binding of one zinc atom.
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques [
,
,
]. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products [
,
]. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, Kunitz domain, E2 domain, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility []. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death [,
]. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters. APP can be processed by different sets of enzymes:
In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling).
In the amyloidogenic pathway (plaque-forming), APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact.
This entry represents the amyloid-beta peptide (A-beta), which originates as a breakdown product from the cleavage of amyloid-beta precursor protein (APP, or A4), an integral, glycosylated membrane brain protein.
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein or through its many breakdown products. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, a Kunitz domain and an E2 domain; a short hydrophobic transmembrane domain; and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters. APP can be processed by different sets of enzymes. In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling). In the amyloidogenic (plaque-forming) pathway, APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact. This entry represents a copper-binding domain found within the extracellular domain, which is at the N terminus of amyloidogenic glycoproteins such as amyloid-beta precursor protein (APP, or A4). The copper-binding domain has a dodecin-like fold consisting of a 2-layer alpha/beta topology.
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein or through its many breakdown products. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, a Kunitz domain and an E2 domain; a short hydrophobic transmembrane domain; and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters. APP can be processed by different sets of enzymes. In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling). In the amyloidogenic (plaque-forming) pathway, APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact. This entry represents an extracellular domain that is usually found at the N terminus of amyloidogenic glycoproteins such as amyloid-beta precursor protein (APP, or A4).
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents ZPR1-type zinc finger domains. ZPR1 was shown experimentally to bind approximately two moles of zinc, and has two copies of a domain homologous to this protein, each containing a putative zinc finger of the form CXXCX(25)CXXC. ZPR1 binds the tyrosine kinase domain of epidermal growth factor receptor but is displaced by receptor activation and autophosphorylation, after which it redistributes in part to the nucleus. By analogy, the proteins described by this family may play a role in signal transduction, as shown for other Z-finger binding proteins. Deficiencies in ZPR1 may contribute to neurodegenerative disorders. ZPR1 appears to be down-regulated in patients with spinal muscular atrophy (SMA), a disease characterised by degeneration of the alpha-motor neurons in the spinal cord that can arise from mutations affecting the expression of Survival Motor Neurons (SMN). ZPR1 interacts with complexes formed by SMN, and may act as a modifier that affects the severity of SMA.
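The putative finger form CXXCX(25)CXXC quoted for ZPR1 can be turned into a simple sequence scan. This is a minimal sketch under two assumptions that the text does not spell out: that X stands for any residue and that X(25) means exactly 25 residues. The function name and the toy sequence are made up for illustration; a match says nothing about whether zinc is actually bound.

```python
import re

# CXXCX(25)CXXC read as: Cys, any 2 residues, Cys, any 25 residues,
# Cys, any 2 residues, Cys (assumed interpretation of the notation).
ZPR1_FINGER = re.compile(r"C.{2}C.{25}C.{2}C")


def find_putative_fingers(sequence):
    """Return 0-based start positions of non-overlapping pattern matches."""
    return [m.start() for m in ZPR1_FINGER.finditer(sequence.upper())]


# Toy sequence (not a real protein): one motif placed after two leading residues.
toy = "MK" + "CAAC" + "G" * 25 + "CAAC" + "LL"
positions = find_putative_fingers(toy)
```

Because `finditer` reports non-overlapping matches only, a sequence with overlapping candidate motifs would need a lookahead-based pattern instead.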
Several proteins have recently been shown to contain the 5 structural motifs characteristic of GTP-binding proteins. These include murine DRG protein; GTP1 protein from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; ferrous iron transport protein B; and several others.
This entry contains proteins which have a Greek key motif. They are all structurally related to the beta/gamma crystallin superfamily. This superfamily of proteins includes: beta and gamma crystallins; yeast killer toxin; killer toxin-like protein SKLP; antifungal protein AFP1; plant antimicrobial protein MIAMP1; and Streptomyces metalloproteinase inhibitor SMPI.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or in the regulation of MAPK (mitogen-activated protein kinase) signaling pathways. It consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear, although it appears to be a good candidate for GTP/GDP exchange activity. The general characteristics of DENN domains (three regions, dDENN, DENN itself and uDENN, with different patterns of sequence conservation and separated by sequences of variable length) suggest that they are composed of at least three sub-domains, which may feature distinct folds but which are always associated due to functional and/or structural constraints. Some proteins known to contain a tripartite DENN domain are: rat Rab3 GDP/GTP exchange protein (Rab3GEP); human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; mouse Rab6 interacting protein 1 (Rab6IP1); human SET domain-binding factor 1 (SBF1); human suppressor of tumorigenicity 5 (ST5); and human C-MYC promoter-binding protein IRLB. This entry represents the core or cDENN domain.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or in the regulation of MAPK (mitogen-activated protein kinase) signaling pathways. It consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear, although it appears to be a good candidate for GTP/GDP exchange activity. The general characteristics of DENN domains (three regions, dDENN, DENN itself and uDENN, with different patterns of sequence conservation and separated by sequences of variable length) suggest that they are composed of at least three sub-domains, which may feature distinct folds but which are always associated due to functional and/or structural constraints. Some proteins known to contain a tripartite DENN domain are: rat Rab3 GDP/GTP exchange protein (Rab3GEP); human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; mouse Rab6 interacting protein 1 (Rab6IP1); human SET domain-binding factor 1 (SBF1); human suppressor of tumorigenicity 5 (ST5); and human C-MYC promoter-binding protein IRLB. This entry represents the uDENN domain.
It has been shown that several proteins share two sequence motifs. Two of these proteins, vertebrate and plant inositol monophosphatase and vertebrate inositol polyphosphate 1-phosphatase, are enzymes of the inositol phosphate second messenger signalling pathway, and share similar enzyme activity. Both enzymes exhibit an absolute requirement for metal ions (Mg2+ is preferred), and their amino acid sequences contain a number of conserved motifs, which are also shared by several other proteins related to MPTASE (including products of fungal QaX and qutG, bacterial suhB and cysQ, and yeast hal2). The function of the other proteins is not yet clear, but it is suggested that they may act by enhancing the synthesis or degradation of phosphorylated messenger molecules. Structural analysis of these proteins has revealed a common core of 155 residues, which includes residues essential for metal binding and catalysis. An interesting property of the enzymes of this family is their sensitivity to Li+. The targets and mechanism of action of Li+ are unknown, but overactive inositol phosphate signalling may account for symptoms of manic depression. This entry represents a conserved signature pattern found within the inositol monophosphatase family of proteins.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or in the regulation of MAPK (mitogen-activated protein kinase) signaling pathways. It consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear, although it appears to be a good candidate for GTP/GDP exchange activity. The general characteristics of DENN domains (three regions, dDENN, DENN itself and uDENN, with different patterns of sequence conservation and separated by sequences of variable length) suggest that they are composed of at least three sub-domains, which may feature distinct folds but which are always associated due to functional and/or structural constraints. Some proteins known to contain a tripartite DENN domain are: rat Rab3 GDP/GTP exchange protein (Rab3GEP); human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; mouse Rab6 interacting protein 1 (Rab6IP1); human SET domain-binding factor 1 (SBF1); human suppressor of tumorigenicity 5 (ST5); and human C-MYC promoter-binding protein IRLB. This entry represents the dDENN domain.
Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins. This superfamily represents the N-terminal cullin repeat-containing domain; these repeats form a domain with a multi-helical 2-layered alpha/alpha structure, which in turn is folded into a right-handed superhelix. A similar structural domain is found in exocyst complex components such as EXO70 and EXO84.
This entry represents SsrA-binding protein (also known as small protein B or SmpB), a unique RNA-binding protein that is conserved throughout the bacterial kingdom and is an essential component of the SsrA quality-control system. Tight recognition of codon-anticodon pairings by the ribosome ensures the accuracy and fidelity of protein synthesis. In eubacteria, translational surveillance and ribosome rescue are performed by the 'tmRNA-SmpB' system (transfer messenger RNA-small protein B). SmpB binds specifically to the ssrA RNA (tmRNA) and is required for stable association of ssrA with ribosomes. SsrA RNA recognises ribosomes stalled on defective messages and acts to mediate the addition of a short peptide tag to the C terminus of the partially synthesised nascent polypeptide chain. Within a stalled ribosome, SmpB interacts with the three universally conserved bases G530, A1492 and A1493 that form the 30S subunit decoding centre, in which canonical codon-anticodon pairing occurs. The SsrA-tagged protein is then degraded by C-terminal-specific proteases. Formation of an SmpB-SsrA complex appears to be critical in mediating SsrA activity after aminoacylation with alanine but prior to the transpeptidation reaction that couples this alanine to the nascent chain. The SmpB protein has functional and structural similarities with initiation factor 1, and is proposed to be a functional mimic of the pairing between a codon and an anticodon.
Sorting nexins (SNXs) are a diverse group of cellular trafficking proteins that are unified by the presence of a phospholipid-binding motif, the PX domain. The ability of these proteins to bind specific phospholipids, as well as their propensity to form protein-protein complexes, points to a role for these proteins in membrane trafficking and protein sorting. Members of this group also contain coiled-coil regions within their large C-terminal domains and a BAR domain, which has been described as a dimerisation motif, as a sensor and inducer of membrane curvature, and/or as a likely binding site for small GTPases. This entry includes SNX5, SNX6 and SNX32 (also known as SNX6B). SNX5 contains a BAR domain that is C-terminal to the PX domain. SNX5 plays a role in macropinocytosis and in the internalisation of EGFR after EGF stimulation. SNX6 was found to interact with members of the transforming growth factor-beta family of receptor serine/threonine kinases. Strong heteromeric interactions were also seen among SNX1, -2, -4, and -6, suggesting the formation in vivo of oligomeric complexes. SNX6 is localized in the cytoplasm, where it is thought to target proteins to the trans-Golgi network. In addition, SNX6 was found to be translocated from the cytoplasm to the nucleus by Pim-1, a serine/threonine kinase oncogene product. This translocation is not affected by Pim-1-dependent phosphorylation, but the functional significance is unknown.
X-linked lissencephaly is a severe brain malformation affecting males. It has been demonstrated that the doublecortin gene is implicated in this disorder. Doublecortin was found to bind to the microtubule cytoskeleton. In vivo and in vitro assays show that Doublecortin stabilises microtubules and causes bundling. Doublecortin is a basic protein with an iso-electric point of 10, typical of microtubule-binding proteins. However, its sequence contains no known microtubule-binding domain(s). Detailed sequence analysis of Doublecortin and Doublecortin-like proteins allowed the identification of an evolutionarily conserved Doublecortin (DC) domain, which is ubiquitin-like. This domain is found in the N terminus of proteins and consists of one or two tandemly repeated copies of a region of around 80 amino acids. It has been suggested that the first DC domain of Doublecortin binds tubulin and enhances microtubule polymerisation. Some proteins known to contain a DC domain are: Doublecortin, which is required for neuronal migration (a large number of point mutations in the human DCX gene leading to lissencephaly are located within the DC domains); human serine/threonine-protein kinase DCAMKL1, a probable kinase that may be involved in a calcium-signaling pathway controlling neuronal migration in the developing brain; and Retinitis pigmentosa 1 protein, which is required for the differentiation of photoreceptor cells (mutations in the human RP1 gene cause retinitis pigmentosa of type 1).
Nuclear factor NF-kappa-B p105 subunit, death domain
Type: Domain
Description:
This entry represents the Death Domain (DD) of NF-kappaB subunit precursor p105, which can undergo cotranslational processing by the 26S proteasome to produce a 50kDa protein (p50). p50 is a DNA-binding subunit of the NF-kappaB protein complex. NF-kappaB is a pleiotropic transcription factor present in almost all cell types. It is the endpoint of a series of signal transduction events that are initiated by a vast array of stimuli related to many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis. NF-kappaB is a homo- or heterodimeric complex formed by the Rel-like domain-containing proteins RelA/p65, RelB, NFKB1/p50, c-Rel and NFKB2/p52. Each individual NF-kappaB subunit, and perhaps each dimer, carries out unique functions in regulating transcription. Dimer-specific functions can be conferred by selective protein-protein interactions with other transcription factors, coregulatory proteins, and chromatin proteins. NF-kappaB1 and NF-kappaB2 are synthesised as large precursors, called p105 and p100, which undergo processing to generate the NF-kappaB subunits p50 and p52, respectively. The processing of p105 and p100 is mediated by the ubiquitin/proteasome pathway, and involves selective degradation of their C-terminal regions containing ankyrin repeats. Unlike RelA, RelB and c-Rel, p50 and p52 do not contain transactivation domains in their C-termini. Nevertheless, they play critical roles in modulating the specificity of NF-kappaB function.
Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) proteins (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5'-TGGCANNNTGCCA-3'. This family was first described for its role in stimulating the initiation of adenovirus DNA replication. In vertebrates there are four members (NFIA, NFIB, NFIC and NFIX), and an orthologue from Caenorhabditis elegans, called Nuclear factor I family protein (NFI-I), has been described. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication, and thus function in regulating cell proliferation and differentiation. They are involved in normal development and have been associated with developmental abnormalities and cancer in humans. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors, and it interacts with basal transcription factors and with histone H3.
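The palindromic recognition sequence 5'-TGGCANNNTGCCA-3' can be illustrated with a short scan. The sketch below assumes that N stands for any of A, C, G or T and only looks for exact matches to the consensus on one strand; the function names are hypothetical, the example DNA string is invented, and a match is not evidence of actual NF-I binding.

```python
import re

# Consensus from the text, with N expanded to any single base (assumption).
NFI_SITE = re.compile(r"TGGCA[ACGT]{3}TGCCA")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_nfi_sites(dna):
    """Return 0-based start positions of exact consensus matches."""
    return [m.start() for m in NFI_SITE.finditer(dna.upper())]


# The two half-sites are reverse complements of each other, which is what
# makes the site palindromic and compatible with binding as a dimer.
half_sites_match = reverse_complement("TGGCA") == "TGCCA"

sites = find_nfi_sites("aaTGGCAcgtTGCCAtt")  # invented example sequence
```

Because the half-sites are reverse complements, a full site matched on one strand reads as the same consensus on the opposite strand, so scanning a single strand is sufficient for exact matches.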
This entry represents the N-terminal domain found in a family of neurogenic mastermind-like proteins (MAMLs), which act as critical transcriptional co-activators for Notch signaling. Notch receptors are cleaved upon ligand engagement and the intracellular domain of Notch shuttles to the nucleus. MAMLs form a functional DNA-binding complex with the cleaved Notch receptor and the transcription factor CSL, thereby regulating transcriptional events that are specific to the Notch pathway. MAML proteins may also act as key transcriptional co-activators in other signal transduction pathways, including muscle differentiation and myopathies (MEF2C), the tumour suppressor pathway (p53) and colon carcinoma survival (beta-catenin). MAML proteins could mediate cross-talk among the various signaling pathways, and the diverse activities of the MAML proteins converge to impact normal biological processes and human diseases, including cancers. The N-terminal domain of MAML proteins adopts an elongated kinked helix that wraps around ANK and CSL, forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors. This N-terminal domain is responsible for the interaction with the ankyrin repeat region of the Notch proteins NOTCH1, NOTCH2, NOTCH3 and NOTCH4. It forms a DNA-binding complex with Notch proteins and RBPSUH/RBP-J kappa/CBF1, and also binds CREBBP/CBP and CDK8. The C-terminal region is required for transcriptional activation.
This entry represents the CFC domain found in the membrane protein Cripto (or teratocarcinoma-derived growth factor), a protein overexpressed in many tumours and structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2. CFC is approximately 40 residues long, is compacted by three internal disulphide bridges, and binds Alk4 via a hydrophobic patch. CFC is structurally homologous to the VWFC-like domain. The protein Cripto is the founding member of the extracellular EGF-CFC growth factors, which are composed of two adjacent cysteine-rich domains: the EGF-like and the CFC domains. Members of the EGF-CFC family play key roles in embryonic development and are also implicated in tumourigenesis. The Cripto protein could play a role in the determination of the epiblastic cells that subsequently give rise to the mesoderm. Although both the EGF and CFC domains are involved in the tumourigenic activity of Cripto proteins, the CFC domain appears to play a crucial role, as it is through the CFC domain that Cripto interferes with the onco-suppressive activity of Activins, either by blocking the Activin receptor ALK4 or by antagonising proteins of the TGF-beta family. The Cryptic protein is involved in the correct establishment of the left-right axis, and may play a role in mesoderm and/or neural patterning during gastrulation.
This entry represents the N-terminal sub-domain of the Rel homology domain (RHD) of NF-kappaB subunit precursor p105, which can undergo cotranslational processing by the 26S proteasome to produce a 50kDa protein (p50). p50 is a DNA-binding subunit of the NF-kappaB protein complex. NF-kappaB is a pleiotropic transcription factor present in almost all cell types. It is the endpoint of a series of signal transduction events that are initiated by a vast array of stimuli related to many biological processes such as inflammation, immunity, differentiation, cell growth, tumorigenesis and apoptosis. NF-kappaB is a homo- or heterodimeric complex formed by the Rel-like domain-containing proteins RelA/p65, RelB, NFKB1/p50, c-Rel and NFKB2/p52. Each individual NF-kappaB subunit, and perhaps each dimer, carries out unique functions in regulating transcription. Dimer-specific functions can be conferred by selective protein-protein interactions with other transcription factors, coregulatory proteins, and chromatin proteins. NF-kappaB1 and NF-kappaB2 are synthesised as large precursors, called p105 and p100, which undergo processing to generate the NF-kappaB subunits p50 and p52, respectively. The processing of p105 and p100 is mediated by the ubiquitin/proteasome pathway, and involves selective degradation of their C-terminal regions containing ankyrin repeats. Unlike RelA, RelB and c-Rel, p50 and p52 do not contain transactivation domains in their C-termini. Nevertheless, they play critical roles in modulating the specificity of NF-kappaB function.
The death domain (DD) is a homotypic protein interaction module composed of a bundle of six α-helices. DD is related in sequence and structure to the death effector domain (DED, see
) and the caspase recruitment domain (CARD, see
), which work in similar pathways and show similar interaction properties [
]. DD bind each other forming oligomers. Mammals have numerous and diverse DD-containing proteins []. Within these proteins, the DD domains can be found in combination with other domains, including: CARDs, DEDs, ankyrin repeats (), caspase-like folds, kinase domains, leucine zippers, leucine-rich repeats (LRR) (
), TIR domains (
), and ZU5 domains (
) [
]. Some DD-containing proteins are involved in the regulation of apoptosis and inflammation through their activation of caspases and NF-kappaB, which typically involves interactions with TNF (tumour necrosis factor) cytokine receptors [
,
]. In humans, eight of the over 30 known TNF receptors contain DD in their cytoplasmic tails; several of these TNF receptors use caspase activation as a signalling mechanism. The DD mediates self-association of these receptors, thus giving the signal to downstream events that lead to apoptosis. Other DD-containing proteins, such as ankyrin, MyD88 and pelle, are probably not directly involved in cell death signalling. DD-containing proteins also have links to innate immunity, communicating with Toll family receptors through bipartite adapter proteins such as MyD88 [].
Urease and other nickel metalloenzymes are synthesised as precursors devoid of the metalloenzyme active site. These precursors then undergo a complex post-translational maturation process that requires a number of accessory proteins. Members of this group are nickel-binding proteins required for urease metallocentre assembly [
]. They are believed to function as metallochaperones to deliver nickel to urease apoprotein [,
]. It has been shown by yeast two-hybrid analysis that UreE forms a dimeric complex with UreG in Helicobacter pylori []. The UreDFG-apoenzyme complex has also been shown to exist [,
] and is believed to be, with the addition of UreE, the assembly system for active urease []. The complexes, rather than the individual proteins, presumably bind to UreB via UreE/H recognition sites. The structure of Klebsiella aerogenes UreE reveals a unique two-domain architecture. The N-terminal domain is structurally related to a heat shock protein, while the C-terminal domain shows homology to the Atx1 copper metallochaperone [
,
]. Significantly, the metal-binding sites in UreE and Atx1 are distinct in location and types of residues despite the relationship between these proteins, and the mechanism for UreE activation of urease is proposed to be different from the thiol ligand exchange mechanism used by the copper metallochaperones. The N-terminal domain is termed the peptide-binding domain. Deletion of this domain does not eliminate enzymatic activity, and the truncated protein can still activate urease [
].
The BESS domain has been named after the three proteins that originally defined the domain: BEAF (Boundary element associated factor 32) [
], Suvar(3)7 [] and Stonewall []. The BESS domain is 40 amino acid residues long and is predicted to be composed of three alpha helices; as such, it might be related to the myb/SANT HTH domain. The BESS domain directs a variety of protein-protein interactions, including interactions with itself, with Dorsal, and with a TBP-associated factor. It is found in a single copy in Drosophila proteins and is often associated with the MADF domain [,
,
]. Proteins known to contain a BESS domain include: Drosophila Boundary element associated factor 32 (BEAF-32). Drosophila Suppressor of variegation protein 3-7 (Su(var)3-7), which could play a role in chromosome condensation. Drosophila Ravus, which is homologous to the C-terminal part of Su(var)3-7 [
]. Drosophila Stonewall (Stwl), a putative transcription factor required for maintenance of female germline stem cells as well as oocyte differentiation. Drosophila Adf-1, a transcription factor first identified on the basis of its interaction with the alcohol dehydrogenase promoter but that binds the promoters of a diverse group of genes [
]. Drosophila Dorsal-interacting protein 3 (Dip3). It functions both as an activator to bind DNA in a sequence specific manner and a coactivator to stimulate synergistic activation by Dorsal and Twist [
].
This is a family of U47 herpesvirus proteins [
]. U47 protein is also known as 120kDa glycoprotein O (or 130kDa glycoprotein O from human herpesvirus 6A). U47 proteins are modified with N-linked oligosaccharides and co-immunoprecipitated with glycoprotein H. They may have a role in cell-cell fusion in virus infection [].
Aconitase (aconitate hydratase;
) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop [
,
]. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3) []. The eukaryotic cAcn enzyme balances the amount of citrate and isocitrate in the cytoplasm, which in turn creates a balance between the amount of NADPH generated from isocitrate by isocitrate dehydrogenase and the amount of acetyl-CoA generated from citrate by citrate lyase. Fatty acid synthesis requires both NADPH and acetyl-CoA, as do other metabolic processes, including the need for NADPH to combat oxidative stress. The enzymatic form of cAcn predominates when iron levels are normal, but if they drop sufficiently to cause the disassembly of the [4Fe-4S]-cluster, then cAcn undergoes a conformational change from a compact enzyme to a more open L-shaped protein known as iron regulatory protein 1 (IRP1; or IRE-binding protein 1, IREBP1) [,
]. As IRP1, the catalytic site and the [4Fe-4S]-cluster are lost, and two new RNA-binding sites appear. IRP1 functions in the post-transcriptional regulation of genes involved in iron metabolism - it binds to mRNA iron-responsive elements (IRE), 30-nucleotide stem-loop structures at the 3' or 5' end of specific transcripts. Transcripts containing an IRE include ferritin L and H subunits (iron storage), transferrin (iron plasma chaperone), transferrin receptor (iron uptake into cells), ferroportin (iron exporter), mAcn, succinate dehydrogenase, erythroid aminolevulinic acid synthetase (tetrapyrrole biosynthesis), among others. If the IRE is in the 5'-UTR of the transcript (e.g. in ferritin mRNA), then IRP1-binding prevents its translation by blocking the transcript from binding to the ribosome. If the IRE is in the 3'-UTR of the transcript (e.g. transferrin receptor), then IRP1-binding protects it from endonuclease degradation, thereby prolonging the half-life of the transcript and enabling it to be translated [
]. IRP2 is another IRE-binding protein that binds to the same transcripts as IRP1. However, since IRP1 is predominantly in the enzymatic cAcn form, it is IRP2 that acts as the major metabolic regulator that maintains iron homeostasis [
]. Although IRP2 is homologous to IRP1, IRP2 lacks aconitase activity, and is known only to have a single function in the post-transcriptional regulation of iron metabolism genes []. In iron-replete cells, IRP2 activity is regulated primarily by iron-dependent degradation through the ubiquitin-proteasomal system. Bacterial AcnB is also known to be multi-functional. In addition to its role in the TCA cycle, AcnB was shown to be a post-transcriptional regulator of gene expression in Escherichia coli and Salmonella enterica [
,
]. In S. enterica, AcnB initiates a regulatory cascade controlling flagella biosynthesis through an interaction with the ftsH transcript, which encodes the FtsH protease. This binding lowers the intracellular concentration of FtsH protease, which in turn enhances the amount of the alternative RNA polymerase sigma factor sigma32 (normally degraded by FtsH protease), and sigma32 then increases the synthesis of chaperone DnaK, which in turn promotes the synthesis of the flagellar protein FliC. AcnB regulates the synthesis of other proteins as well, such as superoxide dismutase (SodA) and other enzymes involved in oxidative stress.
This entry represents the N-terminal HEAT-like domain, which is present in bacterial aconitase (AcnB), but not in AcnA or eukaryotic cAcn/IRP2 or mAcn. This domain is multi-helical, forming two curved layers in a right-handed α-α superhelix. HEAT-like domains are usually implicated in protein-protein interactions. The HEAT-like domain and the 'swivel' domain that follows it were shown to be sufficient for dimerisation and for AcnB binding to mRNA. An iron-mediated dimerisation mechanism may be responsible for switching AcnB between its catalytic and regulatory roles, as dimerisation requires iron while mRNA binding is inhibited by iron.
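The IRE-dependent regulation described above for IRP1 can be summarised as a small decision rule. The following is only an illustrative sketch: the names `IreLocation` and `irp1_effect` are hypothetical, and the function encodes just the two qualitative outcomes stated in the text (translation block for a 5'-UTR IRE, transcript stabilisation for a 3'-UTR IRE), not any quantitative model.

```python
from enum import Enum


class IreLocation(Enum):
    """Position of the iron-responsive element (IRE) in a transcript."""
    FIVE_PRIME_UTR = "5'-UTR"
    THREE_PRIME_UTR = "3'-UTR"


def irp1_effect(ire_location: IreLocation, iron_is_low: bool) -> str:
    """Return the qualitative effect of IRP1 on an IRE-containing transcript.

    Assumption (from the text): IRP1 only binds IREs once low iron has
    caused disassembly of the [4Fe-4S]-cluster of cytosolic aconitase.
    """
    if not iron_is_low:
        # With normal iron the protein stays in its enzymatic cAcn form.
        return "no IRP1 binding"
    if ire_location is IreLocation.FIVE_PRIME_UTR:
        # e.g. ferritin mRNA: binding blocks ribosome access.
        return "translation blocked"
    # e.g. transferrin receptor mRNA: binding protects from endonucleases.
    return "transcript stabilised"


if __name__ == "__main__":
    print(irp1_effect(IreLocation.FIVE_PRIME_UTR, iron_is_low=True))
    print(irp1_effect(IreLocation.THREE_PRIME_UTR, iron_is_low=True))
```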
The Yersinia enterocolitica O:8 periplasmic binding protein-dependent transport system consists of four proteins: the periplasmic haemin-binding protein HemT, the haemin permease protein HemU, the ATP-binding hydrophilic protein HemV and the haemin-degrading protein HemS. The structure for HemS has been solved and consists of a tandem repeat of the domain represented in this entry [].
This family consists of several Rotavirus major outer capsid protein VP7 sequences. The rotavirus capsid is composed of three concentric protein layers. Proteins VP4 and VP7 comprise the outer layer. VP4 forms spikes and is the viral attachment protein. VP7 is a glycoprotein and the major constituent of the outer protein layer [
].
This leucine zipper is found in the enterobacterial outer membrane lipoprotein LPP [
]. It is likely that this domain oligomerises and is involved in protein-protein interactions. As such it is a bundle of α-helical coiled-coils, which are known to play key roles in mediating specific protein-protein interactions for molecular recognition and the assembly of multi-protein complexes [,
,
].
Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 and G2, and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This entry represents the polyprotein region forming the G2 glycoprotein, which interacts with the
G1 glycoprotein [
].
It has been shown that several proteins share two sequence motifs [
]. Two of these proteins, vertebrate and plant inositol monophosphatase (), and vertebrate inositol polyphosphate 1-phosphatase (
), are enzymes of the inositol phosphate second messenger signalling pathway, and share similar enzyme activity. Both enzymes exhibit an absolute requirement for metal ions (Mg2+ is preferred), and their amino acid sequences contain a number of conserved motifs, which are also shared by several other proteins related to MPTASE (including products of fungal QaX and qutG, bacterial suhB and cysQ, and yeast hal2) [
]. The function of the other proteins is not yet clear, but it is suggested that they may act by enhancing the synthesis or degradation of phosphorylated messenger molecules []. Structural analysis of these proteins has revealed a common core of 155 residues, which includes residues essential for metal binding and catalysis. An interesting property of the enzymes of this family is their sensitivity to Li+. The targets and mechanism of action of Li+ are unknown, but overactive inositol phosphate signalling may account for symptoms of manic depression []. This entry represents the metal-binding site found within the inositol monophosphatase family of proteins. It is suggested [
] that these proteins may act by enhancing the synthesis or degradation of phosphorylated messenger molecules. The signature pattern of this entry contains the aspartic and threonine residues involved in binding a metal ion [].
Rab proteins constitute a family of small GTPases that serve a regulatory
role in vesicular membrane traffic [,
]; C-terminal geranylgeranylation is crucial for their membrane association and function. This post-translational modification is catalysed by Rab geranylgeranyl transferase (Rab-GGTase), a multi-subunit enzyme that contains a catalytic heterodimer and an accessory component, termed Rab escort protein (REP)-1 []. REP-1 presents newly-synthesised Rab proteins to the catalytic component, and forms a stable complex with the prenylated proteins following the transfer reaction. The mechanism of REP-1-mediated membrane association of Rab5 is similar to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI also share other functional properties, including the ability to inhibit the release of GDP and to remove Rab proteins from membranes. The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å []. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller α-helical domain II. The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases []. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain [
]. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins [].
Bestrophin is a 68kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterised by a depressed light peak in the electrooculogram [
]. VMD2 encodes a 585-amino acid protein with an approximate mass of 68kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localised to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of calcium-activated chloride channels (CaCC) [], indicating a direct role for bestrophin in generating the light peak [,
,
]. Bestrophins are also permeable to other monovalent anions including bicarbonate, bromide, iodide, thiocyanate and nitrate [,
]. Structural analysis revealed that the N-terminal region of the proteins is highly conserved and sufficient for its CaCC activity. The C-terminal region has low sequence identity. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal domain, altering the electrophysiological properties of the channel [,
]. This entry also includes uncharacterised proteins belonging to protein family UPF0187.
The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding [
,
]. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices []. The structure of the LCCL domain from human Coch-5b2 has been solved. It has an unusual fold, where a centrally located helix is wrapped by extended polypeptide segments of mostly irregular secondary structure []. Some proteins known to contain a LCCL domain include Limulus factor C, a LPS endotoxin-sensitive trypsin type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (Cochlin is probably a secreted protein; mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain [
].
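The conserved motif YxxxSxxCxAAVHxGVI quoted above can be turned into a simple sequence scan. The sketch below is a minimal illustration, assuming that each `x` stands for exactly one arbitrary residue and that the remaining positions must match exactly; real LCCL domains may deviate from this consensus, so a strict match of this kind would miss divergent family members. The function name `find_lccl_motif` and the example sequence are hypothetical.

```python
import re

# Consensus YxxxSxxCxAAVHxGVI, with each x read as "any single residue".
LCCL_MOTIF = re.compile(r"Y.{3}S.{2}C.AAVH.GVI")


def find_lccl_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each strict motif match."""
    return [(m.start(), m.group()) for m in LCCL_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up test sequence, not taken from any database entry.
    example = "MK" + "YQGTSLPCEAAVHAGVI" + "LE"
    print(find_lccl_motif(example))
```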
The hsp70 chaperone machine performs many diverse roles in the cell, including folding of nascent proteins, translocation of polypeptides across organelle membranes, coordinating responses to stress, and targeting selected proteins for degradation. DnaJ is a member of the hsp40 family of molecular chaperones, which is also called the J-protein family, the members of which regulate the activity of hsp70s. DnaJ (hsp40) binds to dnaK (hsp70) and stimulates its ATPase activity, generating the ADP-bound state of dnaK, which interacts stably with the polypeptide substrate [,
]. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues. Such a structure is shown in the following schematic representation:
+------------+-+-------+-----+-----------+--------------------------------+
| J-domain   | | Gly-R |     | CXXCXGXG  | C-terminal                     |
+------------+-+-------+-----+-----------+--------------------------------+
The structure of the J-domain has been solved [
]. The J domain consists of four helices, the second of which has a charged surface that includes basic residues that are essential for interaction with the ATPase domain of hsp70 []. J-domains are found in many prokaryotic and eukaryotic proteins [
]. In yeast, three J-like proteins have been identified containing regions closely resembling a J-domain, but lacking the conserved HPD motif - these proteins do not appear to act as molecular chaperones [].
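The cysteine-rich repeats of the central domain lend themselves to a quick check by pattern matching. The following is a hedged sketch, assuming the CXXCXGXG motif is read literally with `X` as any single residue; the function name `find_crr_repeats` and the test sequence are hypothetical, and a real DnaJ classification would rely on curated profiles rather than on this strict pattern.

```python
import re

# CXXCXGXG with X read as "any single residue".
CRR_MOTIF = re.compile(r"C..C.G.G")


def find_crr_repeats(sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping CXXCXGXG matches."""
    return [m.start() for m in CRR_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence with four repeats separated by alanine spacers.
    repeats = ["CPTCHGSG", "CSKCNGTG", "CTHCQGRG", "CDKCGGAG"]
    example = "MA" + "AAAA".join(repeats) + "LL"
    starts = find_crr_repeats(example)
    print(len(starts), starts)
```

A sequence with four hits is consistent with the 'CRR' domain described above, but the count alone does not establish that a protein is a DnaJ homologue.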
The JmjN and JmjC domains are two non-adjacent domains which have been identified in the jumonji family of transcription factors. Although it was originally suggested that the JmjN and JmjC domains always co-occur and might form a single functional unit within the folded protein, the JmjC domain was later found without the JmjN domain in organisms from bacteria to human [
,
,
]. Proteins containing a JmjC domain are predicted to be metalloenzymes that adopt the cupin fold and are candidates for enzymes that regulate chromatin remodelling [
]. The cupin fold is a flattened β-barrel structure containing two sheets of five antiparallel β-strands that form the walls of a zinc-binding cleft. Based on the crystal structures of the JmjC domain-containing proteins FIH and JHDM3A/JMJD2A, the JmjC domain forms an enzymatically active pocket that coordinates Fe(II) and alphaKG. Three amino-acid residues within the JmjC domain bind to the Fe(II) cofactor and two additional residues bind to alphaKG []. JmjC domains were identified in numerous eukaryotic proteins containing domains typical of transcription factors, such as PHD, C2H2, ARID/BRIGHT and zinc fingers [
,
]. The JmjC has been shown to function in a histone demethylation mechanism that is conserved from yeast to human []. JmjC domain proteins may be protein hydroxylases that catalyse a novel histone modification []. The human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalysing hydroxylation [].
Crustacean and cheliceratan hemocyanins (oxygen-transport proteins) and insect hexamerins (storage proteins) are homologous gene products, although the latter do not bind oxygen [
]. Haemocyanins are found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit; and the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit [
]. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases. Hexamerins are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. They do not possess the copper-binding histidines present in hemocyanins [
]. Homologues are also present in other kinds of organism, for example, Cyclopenase asqI from the fungus Emericella nidulans and Cyclopenase penL from Penicillium thymicola. AsqI is a tyrosinase involved in biosynthesis of the aspoquinolone mycotoxins, though its exact function is unknown [
]. PenL is part of the gene cluster that mediates the biosynthesis of penigequinolones, potent insecticidal alkaloids that contain a highly modified 10-carbon prenyl group [].
Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. There have been four secretion systems described in animal enteropathogens, such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia [
]. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself [], type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. Exotoxins secreted by the type III system do not possess a secretion signal, and are considered unique for this reason [
]. Yersinia secrete a Rho GTPase-activating protein, YopE [,
], that disrupts the host cell actin cytoskeleton. YopE is regulated by another bacterial gene, SycE [], that enables the exotoxin to remain soluble in the bacterial cytoplasm. A similar protein, exoenzyme S (ExoS) from Pseudomonas aeruginosa, has both ADP-ribosylation and GTPase activity [,
]. This entry refers to the GTPase-activating protein (GAP) domain found in YopE, ExoS, and also SptP (Secreted effector protein) [
].
Urease and other nickel metalloenzymes are synthesised as precursors devoid of the metalloenzyme active site. These precursors then undergo a complex post-translational maturation process that requires a number of accessory proteins. Members of this group are nickel-binding proteins required for urease metallocentre assembly [
]. They are believed to function as metallochaperones to deliver nickel to urease apoprotein [,
]. It has been shown by yeast two-hybrid analysis that UreE forms a dimeric complex with UreG in Helicobacter pylori []. The UreDFG-apoenzyme complex has also been shown to exist [,
] and is believed to be, with the addition of UreE, the assembly system for active urease []. The complexes, rather than the individual proteins, presumably bind to UreB via UreE/H recognition sites. The structure of Klebsiella aerogenes UreE reveals a unique two-domain architecture. The N-terminal domain is structurally related to a heat shock protein, while the C-terminal domain shows homology to the Atx1 copper metallochaperone [
,
]. Significantly, the metal-binding sites in UreE and Atx1 are distinct in location and types of residues despite the relationship between these proteins, and the mechanism for UreE activation of urease is proposed to be different from the thiol ligand exchange mechanism used by the copper metallochaperones. The N-terminal domain is termed the peptide-binding domain. Deletion of this domain does not eliminate enzymatic activity, and the truncated protein can still activate urease [
]. It has a closed barrel fold and a crossover loop topology.
This entry represents the catalytic domain of kexin, furin and related proteins [
]. Protein convertases, whose members include furin (MEROPS identifier S08.071) and kexin (S08.070), are members of the peptidase S8 or subtilase family of peptidases []. Kexins are involved in the activation of peptide hormones, growth factors, and viral proteins. Furin cleaves cell surface vasoactive peptides and proteins involved in cardiovascular tissue remodeling in the TGN, at the cell surface, or in endosomes but rarely in the ER. Furin also plays a key role in blood pressure regulation through the activation of transforming growth factor (TGF)-beta. High specificity is seen for cleavage after dibasic (Lys-Arg or Arg-Arg) or multiple basic residues in protein convertases []. The subtilisin family is one of the largest serine peptidase families characterised to date. Over 200 subtilases are presently known, and the complete amino acid sequence is known for more than 170 of them [
]. It is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses []. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase [,
]. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity [,
]. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein [
].
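The dibasic cleavage preference mentioned above (cleavage after Lys-Arg or Arg-Arg) can be illustrated with a short scan over a one-letter protein sequence. This is a deliberately simplified sketch: it flags every KR or RR pair and ignores the extended recognition context and accessibility that govern real convertase cleavage, so it over-predicts. The function name `dibasic_cleavage_positions` and the example sequence are hypothetical.

```python
def dibasic_cleavage_positions(sequence: str) -> list[int]:
    """Return candidate cleavage positions after KR or RR pairs.

    A position i means the bond between residues i and i+1 (1-based),
    i.e. the peptide would be cut after the i-th residue.
    """
    seq = sequence.upper()
    sites = []
    for i in range(2, len(seq) + 1):
        # seq[i-2:i] is the pair of residues immediately before the cut.
        if seq[i - 2:i] in ("KR", "RR"):
            sites.append(i)
    return sites


if __name__ == "__main__":
    # Made-up sequence containing one Lys-Arg and one Arg-Arg pair.
    print(dibasic_cleavage_positions("MSKRAAARRGL"))
```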
This entry represents the catalytic domain of AGC family of Serine/Threonine Kinases.
AGC kinases regulate many cellular processes including division, growth, survival, metabolism, motility, and differentiation. Many are implicated in the development of various human diseases. Proteins containing this domain include cAMP-dependent Protein Kinase (PKA), cGMP-dependent Protein Kinase (PKG), Protein Kinase C (PKC), Protein Kinase B (PKB), G protein-coupled Receptor Kinase (GRK), Serum- and Glucocorticoid-induced Kinase (SGK), and 70 kDa ribosomal Protein S6 Kinase (p70S6K or S6K), among others [,
]. AGC kinases share an activation mechanism based on the phosphorylation of up to three sites: the activation loop (A-loop), the hydrophobic motif (HM) and the turn motif. Phosphorylation at the A-loop is required of most AGC kinases, which results in a disorder-to-order transition of the A-loop. The ordered conformation results in the access of substrates and ATP to the active site. A subset of AGC kinases with C-terminal extensions containing the HM also requires phosphorylation at this site. Phosphorylation at the HM allows the C-terminal extension to form an ordered structure that packs into the hydrophobic pocket of the catalytic domain, which then reconfigures the kinase into an active bi-lobed state. In addition, growth factor-activated AGC kinases such as PKB, p70S6K, RSK, MSK, PKC, and SGK, require phosphorylation at the turn motif (also called tail or zipper site), located N-terminal to the HM at the C-terminal extension [
,
,
,
].
This superfamily was originally identified in Drosophila and called mago nashi; it is a strict maternal effect, grandchildless-like gene [
]. The protein is an integral member of the exon junction complex (EJC). The EJC is a multiprotein complex that is deposited on spliced mRNAs after intron removal at a conserved position upstream of the exon-exon junction, and transported to the cytoplasm where it has been shown to influence translation, surveillance, and localization of the spliced mRNA. It consists of four core proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP and is supposed to be a binding platform for more peripherally and transiently associated factors along mRNA travel. Mago and Y14 form a stable heterodimer that stabilizes the complex by inhibiting eIF4AIII's ATPase activity. Mago-Y14 heterodimer has been shown to interact with the cytoplasmic protein PYM, an EJC disassembly factor, and specifically binds to the karyopherin nuclear receptor importin 13 [
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
]. The human homologue has been shown to interact with an RNA binding protein, ribonucleoprotein rbm8 (
) [
]. An RNAi knockout of the Caenorhabditis elegans homologue causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination [], but the protein is also found in hermaphrodites and other organisms without sexual differentiation. Structurally, Mago nashi has a beta(4)-alpha-beta(2)-alpha fold arranged into two layers (alpha/beta) with an antiparallel β-sheet.
The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding [
,
]. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices []. The structure of the LCCL domain from human Coch-5b2 has been solved. It has an unusual fold, where a centrally located helix is wrapped by extended polypeptide segments of mostly irregular secondary structure []. Some proteins known to contain a LCCL domain include Limulus factor C, a LPS endotoxin-sensitive trypsin type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (Cochlin is probably a secreted protein; mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain [
].
The hsp70 chaperone machine performs many diverse roles in the cell, including folding of nascent proteins, translocation of polypeptides across organelle membranes, coordinating responses to stress, and targeting selected proteins for degradation. DnaJ is a member of the hsp40 family of molecular chaperones, which is also called the J-protein family, the members of which regulate the activity of hsp70s. DnaJ (hsp40) binds to dnaK (hsp70) and stimulates its ATPase activity, generating the ADP-bound state of dnaK, which interacts stably with the polypeptide substrate [
,
]. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues. Such a structure is shown in the following schematic representation:
+------------+-+-------+-----+-----------+--------------------------------+
| J-domain   | | Gly-R |     | CXXCXGXG  | C-terminal                     |
+------------+-+-------+-----+-----------+--------------------------------+
The structure of the J-domain has been solved. The J-domain consists of four helices, the second of which has a charged surface that includes basic residues essential for interaction with the ATPase domain of hsp70. J-domains are found in many prokaryotic and eukaryotic proteins. In yeast, three J-like proteins have been identified that contain regions closely resembling a J-domain but lack the conserved HPD motif; these proteins do not appear to act as molecular chaperones.
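Two of the sequence features described in this entry, the CXXCXGXG repeats and the HPD motif, can be checked with simple string matching. The sketch below assumes that X in CXXCXGXG denotes any single residue; the function names and the toy sequence are hypothetical and do not come from a real DnaJ protein. Finding HPD somewhere in a sequence does not by itself establish a functional J-domain, so this is an illustration of the motifs rather than a classifier.

```python
import re

# 'X' in the text's CXXCXGXG motif is assumed to mean any single residue.
CRR_REPEAT = re.compile(r"C..C.G.G")


def count_crr_repeats(sequence: str) -> int:
    """Count non-overlapping CXXCXGXG matches in the sequence."""
    return len(CRR_REPEAT.findall(sequence.upper()))


def has_hpd_motif(sequence: str) -> bool:
    """Report whether the HPD tripeptide occurs anywhere in the sequence."""
    return "HPD" in sequence.upper()


# Hypothetical toy sequence: an HPD tripeptide followed by four repeats.
toy = "AAHPDKK" + "CPTCHGSG" + "LL" + "CKHCNGRG" + "LL" + "CSTCKGSG" + "L" + "CPHCRGEG"

print(count_crr_repeats(toy))
print(has_hpd_motif(toy))
```

A sequence resembling the yeast J-like proteins mentioned above would be expected to fail the HPD check while still looking similar to a J-domain by other measures.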
The circadian clock protein KaiC is encoded in the kaiABC operon, which controls circadian rhythms and may be universal in Cyanobacteria. Each member of this family contains two copies of the KaiC domain, which is also found in other proteins. KaiC performs autophosphorylation and acts as its own transcriptional repressor. Kai proteins (KaiA and KaiC) appear to positively and negatively regulate kaiBC transcription, respectively, which is consistent with a transcription/translation oscillatory (TTO) feedback model, believed to be at the core of all self-sustained circadian timers. However, the cyanobacterial circadian clock can function without de novo synthesis of clock gene mRNAs and clock proteins: the period is accurately determined without TTO feedback, and the system remains temperature-compensated. It has been demonstrated that the three purified Kai proteins (KaiA, KaiB and KaiC) form a temperature-compensated molecular oscillator in vitro that exhibits rhythmic phosphorylation and dephosphorylation of KaiC. A negative-stain electron microscopy study of Synechococcus elongatus (Thermosynechococcus elongatus) and Thermosynechococcus elongatus BP-1 KaiA-KaiC complexes, in combination with site-directed mutagenesis, reveals that KaiA binds exclusively to the CII half of the KaiC hexamer. The EM-based model of the KaiA-KaiC complex reveals protein-protein interactions at two sites: the known interaction of the flexible C-terminal KaiC peptide with KaiA, and a second postulated interaction between the apical region of KaiA and the ATP-binding cleft on KaiC. This model brings KaiA mutation sites that alter clock period or abolish rhythmicity into contact with KaiC and suggests how KaiA might regulate KaiC phosphorylation.
Tuberous sclerosis (TSC) is an autosomal dominant disorder caused by a mutation in either the TSC1 or TSC2 tumour suppressor gene. The disease is characterised by hamartomas in one or more organs (including brain, skin, heart and kidney), giving rise to a broad phenotypic spectrum (including seizures, mental retardation, renal dysfunction and dermatological abnormalities). TSC2 encodes tuberin, a putative GTPase-activating protein for Rap1 and Rab5. The TSC1 gene was recently identified and codes for hamartin, a novel protein with no significant similarity to tuberin or any other known vertebrate protein. Hamartin and tuberin have been shown to associate physically in vivo, their interaction being mediated by predicted coiled-coil domains. It is thought that hamartin and tuberin function in the same complex, rather than in separate pathways. Moreover, because oligomerisation of the hamartin C-terminal coiled-coil domain is inhibited by the presence of tuberin, it is possible that tuberin acts as a chaperone, preventing hamartin self-aggregation. Tuberin is a widely expressed 1784-amino-acid protein. Expression of the wild-type gene in TSC2 mutant tumour cells inhibits proliferation and tumorigenicity. This "suppressor" activity is encoded by a functional domain in the C terminus that shares similarity with the GTPase-activating protein Rap1GAP. It is thought that tuberin functions as a Rab5GAP in vivo to negatively regulate Rab5-GTP activity in endocytosis. It also acts as a GTPase-activating protein (GAP) for the small GTPase Rheb, a direct activator of the protein kinase activity of mTORC1.