Triglyceride lipases (
) are lipolytic enzymes that hydrolyse
ester linkages of triglycerides []. Lipases are widely distributed in animals, plants and prokaryotes. At least three tissue-specific isozymes
exist in higher vertebrates: pancreatic, hepatic and gastric/lingual. These lipases are closely related to each other and to lipoprotein lipase
(), which hydrolyses triglycerides of chylomicrons and very
low density lipoproteins (VLDL) []. Familial human hepatic lipase deficiency is a rare recessive disorder that
results from mutation in position 405 of the mature protein. The disease is characterised by premature atherosclerosis and abnormal circulating
lipoproteins []. The structure of the human hepatic triglyceride lipase gene has been determined [
]. The hepatic lipase gene spans ~60 kb, and contains 8 introns and 9 exons: exon 1 encodes the signal peptide; exon 4, a region that binds
to the lipoprotein substrate; exon 5, an evolutionarily highly-conserved region of potential catalytic function; and exons 6 and 9 encode sequences
rich in basic amino acids, thought to be important in anchoring the enzyme to the endothelial surface by interacting with acidic domains of surface
glycosaminoglycans []. The human lipoprotein lipase gene has an identical exon-intron organisation, with analogous structural domains, supporting the
common evolutionary origin of these two lipolytic enzymes [].
Somatomedin B (SMB), a serum factor of unknown function, is a small cysteine-rich peptide, derived proteolytically from the N terminus of the cell-substrate adhesion protein vitronectin [
]. Cys-rich somatomedin B-like domains are found in a number of proteins [], including ectonucleotide pyrophosphatase/phosphodiesterase family member proteins (previously known as plasma-cell membrane glycoprotein) [] and placental protein 11 (also known as Poly(U)-specific endoribonuclease), which appears to possess amidolytic activity. The SMB domain of vitronectin has been demonstrated to interact with both the urokinase receptor and the plasminogen activator inhibitor-1 (PAI-1), and the conserved cysteines of the NPP1 somatomedin B-like domain have been shown to mediate homodimerisation [
]. The SMB domain contains eight Cys residues, arranged into four disulphide bonds. It has been suggested that the active SMB domain may be permitted considerable disulphide bond heterogeneity or variability, provided that the Cys25-Cys31 disulphide bond is preserved. The three dimensional structure of the SMB domain is extremely compact and the disulphide bonds are packed in the centre of the domain forming a covalently bonded core [
]. The structure of the SMB domain presents a new protein fold, with the only ordered secondary structure being a single-turn α-helix and a single-turn 3(10)-helix [].
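To give a sense of the scale of the disulphide heterogeneity mentioned above, the following minimal sketch counts how many ways a set of cysteines can be paired into disulphide bonds. It is a purely combinatorial upper bound: it assumes every pairing is equally possible and ignores sterics and folding. The function name `count_disulphide_pairings` is an illustrative choice, not something taken from the cited work.

```python
def count_disulphide_pairings(n_cys: int) -> int:
    """Count the ways to pair n_cys cysteines into n_cys / 2 disulphide bonds.

    This is the double factorial (n_cys - 1)!!, i.e. the number of perfect
    matchings on n_cys items. Sterically impossible pairings are NOT excluded.
    """
    if n_cys < 0 or n_cys % 2 != 0:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    # The first free cysteine can pick any of (n - 1) partners, the next
    # free one any of (n - 3), and so on.
    for partners in range(n_cys - 1, 0, -2):
        count *= partners
    return count


# All eight Cys residues free to pair in any way.
all_pairings = count_disulphide_pairings(8)
# One bond (Cys25-Cys31 in the text) held fixed, six residues left to pair.
with_fixed_bond = count_disulphide_pairings(6)
print(all_pairings, with_fixed_bond)
```

Under these assumptions, fixing a single bond reduces the number of formally possible pairings from 105 to 15.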
TPP1 (Est3 in yeast) is a component of the telomerase holoenzyme (shelterin complex), involved in telomere replication. It has been demonstrated that TPP1 dimerises and binds to DNA and RNA. Furthermore, TPP1 stimulates the dissociation of RNA/DNA hetero-duplexes [
,
]. Yeast telomerase protein TPP1 (Est3) is a novel type of GTPase []. The key residues in Saccharomyces cerevisiae are an Asp at residue 86 and the Arg at residue 110. The Asp is totally conserved in the family, whereas the Arg is not so well conserved. The N-terminal of TPP1 is likely to be the binding surface for TIN2, whereas the C terminus probably binds to POT1, thereby tethering POT1 to the shelterin complex [
]. The complex bound to telomeric DNA increases the activity and processivity of the human telomerase core enzyme, thus helping to maintain the length of the telomeres [,
,
]. The human shelterin complex includes six proteins: telomere repeat binding factor 1 (TRF1); TRF2; repressor/activator protein 1 (RAP1); TRF1-interacting nuclear protein 2 (TIN2); TIN2-interacting protein 1 (TPP1), also known as ACD from adrenocortical dysplasia protein; and protection of telomeres 1 (POT1) [
].
Nesprins (nuclear envelope spectrin-repeat proteins) are a family of giant spectrin-repeat containing proteins that act as versatile intracellular protein scaffolds [
]. They are characterised by a central extended spectrin-repeat (SR) and a C-terminal Klarsicht/ANC-1/Syne homology (KASH) domain that can associate with Sad1p/UNC-84 (SUN)-domain proteins of the inner nuclear membrane within the periplasmic space of the nuclear envelope (NE) []. This entry represents Nesprin-4 predominantly from mammals. Nesprin-4 links the nucleus to microtubules through its binding to kinesin-1 [
]. It is a component of the linker of the nucleoskeleton and cytoskeleton (LINC) complex, which plays critical roles in nuclear positioning, cell polarisation and cellular stiffness []. This entry also contains the karyogamy meiotic segregation protein 2 (kms2) from the fission yeast
Schizosaccharomyces pombe. Kms2 contains a KASH domain and during interphase colocalizes within the nuclear envelope with the SUN domain-containing protein Sad1 at the site of attachment of the spindle pole body (SPB, the yeast version of the centrosome). Kms2 interacts with the SPB components Cut12 and Pcp1 and the Polo kinase Plo1 and is important for remodelling of the SPB and entry of the cell into mitosis [
].
Antifreeze proteins (AFPs) are a class of proteins that are able to bind to and inhibit the growth of macromolecular ice, thereby permitting an organism to survive subzero temperatures by decreasing the probability of ice nucleation in their bodies [
]. These proteins have been characterised from a variety of organisms, including fish, plants, bacteria, fungi and arthropods. This entry represents insect AFPs of the type found in spruce budworm, Choristoneura fumiferana. The structure of these AFPs consists of a left-handed β-helix with 15 residues per coil [
]. The β-helices of insect AFPs present a highly rigid array of threonine residues and bound water molecules that can effectively mimic the ice lattice. As such, β-helical AFPs provide a more effective coverage of the ice surface compared to the α-helical fish AFPs. A second insect antifreeze from Tenebrio molitor (
) also consists of β-helices, however in these proteins the helices form a right-handed twist; these proteins show no sequence homology to the current entry, but may act by a similar mechanism. The β-helix motif may be used as an AFP structural motif in non-homologous proteins from other (non-fish) organisms as well.
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few [
]. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains. NAD(+) ADP-ribosyltransferase (
) [
,
] is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III [] contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.
Tensins constitute a eukaryotic family of lipid phosphatases that are defined by the presence of two adjacent domains: a lipid phosphatase domain and a C2-like domain. The tensin-type C2 domain has a structure similar to the classical C2 domain (see
) that mediates the Ca2+-dependent membrane recruitment of several signalling proteins. However the tensin-type C2 domain lacks two of the three conserved loops that bind Ca2+, and in this respect it is similar to the C2 domains of PKC-type [
,
]. The tensin-type C2 domain can bind phospholipid membranes in a Ca2+ independent manner []. In the tumour suppressor protein PTEN, the best characterised member of the family, the lipid phosphatase domain was shown to specifically dephosphorylate the D3 position of the inositol ring of the lipid second messenger, phosphatidylinositol-3,4,5-trisphosphate (PIP3). The lipid phosphatase domain contains the signature motif HCXXGXXR present in the active sites of protein tyrosine phosphatases (PTPs) and dual specificity phosphatases (DSPs). Furthermore, two invariant lysines are found only in the tensin-type phosphatase motif (HCKXGKXR) and are suspected to interact with the phosphate groups at positions D1 and D5 of the inositol ring [,
]. The C2 domain is found at the C terminus of the tumour suppressor protein PTEN (phosphatidyl-inositol triphosphate phosphatase). This domain may include a CBR3 loop, indicating a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc suggesting that the C2 domain productively positions the catalytic part of the protein on the membrane. The crystal structure of the PTEN tumour suppressor has been solved [
]. The lipid phosphatase domain has a structure similar to the dual specificity phosphatase (see ). However, PTEN has a larger active site pocket that could be important to accommodate PI(3,4,5)P3.
Proteins known to contain a phosphatase and a C2 tensin-type domain are listed below:
Tensin, a focal-adhesion molecule that binds to actin filaments. It may be involved in cell migration, cartilage development and in linking signal transduction pathways to the cytoskeleton.
Phosphatase and tensin homologue deleted on chromosome 10 protein (PTEN). It antagonizes PI 3-kinase signalling by dephosphorylating the 3-position of the inositol ring of PI(3,4,5)P3 and thus inactivates downstream signalling. It plays major roles both during development and in the adult to control cell size, growth, and survival.
Auxilin. It binds clathrin heavy chain and promotes its assembly into regular cages.
Cyclin G-associated kinase or auxilin-2. It is a potential regulator of clathrin-mediated membrane trafficking.
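The two signature motifs described above, HCXXGXXR for PTPs and DSPs in general and HCKXGKXR for the tensin-type phosphatases, can be written as regular expressions. The sketch below is illustrative only: the function name and the test sequences are made up for the example and are not real protein sequences, and a motif hit on its own does not establish that a protein belongs to the family.

```python
import re

# X in the motif means "any amino acid"; here that is any uppercase letter.
# Generic PTP/DSP active-site signature: H-C-X-X-G-X-X-R
PTP_SIGNATURE = re.compile(r"HC[A-Z]{2}G[A-Z]{2}R")
# Tensin-type variant with the two invariant lysines: H-C-K-X-G-K-X-R
TENSIN_SIGNATURE = re.compile(r"HCK[A-Z]GK[A-Z]R")


def classify_signature(sequence: str) -> str:
    """Report which of the two signatures, if any, occurs in a sequence."""
    seq = sequence.upper()
    # Check the more specific motif first: every tensin-type hit is also
    # a hit for the generic pattern.
    if TENSIN_SIGNATURE.search(seq):
        return "tensin-type"
    if PTP_SIGNATURE.search(seq):
        return "generic PTP/DSP"
    return "none"


# Made-up peptides, used purely to exercise the patterns.
print(classify_signature("AAHCKAGKGRTT"))  # contains HCKAGKGR
print(classify_signature("AAHCSAGVGRTT"))  # contains HCSAGVGR
print(classify_signature("AAAAAAAAAAAA"))
```

Testing the tensin-type pattern first matters because it is a strict subset of the generic one.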
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. This family represents a group of animal proteins that play important roles in both normal physiology and disease [
]. Proteins in this family are frequently overexpressed by common tumors. Consequently, they are considered a possible therapeutic target in several tumors, particularly in prostate, breast, and lung cancer, but their role in some CNS/neural tumors (gliomas, neuroblastomas, medulloblastomas) may also be of interest []. This small family represents [Phe13]-bombesin receptor (Bombesin receptor subtype 4, BRS4) from Bombina orientalis (oriental fire-bellied toad) and similar proteins from amphibia. The recently-identified BRS-4 bombesin receptor subtype is found only in the brain, primarily in the cortex and forebrain, and at low levels in the midbrain. The relative rank potency of bombesin-like peptides for this receptor is [Phe13]bombesin > [Leu13]bombesin > GRP > neuromedin B [
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Computational methods, including percent identity plots, hydropathy profiles and BLAST, have been used to analyse a gene-rich cluster at human chromosome 12p13 and to compare it with its syntenic region in mouse chromosome 6 [
,
,
]. Of 6 genes identified, a number were novel receptors, including GPR153 (also known as PGR1) and GPR162 (also known as GRCA) []. GPR153 is a cerebellar target of the Gli1 transcription factor, which is involved in the maintenance and proliferation of granule neuron precursor cells in the cerebellum, and like GPR162 has a noted role in food uptake and decision making processes []. This entry represents G-protein coupled receptor 153, identified by conserved sections along the length of the protein that characterise GPR153 and distinguish it from closely related GPR162 proteins.
This entry represents a six transmembrane helix rhomboid domain. This domain is found in serine peptidases belonging to the MEROPS peptidase family S54 (Rhomboid, clan ST). They are integral membrane proteins related to the Drosophila melanogaster (Fruit fly) rhomboid protein
. Members of this family are found in archaea, bacteria and eukaryotes.
The rhomboid protease cleaves type-1 transmembrane domains using a catalytic dyad composed of serine and histidine. The active site is embedded within the membrane and the active site residues are on different transmembrane regions. From the tertiary structure of the Escherichia coli homologue GlpG [
] it was shown that hydrolysis occurs in a fluid filled cavity within the membrane. Initially, a catalytic triad including a highly conserved asparagine had been proposed, but this residue has been shown not to be essential []. Drosophila rhomboid cleaves the transmembrane proteins Spitz, Gurken and Keren within their transmembrane domains to release a soluble TGFalpha-like growth factor. Cleavage occurs in the Golgi, following translocation of the substrates from the endoplasmic reticulum membrane by Star, another transmembrane protein. The growth factors are then able to activate the epidermal growth factor receptor [,
]. Few substrates of mammalian rhomboid homologues have been determined, but rhomboid-like protein 2 has been shown to cleave ephrin B3 [
]. Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite. Invasion of host cells first requires their recognition and this is achieved by parasite transmembrane adhesins interacting with host cell receptors. Before the parasite can enter a host cell the adhesins must be released by cleavage. In Toxoplasma rhomboid TgROM5 cleaves the adhesins, and in Plasmodium, which lacks a TgROM5 orthologue, PfROMs 1 and 4 cleave the diverse array of malaria parasite adhesins []. This entry also includes catalytically inactive rhomboid protease homologues, iRhom1/2, which are metazoan-specific and play crucial roles within the secretory pathway, including protein degradation, trafficking regulation, and inflammatory signaling [
]. They regulate ADAM17 protease, acting as trafficking factors that escort ADAM17 from the ER to the later secretory pathway. They are required for the cleavage and release of a variety of membrane-associated proteins [,
]. iRhoms have been linked to the development and progression of several autoimmune diseases, including rheumatoid arthritis, lupus nephritis and hemophilic arthropathy [], and also to neurological disorders such as Alzheimer's and Parkinson's diseases, inflammation, cancer and skin diseases [].
ADP-ribosylation factor-binding protein GGA3, also known as Golgi-localised gamma ear-containing ARF-binding protein 3, plays a role in protein sorting and trafficking between the trans-Golgi network (TGN) and endosomes. It is required for the lysosomal degradation of BACE (beta-site APP-cleaving enzyme), the protease that initiates the production of beta-amyloid, which causes Alzheimer's disease [
]. It also plays a key role in GABA transmission, which is important in the regulation of anxiety-like behaviours []. GGA3 mediates the ARF-dependent recruitment of clathrin to the TGN [] and binds ubiquitinated proteins and membrane cargo molecules []. GGA3 belongs to the GGA family of proteins, which have a multidomain structure consisting of an N-terminal VHS domain linked by a short proline-rich linker to a GAT (GGA and TOM) domain, which is followed by a long flexible linker to the C-terminal appendage, GAE (Gamma-Adaptin Ear) domain. The GAT domain is a region of homology of ~130 residues, which is found in eukaryotic GGAs (for Golgi-localized, gamma ear-containing ADP ribosylation factor (ARF)-binding proteins) and vertebrate TOMs (for target of myb). The GAT domain is found in its entirety only in GGAs, although at the C terminus it shares partial sequence similarity with a short region of TOMs. The GAT domain is found in association with other domains, such as VHS and GAE. The GAT domain of GGAs serves as a molecular anchor of GGA to trans-Golgi network (TGN) membranes via its interaction with the GTP-bound form of a member of the ARF family of small GTPases and can bind specifically to the Rab GTPase effector rabaptin5 and to ubiquitin [
,
,
,
]. The GGA-GAT domain possesses an all α-helical structure, composed of four helices arranged in a somewhat unusual topology, which has been called the helical paper clip. The overall structure shows that the GAT domain has an elongated shape, in which the longest helix participates in two small independent subdomains: an N-terminal helix-loop-helix hook and a C-terminal three-helix bundle. The hook subdomain has been shown to be both necessary and sufficient for ARF-GTP binding and Golgi targeting of GGAs. The N-terminal hook subdomain contains a hydrophobic patch, which is found to interact directly with ARF [
]. It has been proposed that this interaction might stabilise the hook subdomain [
]. The C-terminal three-helix bundle is involved in the binding with Rabaptin5 and ubiquitin []. This entry represents the GAT domain found in ADP-ribosylation factor-binding protein GGA3.
Amyloidogenic glycoprotein, intracellular domain, conserved site
Type:
Conserved_site
Description:
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques [
,
,
]. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products [
,
]. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, Kunitz domain, E2 domain, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility []. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death [,
]. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters. APP can be processed by different sets of enzymes. In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling). In the amyloidogenic (plaque-forming) pathway, APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact. This entry represents a conserved signature pattern located in the intracellular domain and found towards the C-terminal extremity of these proteins.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes [
]. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base [
]. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [,
]. Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes [
]. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins, amongst others, are activated by this route []. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus. This entry contains serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB). This family of serine peptidases is highly similar in both structural and enzymatic features to SAM-P45 peptidases, which are known as a target enzyme of Streptomyces subtilisin inhibitor (SSI), from Streptomyces albogriseolus. DHP-A effectively forms chiral intermediates of 1,4-dihydropyridine calcium antagonists, which suggests the feasibility of developing DHP-A as a new commercial enzyme for use in the chiral drug industry [
].
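The relationship between the linear order of the catalytic triad and the clan, as described above (HDS for clan PA, DHS for clan SB, SDH for clan SC), can be expressed as a simple lookup. The sketch below is a minimal illustration: the function names are hypothetical, the residue positions passed in are example inputs rather than curated annotations, and a real clan assignment relies on structural evidence, not on residue order alone.

```python
# Linear order of the catalytic triad residues -> clan, as given in the text.
CLAN_BY_TRIAD_ORDER = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(positions: dict) -> str:
    """Return the triad residues sorted by sequence position, e.g. 'HDS'.

    `positions` maps the one-letter residue code (H, D, S) to its position
    in the polypeptide chain.
    """
    if set(positions) != {"H", "D", "S"}:
        raise ValueError("expected exactly the residues H, D and S")
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(positions: dict) -> str:
    """Look up the clan whose triad order matches, or 'unknown'."""
    return CLAN_BY_TRIAD_ORDER.get(triad_order(positions), "unknown")


# Example inputs (illustrative positions only).
print(clan_for_triad({"H": 57, "D": 102, "S": 195}))   # His before Asp before Ser
print(clan_for_triad({"D": 32, "H": 64, "S": 221}))    # Asp before His before Ser
print(clan_for_triad({"S": 150, "D": 340, "H": 400}))  # Ser before Asp before His
```

Because the geometric arrangement of the triad is similar across these clans, the linear order is a convenient, though not sufficient, hint about evolutionary origin.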
The tryptophan RNA-binding attenuation protein (TRAP) regulates expression of the tryptophan biosynthetic genes in Bacillus sp. by binding to the leader region of the nascent trp operon mRNA [
]. The crystal structure of the Trp RNA-binding attenuation protein of Bacillus subtilis (mtrB, ) has been solved [
]. TRAP forms an oligomeric ring consisting of 11 single-domain subunits, where each subunit adopts a double-stranded β-helix structure with the appearance of a β-sandwich of distinct architecture and jelly-roll fold. The 11 subunits are stabilised by 11 inter-subunit strands, forming a β-wheel with a large central hole. TRAP is activated by binding to tryptophan in clefts between adjacent β-strands, which induces conformational changes in the protein. Activated TRAP binds an mRNA target sequence consisting of 11 (G/U)AG repeats, separated by 2-3 spacer nucleotides. The spacer nucleotides do not make direct contact with the TRAP protein, but they do influence the conformation of the RNA, which might influence the specificity of TRAP []. This superfamily represents a domain with a TRAP-like double-stranded β-helix topology. This domain is found in TRAP proteins, as well as in the hypothetical protein SPyM3_0169 from Streptococcus pyogenes. SPyM3_0169 contains 9 domains per ring-like trimer, where each subunit contains three structural repeats.
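The TRAP target described above, 11 (G/U)AG repeats separated by 2-3 spacer nucleotides, maps naturally onto a regular expression. The following is a minimal sketch that takes the description literally: the function name and the test RNA are invented for illustration, and the pattern makes no attempt to model the RNA conformation effects mentioned in the text.

```python
import re

REPEAT = r"[GU]AG"        # one (G/U)AG triplet
SPACER = r"[ACGU]{2,3}"   # 2-3 spacer nucleotides of any identity

# Ten "repeat + spacer" units followed by the eleventh repeat.
TRAP_SITE = re.compile(rf"(?:{REPEAT}{SPACER}){{10}}{REPEAT}")


def find_trap_site(rna: str):
    """Return the (start, end) span of the first candidate site, or None."""
    match = TRAP_SITE.search(rna.upper())
    return match.span() if match else None


# Synthetic RNA: 11 GAG triplets with 2-nt UU spacers, flanked by filler.
site = "UU".join(["GAG"] * 11)
print(find_trap_site("AAAA" + site + "CCCC"))

# Only 10 repeats: too short to satisfy the pattern.
print(find_trap_site("UU".join(["GAG"] * 10)))
```

A hit from this pattern is only a candidate; binding in vivo also depends on context that a sequence pattern cannot capture.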
Retinoblastoma-like and retinoblastoma-associated proteins may have a function in cell cycle regulation. They form a complex with adenovirus E1A and Simian virus 40 (SV40) large T antigen, and may bind and modulate the function of certain cellular proteins with which T and E1A compete for pocket binding. The proteins may act as tumor suppressors, and are potent inhibitors of E2F-mediated trans-activation.
This domain has the cyclin fold []. The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B-box portion of the pocket; the A-box portion appears to be required for the stable folding of the B box (see
). Also highly conserved is the extensive A-B interface, suggesting that it may be an additional protein-binding site. The A and B boxes each contain the cyclin-fold structural motif, with the LxCxE-binding site on the B-box cyclin fold being similar to a Cdk2-binding site of cyclin A and to a TBP-binding site of TFIIB [
]. The A and B boxes are found at the C-terminal end of the protein; the A-box is on the N-terminal side of the B-box.
The MIR domain is named after three of the proteins in which it occurs: protein Mannosyltransferase (
), Inositol 1,4,5-trisphosphate receptor (IP3R) and Ryanodine receptor (RyR). MIR domains have also been found in eukaryotic stromal cell-derived factor 2 (SDF-2) [
] and in Chlamydia trachomatis protein CT153. The MIR domain may have a ligand transferase function. This domain has a closed β-barrel structure with a hairpin triplet, and has an internal pseudo-threefold symmetry. The MIR motifs that make up the MIR domain consist of ~50 residues and are often found in multiple copies. Inositol 1,4,5-trisphosphate (InsP3) is an intracellular second messenger that transduces growth factor and neurotransmitter signals. InsP3 mediates the release of Ca2+ from intracellular stores by binding to specific Ca2+ channel-coupled receptors. Ryanodine receptors are involved in communication between transverse-tubules and the sarcoplasmic reticulum of cardiac and skeletal muscle. The proteins function as Ca2+-release channels following depolarisation of transverse-tubules []. The function is modulated by Ca2+, Mg2+, ATP and calmodulin. Deficiency in the ryanodine receptor may be the cause of malignant hyperthermia (MH) and of central core disease of muscle (CCD) []. Protein O-mannosyltransferases transfer mannose from DOL-P-mannose to Ser or Thr residues on proteins.
This entry represents a group of haemoglobin-like proteins found in eubacteria, cyanobacteria, protozoa, algae and plants, but not in animals or yeast. These proteins have a truncated 2-over-2 rather than the canonical 3-over-3 α-helical sandwich fold []. They include:
HbN (or GlbN): a truncated haemoglobin-like protein that binds oxygen cooperatively with a very high affinity and a slow dissociation rate, which may exclude it from oxygen transport. It appears to be involved in bacterial nitric oxide detoxification and in nitrosative stress [].
Cyanoglobin (or GlbN): a truncated haemoprotein found in cyanobacteria that has high oxygen affinity, and which appears to serve as part of a terminal oxidase, rather than as a respiratory pigment [].
HbO (or GlbO): a truncated haemoglobin-like protein with a lower oxygen affinity than HbN. HbO associates with the bacterial cell membrane, where it significantly increases oxygen uptake over membranes lacking this protein. HbO appears to interact with a terminal oxidase, and could participate in an oxygen/electron-transfer process that facilitates oxygen transfer during aerobic metabolism [].
Glb3: a nuclear-encoded truncated haemoglobin from plants that appears more closely related to HbO than HbN. Glb3 from Arabidopsis thaliana (Mouse-ear cress) exhibits an unusual concentration-independent binding of oxygen and carbon dioxide [].
The macrolide antibiotic rapamycin and the cytosolic protein FKBP12 can form a complex which specifically inhibits the TORC1 complex, leading to growth arrest. The FKBP12-rapamycin complex interferes with TORC1 function by binding to the FKBP12-rapamycin binding domain (FRB) of the Tor proteins. This entry represents the FRB domain []. Proteins containing this domain include Tor proteins, which are serine/threonine kinases conserved from fungi to humans. While higher eukaryotes such as humans possess a single Tor protein, yeasts contain two (Tor1 and Tor2) []. In budding yeast, the Tor2 protein exists in two distinct multi-component complexes, TORC1 and TORC2. TORC1 regulates cell growth by regulating many growth-related processes and is rapamycin sensitive, while TORC2 regulates the cell cytoskeleton and is rapamycin insensitive. Budding yeast TORC1 consists of either Tor1 or Tor2 in complex with Kog1, Lst8 and Tco89, while TORC2 is composed of Avo1, Avo2, Tsc11, Lst8, Bit61, Slm1, Slm2 and Tor2 []. In both yeast and mammals, FKBP12-rapamycin binds to Tor (Tor1, Tor2, or mTOR) in TORC1, but not to Tor (Tor2 or mTOR) in TORC2. It has been suggested that the architecture of TORC2 or its unique composition might be responsible for the observed rapamycin resistance [].
This entry represents a structural domain found in several acyl-CoA acyltransferase enzymes. This domain has a 3-layer α/β/α structure that contains mixed β-sheets, and can be found in the following proteins:
N-acetyl transferase (NAT) family members, including aminoglycoside N-acetyltransferases [], the histone acetyltransferase domain of P300/CBP associating factor PCAF [], the catalytic domain of GCN5 histone acetyltransferase [], and diamine acetyltransferase 1 [].
Autoinducer synthetases, such as protein LasI [] and acyl-homoserine-lactone synthase EsaI [].
Leucyl/phenylalanyl-tRNA-protein transferase (LFTR), a close relative of the non-ribosomal peptidyltransferases; there is a deletion of the N-terminal half of the N-terminal NAT-like domain after the domain duplication/swapping events [].
Ornithine decarboxylase antizyme, which may have evolved a different function for this domain, although the putative active site maps to the same location in the common fold.
Arginine N-succinyltransferase, alpha chain, AstA, which contains an extra C-terminal domain that is similar to the double-psi β-barrel fold domain (missing one strand and untangled ψ-loops).
Several proteins carry a duplication of this domain, which consists of two NAT-like domains swapped with the C-terminal strands, including:
N-myristoyl transferase (NMT) [].
FemXAB non-ribosomal peptidyl transferases, including methicillin-resistance protein FemA (transfers a glycyl residue from tRNA-Gly) [] and peptidyl transferase FemX [].
Hypothetical protein cg14615-pa from Drosophila melanogaster (Fruit fly).
This superfamily represents domains with an immunoglobulin-like (Ig-like) fold, which consists of a β-sandwich of seven or more strands in two sheets with a Greek-key topology. Ig-like domains are one of the most common protein modules found in animals, occurring in a variety of different proteins. These domains are often involved in interactions, commonly with other Ig-like domains via their β-sheets []. Domains within this fold-family share the same structure, but can diverge with respect to their sequence. Based on sequence, Ig-like domains can be classified as V-set domains (antibody variable domain-like), C1-set domains (antibody constant domain-like), C2-set domains, and I-set domains (antibody intermediate domain-like). Proteins can contain more than one of these types of Ig-like domains. For example, in the human T-cell receptor antigen CD2, domain 1 (D1) is a V-set domain, while domain 2 (D2) is a C2-set domain, both domains having the same Ig-like fold [].
Domains with an Ig-like fold can be found in many diverse proteins in addition to immunoglobulin molecules. For example, Ig-like domains occur in several different types of receptors (such as various T-cell antigen receptors), several cell adhesion molecules, MHC class I and II antigens, as well as the hemolymph protein hemolin, and the muscle proteins titin, telokin and twitchin.
All known carboxypeptidases are either metallo carboxypeptidases or serine carboxypeptidases. The catalytic activity of the serine carboxypeptidases, like that of the trypsin family serine proteases, is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine [].
Proteins known to be serine carboxypeptidases include:
Barley and wheat serine carboxypeptidases I, II, and III [].
Yeast carboxypeptidase Y (YSCY) (gene PRC1), a vacuolar protease involved in degrading small peptides.
Yeast KEX1 protease, involved in killer toxin and alpha-factor precursor processing.
Fission yeast sxa2, a probable carboxypeptidase involved in degrading or processing mating pheromones [].
Penicillium janthinellum carboxypeptidase S1 [].
Aspergillus niger carboxypeptidase pepF.
Aspergillus saitoi carboxypeptidase cpdS.
Vertebrate protective protein / cathepsin A [], a lysosomal protein which is not only a carboxypeptidase but also essential for the activity of both beta-galactosidase and neuraminidase.
Mosquito vitellogenic carboxypeptidase (VCP) [].
Naegleria fowleri virulence-related protein Nf314 [].
Yeast hypothetical protein YBR139w.
Caenorhabditis elegans hypothetical proteins C08H9.1, F13D12.6, F32A5.3, F41C3.5 and K10B2.2.
In higher plants and fungi, serine carboxypeptidases are found in the cell vacuoles. In animal cells, serine carboxypeptidases are found in lysosomes [].
The sequences surrounding the active site histidine residue are highly conserved in all these serine carboxypeptidases.
Proteins in this entry are E3 ubiquitin-protein ligases that mediate ubiquitination and subsequent proteasomal degradation of target proteins. Proteins in this entry include Sina and Sinah (Sina homologue) from flies and SIAH1/2 from humans.
The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal C3HC4-type RING finger domain. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that Sina targets TTK88 for degradation, thereby promoting the R7 pathway. Murine and human homologues of Sina have also been identified. The human homologue SIAH1 [] also binds E2 enzymes (UbcH5) and, through a series of physical interactions, targets beta-catenin for ubiquitin-mediated degradation. SIAH1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation [].
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins []. It has been shown to bind RNA []. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently [].
According to structural analyses [], the KH domain can be separated into two groups. The first group, or type-1, contains a β-α-α-β-β-α structure, whereas in type-2 the last two β-sheets are located in the N-terminal part of the domain (α-β-β-α-α-β). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-2 KH domain include the eukaryotic and prokaryotic S3 family of ribosomal proteins, and the prokaryotic GTP-binding protein Era.
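As a rough illustration, the short conserved region VIGXXGXXI mentioned above can be written as a regular expression, assuming that X stands for any single amino acid. The function name and the example sequence are hypothetical; an exact-pattern search of this kind is only a sketch, and real KH domains are normally detected with profile-based methods.

```python
import re

# Assumption: each 'X' in VIGXXGXXI is any one residue (one-letter code).
# The lookahead reports overlapping occurrences as well.
KH_CORE = re.compile(r"(?=VIG[A-Z]{2}G[A-Z]{2}I)")


def find_kh_core(seq: str) -> list[int]:
    """Return 0-based start positions of VIGXXGXXI-like stretches."""
    return [m.start() for m in KH_CORE.finditer(seq.upper())]


if __name__ == "__main__":
    # Made-up peptide containing one VIGKKGSTI stretch.
    print(find_kh_core("MKAVIGKKGSTIKEL"))
```

A hit only indicates a candidate region; it says nothing about whether the surrounding fold is type-1 or type-2.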
The Ferric uptake regulator (Fur) family includes metal ion uptake regulator proteins, which are responsible for controlling the intracellular concentration of iron in many bacteria. The Fur protein (a dimer having one Fe2+ coordinated per monomer) acts as an iron-responsive, DNA-binding repressor protein that binds to a 'Fur box' with the consensus sequence GATAATGATAATCATTATC in the promoter of iron-regulated genes. Under low-iron conditions, the Fur protein is released from the promoter and transcription resumes []. Some members sense metal ions other than Fe2+. For example, the zinc uptake regulator (Zur) responds to Zn2+ [], the manganese uptake regulator (Mur) responds to Mn2+, and the nickel uptake regulator (Nur) responds to Ni2+ []. Other members sense signals other than metal ions. For example, PerR is a metal-dependent sensor of hydrogen peroxide. PerR regulates DNA-binding activity through metal-based protein oxidation, and co-ordinates Mn2+ or Fe2+ at its regulatory site []. Fur proteins can also control zinc homeostasis and are the subject of research on the pathogenesis of mycobacteria []. Fur family proteins contain an N-terminal winged-helix DNA-binding domain followed by a dimerization domain; this entry spans both those domains [].
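The Fur box consensus quoted above lends itself to a simple sequence scan. The following is a minimal sketch, assuming a single-strand search that tolerates a chosen number of mismatches against the 19-nucleotide consensus; the function names and the mismatch threshold are illustrative assumptions, not part of any published method.

```python
# Consensus sequence as given in the entry text.
FUR_BOX = "GATAATGATAATCATTATC"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_fur_boxes(dna: str, max_mismatches: int = 3):
    """Return (start, window, mismatches) for each window close to the consensus.

    Assumption: only the given strand is scanned; the threshold is arbitrary.
    """
    dna = dna.upper()
    n = len(FUR_BOX)
    hits = []
    for i in range(len(dna) - n + 1):
        window = dna[i:i + n]
        d = hamming(window, FUR_BOX)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


if __name__ == "__main__":
    # Hypothetical promoter fragment with an exact Fur box embedded.
    print(find_fur_boxes("CCC" + FUR_BOX + "GGG", max_mismatches=0))
```

Natural Fur-regulated promoters often deviate from the consensus, so a practical search would more likely use a position weight matrix; the mismatch count here is just the simplest stand-in.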
The Escherichia coli RuvC gene is involved in DNA repair and in the late step of RecE and RecF pathway recombination []. RuvC protein cleaves cruciform junctions, which are formed by the extrusion of inverted repeat sequences from a super-coiled plasmid and which are structurally analogous to Holliday junctions, by introducing nicks into strands with the same polarity. The nicks leave a 5'-terminal phosphate and a 3'-terminal hydroxyl group, which are ligated by E. coli or Bacteriophage T4 DNA ligases. Analysis of the cleavage sites suggests that DNA topology rather than a particular sequence determines the cleavage site. RuvC protein also cleaves Holliday junctions that are formed between gapped circular and linear duplex DNA by the function of RecA protein. The active form of RuvC protein is a dimer. This is mechanistically suited for an endonuclease involved in swapping DNA strands at the crossover junctions. It is inferred that RuvC protein is an endonuclease that resolves Holliday structures in vivo []. RuvC is a small protein of about 20 kDa. It requires and binds a magnesium ion. The structure of E. coli RuvC is a 3-layer α-β sandwich containing a 5-stranded β-sheet sandwiched between 5 α-helices [].
This entry represents the C-terminal domain present in YfiR transcription regulator proteins found in Bacillus subtilis []. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis []. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain [].
TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity []. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response [].
Lysosome-associated membrane glycoproteins (lamp) [] are integral membrane proteins, specific to lysosomes, whose exact biological function is not yet clear. Structurally, the lamp proteins consist of two internally homologous lysosome-luminal domains separated by a proline-rich hinge region; at the C-terminal extremity there is a transmembrane region (TM) followed by a very short cytoplasmic tail (C). In each of the duplicated domains, there are two conserved disulphide bonds. This structure is schematically represented in the figure below.
 +-----+            +-----+         +-----+            +-----+
 |     |            |     |         |     |            |     |
xCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxxxCxxxxxCxxxxxxxxxxxxCxxxxxCxxxxxxxx
+--------------------------++Hinge++--------------------------++TM++C+
In mammals, there are two closely related types of lamp: lamp-1 and lamp-2, which form major components of the lysosome membrane. In chicken, lamp-1 is known as LEP100. Also included in this entry is the macrophage protein CD68 (or macrosialin) [], a heavily glycosylated integral membrane protein whose structure consists of a mucin-like domain followed by a proline-rich hinge, a single lamp-like domain, a transmembrane region and a short cytoplasmic tail. Similar to CD68, mammalian lamp-3, which is expressed in lymphoid organs, dendritic cells and in lung, contains all the C-terminal regions but lacks the N-terminal lamp-like region []. In a lamp-family protein from nematodes [] only the part C-terminal to the hinge is conserved.
TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity []. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response.
This entry represents the C-terminal domain found in a number of different TetR transcription regulator proteins found mainly in Actinobacteria []. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis []. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain [].
TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity []. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response []. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis []. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain [].
This entry represents the C-terminal domain of the TetR transcriptional repressor found in SCO1712 proteins from Streptomyces coelicolor, which act as regulators of antibiotic production [].
In eukaryotes, the Nascent polypeptide-Associated Complex (NAC) is a heterodimeric cytosolic protein complex composed of alpha and beta subunits. NAC binds reversibly to the ribosome, where it is in contact with nascent chains as they emerge from the ribosome and protects them against inappropriate interaction with cytosolic factors. However, the cellular function of NAC seems to be much more diverse, as it is also involved in transcription regulation and mitochondrial translocation []. Alpha and beta NACs share homology with each other; both contain a NAC A/B domain, which is responsible for dimerisation of the complex. In archaea no beta NAC proteins are found; the complex is a homodimer of NAC alpha []. These proteins have an additional ubiquitin-associated (UBA) domain, which suggests the involvement of NAC in the cellular protein quality control system via the ubiquitination pathway [].
This entry represents the UBA domain found at the C terminus of the Nascent polypeptide-Associated Complex (NAC) subunit alpha. This domain is also found in HYPK (Huntingtin-interacting protein K), in which it mediates a constitutive interaction with the Naa15 auxiliary subunit of the N-terminal acetyltransferase NatA. NatA associates with the ribosome and nascent polypeptides, suggesting that HYPK also interacts with these polypeptides, possibly to facilitate Nt-acetylation fidelity [].
Probable serine/threonine-protein kinase GDT family
Type:
Family
Description:
The life-cycle of Dictyostelium consists of two distinct phases: growth and development []. The control of the growth-differentiation transition (GDT) is not fully understood, and only a handful of genes involved in the process have been elucidated. Amongst these is a family containing at least 9 closely related genes, of which gdt1 and gdt2 were the first members to be identified []. It is thought that the different family members may control similar cellular processes, but respond to different environmental cues [].
The gdt1 and gdt2 gene products, GDT1 and GDT2, are similar in many ways, but there are important differences. GDT1 is a 175 kDa protein with 4 putative transmembrane (TM) domains []. The C-terminal amino acid sequence shares some similarity with the catalytic domain of protein kinases; while this domain is well conserved in GDT2, it is thought to be non-functional in GDT1 []. Similar observations have been made for other family members: for example, GDT3 and GDT4 encode complete, well conserved kinase domains, and are therefore probably functional []; conversely, GDT6 and GDT8 appear to encode proteins with degenerate protein kinase domains; the C-terminal sequence of GDT5 seems unrelated to protein kinases; and GDT7 appears to be truncated, stopping before the protein kinase domain.
This PIN domain can be found in the Pyrobaculum aerophilum proteins Pae0151 (also known as VapC3) and Pae2754 (also known as VapC9), and their homologues []. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Dissociation of the protein complex activates the ribonuclease activity of the toxin by an as yet undefined mechanism [].
PIN domains are small protein domains identified by the presence of three strictly conserved acidic residues. Apart from these three residues, there is poor sequence conservation []. PIN domains are found in eukaryotes, eubacteria and archaea. In eukaryotes they are ribonucleases involved in nonsense-mediated mRNA decay [] and in processing of 18S ribosomal RNA []. In prokaryotes, they are the toxic components of toxin-antitoxin (TA) systems, their toxicity arising by virtue of their ribonuclease activity. The PIN domain TA systems are now called VapBC TAs (virulence associated proteins), where VapB is the inhibitor and VapC the PIN-domain ribonuclease toxin [].
Copines are a widely distributed class of Ca2+-dependent lipid-binding proteins. Most have a characteristic domain structure: two C2 domains in the N-terminal region and a von Willebrand A (VWA) domain in the C-terminal region. They are potentially involved in membrane trafficking, protein-protein interactions, and perhaps even cell division and growth []. In plants, they are known as BONZAI proteins []. The copine family in plants may have effects in promoting growth and development in addition to repressing cell death []. Caenorhabditis elegans copine, also known as Nra1, is involved in nicotinic acetylcholine receptor (nAChR)-mediated sensitivity to nicotine and levamisole []. C2 domains fold into an 8-stranded β-sandwich that can adopt 2 structural arrangements: type I and type II, distinguished by a circular permutation involving their N- and C-terminal β-strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This entry represents the second C2 repeat of copines, C2B, which has a type-I topology. The C2B domains of copine-2, copine-6 and copine-7 have been shown to be responsible for the calcium-dependent membrane association of the protein [].
Complement components C3, C4 and C5 are large glycoproteins that have important functions in the immune response and host defence []. They have a wide variety of biological activities and are proteolytically activated by cleavage at a specific site, forming a- and b-fragments []. A-fragments form distinct structural domains of approximately 76 amino acids, coded for by a single exon within the complement protein gene. The C3a, C4a and C5a components are referred to as anaphylatoxins []: they cause smooth muscle contraction, histamine release from mast cells, and enhanced vascular permeability []; they also mediate chemotaxis, inflammation, and generation of cytotoxic oxygen radicals []. The proteins are highly hydrophilic, with a mainly α-helical structure held together by 3 disulphide bridges [].
Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Complement C4 belongs to the Chido/Rodgers blood group system and is associated with Ch1 to Ch6, WH, Rg1 and Rg2 antigens.
This entry represents the MIR domain superfamily.
The MIR domain is named after three of the proteins in which it occurs: protein Mannosyltransferase, Inositol 1,4,5-trisphosphate receptor (IP3R) and Ryanodine receptor (RyR). MIR domains have also been found in eukaryotic stromal cell-derived factor 2 (SDF-2) [] and in Chlamydia trachomatis protein CT153. The MIR domain may have a ligand transferase function. This domain has a closed β-barrel structure with a hairpin triplet, and has an internal pseudo-threefold symmetry. The MIR motifs that make up the MIR domain consist of ~50 residues and are often found in multiple copies.
Inositol 1,4,5-trisphosphate (InsP3) is an intracellular second messenger that transduces growth factor and neurotransmitter signals. InsP3 mediates the release of Ca2+ from intracellular stores by binding to specific Ca2+ channel-coupled receptors. Ryanodine receptors are involved in communication between transverse-tubules and the sarcoplasmic reticulum of cardiac and skeletal muscle. The proteins function as Ca2+-release channels following depolarisation of transverse-tubules []. The function is modulated by Ca2+, Mg2+, ATP and calmodulin. Deficiency in the ryanodine receptor may be the cause of malignant hyperthermia (MH) and of central core disease of muscle (CCD) []. Protein O-mannosyltransferases transfer mannose from Dol-P-mannose to Ser or Thr residues on proteins.
This entry includes the structural accessory protein ORF7a, also called NS7a, X4 and U122, of Severe Acute Respiratory Syndrome Coronaviruses (SARS-CoV) from the betacoronavirus subgenus Sarbecovirus (lineage B), including SARS-CoV-2. ORF7a/NS7a proteins from the subgenus Sarbecovirus are not related to NS7a proteins from other coronavirus lineages. The structure of ORF7a shows similarities to the immunoglobulin-like fold, with some features resembling those of the D1 domain of ICAM-1, and suggests a binding activity to integrin I domains []. In SARS-CoV-infected cells, ORF7a is expressed and retained intracellularly within the Golgi network []. ORF7a is thought to play an important role during the SARS-CoV replication cycle []. Expression studies of ORF7a have shown that its biological functions include induction of apoptosis through a caspase-dependent pathway, activation of the p38 mitogen-activated protein kinase signalling pathway, inhibition of host protein translation, and suppression of cell growth progression. These results collectively suggest that the ORF7a protein may be involved in virus-host interactions []. Studies in SARS-CoV-2 revealed that ORF7a acts as an antagonist of host tetherin (BST2), disrupting its antiviral effect. ORF7a binds to BST2 and sequesters it to the perinuclear region, thereby preventing its antiviral function at the cell membrane [].
One of the major neuropathological hallmarks of Alzheimer's disease (AD) is the progressive formation in the brain of insoluble amyloid plaques and vascular deposits consisting of beta-amyloid protein (beta-APP) []. Production of beta-APP requires proteolytic cleavage of the large type-1 transmembrane (TM) protein amyloid precursor protein (APP) []. This process is performed by a variety of enzymes known as secretases. To initiate beta-APP formation, beta-secretase cleaves APP to release a soluble N-terminal fragment (APPsBeta) and a C-terminal fragment that remains membrane bound. This fragment is subsequently cleaved by gamma-secretase to liberate beta-APP.
Several independent studies identified a novel TM aspartic protease as the major beta-secretase []. This protein, termed beta-site APP cleaving enzyme 1 (BACE1), shares 64% amino acid sequence similarity with a second enzyme, termed BACE2. Together, BACE1 and BACE2 define a novel family of aspartyl proteases []. Both enzymes share significant sequence similarity with other members of the pepsin family of aspartyl proteases and contain the two characteristic D(T/S)G(T/S) motifs that form the catalytic site. However, by contrast with other aspartyl proteases, BACE1 and BACE2 are type I TM proteins. Each protein comprises a large lumenal domain containing the active centre, a single TM domain and a small cytoplasmic tail.
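The D(T/S)G(T/S) motif named above maps directly onto a character-class regular expression. The sketch below, with a hypothetical function name and a made-up test peptide, lists the positions of such motifs and checks for the two copies expected in an aspartyl protease; it assumes one-letter amino acid codes and is not a substitute for a curated signature.

```python
import re

# D, then T or S, then G, then T or S.
ASP_MOTIF = re.compile(r"D[TS]G[TS]")


def catalytic_motif_positions(seq: str) -> list[int]:
    """Return 0-based start positions of D(T/S)G(T/S) motifs."""
    return [m.start() for m in ASP_MOTIF.finditer(seq.upper())]


def has_two_catalytic_motifs(seq: str) -> bool:
    """True if at least two motifs are present, as expected for the catalytic site."""
    return len(catalytic_motif_positions(seq)) >= 2


if __name__ == "__main__":
    # Hypothetical peptide, not a real BACE sequence.
    print(catalytic_motif_positions("AAADTGSAAAAADSGTAAA"))
```

Because the motif is only four residues long, chance matches are common in long sequences, so the count is best read alongside other evidence such as overall similarity to the pepsin family.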
Acyl carrier protein (ACP) is an essential cofactor in the synthesis of fatty acids by the fatty acid synthetase systems in bacteria and plants. In addition to fatty acid synthesis, ACP is also involved in many other reactions that require acyl transfer steps, such as the synthesis of polyketide antibiotics, biotin precursor, membrane-derived oligosaccharides, and activation of toxins, and functions as an essential cofactor in lipoylation of pyruvate and alpha-ketoglutarate dehydrogenase complexes []. Phosphopantetheine (or pantetheine 4'-phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes, where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups []. Phosphopantetheine is attached to a serine residue in these proteins. The core structure of ACP consists of a four-helical bundle, where helix three is shorter than the others.
Several other proteins share structural homology with ACP, such as the bacterial apo-D-alanyl carrier protein, which facilitates the incorporation of D-alanine into lipoteichoic acid by a ligase, necessary for the growth and development of Gram-positive organisms []; and the thioester domain of the bacterial peptide carrier protein (PCP) found within large modular non-ribosomal peptide synthetases, which are responsible for the synthesis of a variety of microbial bioactive peptides [].
Vinculin is a eukaryotic protein that seems to be involved in the attachment of the actin-based microfilaments to the plasma membrane. Vinculin is located at the cytoplasmic side of focal contacts or adhesion plaques []. In addition to actin, vinculin interacts with other structural proteins such as talin and alpha-actinins.
Vinculin is a large protein of 116 kDa (about 1000 residues). Structurally, the protein consists of an acidic N-terminal domain of about 90 kDa separated from a basic C-terminal domain of about 25 kDa by a proline-rich region of about 50 residues. The central part of the N-terminal domain consists of a variable number (3 in vertebrates, 2 in Caenorhabditis elegans) of repeats of a 110-amino-acid domain.
Alpha-catenins are proteins of about 100 kDa which are evolutionarily related to vinculin []. Catenins are proteins that associate with the cytoplasmic domain of a variety of cadherins. The association of catenins with cadherins produces a complex which is linked to the actin filament network, and which seems to be of primary importance for the cell-adhesion properties of cadherins. Three different types of catenins seem to exist: alpha, beta, and gamma. In terms of their structure, the most significant differences are the absence, in alpha-catenin, of the repeated domain and of the proline-rich segment.
This entry represents the haemagglutinin-esterase fusion glycoprotein (HEF) found specifically in infectious salmon anaemia virus (ISAV), an orthomyxovirus-type virus that is an important fish pathogen in marine aquaculture []. Other viruses, such as influenza C virus, coronaviruses and toroviruses, also contain surface HEF proteins, but whereas they usually bind 9-O-acetylsialic acid receptors, ISAV HEF appears to bind 4-O-acetylsialic acid receptors []. Haemagglutinin-esterase fusion glycoprotein is a multi-functional protein embedded in the viral envelope of ISAV. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion. The serine esterase region of HEF is responsible for the destruction of the receptor, though it appears to be distinct from the esterase domain found in influenza C virus.
Haemagglutinin-esterase glycoproteins must usually be cleaved by the host's trypsin-like proteases to produce two peptides (HEF1 and HEF2) necessary for the virus to be infectious. The cleaved HEF protein can then fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell.
E3 ubiquitin-protein ligase TRIM63, RING finger, HC subclass
Type:
Domain
Description:
Tripartite motif-containing protein 63 (TRIM63), also known as MuRF-1, is an E3 ubiquitin-protein ligase involved in ubiquitin-mediated muscle protein turnover. It is predominantly fast (type II) fibre-associated in skeletal muscle and can bind to many myofibrillar proteins, including titin, nebulin, the nebulin-related protein NRAP, troponin-I (TnI), troponin-T (TnT), myosin light chain 2 (MLC-2), myotilin, and T-cap. The early and robust upregulation of MuRF-1 is triggered by disuse, denervation, starvation, sepsis, or steroid administration, resulting in skeletal muscle atrophy. It also plays a role in maintaining titin M-line integrity. It associates with the periphery of the M-line lattice and may be involved in the regulation of the titin kinase domain. It also participates in muscle stress response pathways and gene expression. MuRF-1 belongs to the C-II subclass of the TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains, a C3HC4-type RING-HC finger, Bbox2, and a coiled coil region, as well as a COS (carboxyl-terminal subgroup one signature) box and an acidic residue-rich (AR) domain. It also harbours a MuRF family-specific conserved box (MFC) between its RING-HC finger and Bbox domains.
This entry represents the C3HC4-type RING-HC finger found in TRIM63.
Vinculin is a eukaryotic protein that seems to be involved in the attachment of the actin-based microfilaments to the plasma membrane. Vinculin is located at the cytoplasmic side of focal contacts or adhesion plaques. In addition to actin, vinculin interacts with other structural proteins such as talin and alpha-actinins.
Vinculin is a large protein of 116 kDa (about 1000 residues). Structurally, the protein consists of an acidic N-terminal domain of about 90 kDa separated from a basic C-terminal domain of about 25 kDa by a proline-rich region of about 50 residues. The central part of the N-terminal domain consists of a variable number (3 in vertebrates, 2 in Caenorhabditis elegans) of repeats of a 110-amino-acid domain.
Alpha-catenins are proteins of about 100 kDa that are evolutionarily related to vinculin. Catenins are proteins that associate with the cytoplasmic domain of a variety of cadherins. The association of catenins with cadherins produces a complex that is linked to the actin filament network and that seems to be of primary importance for the cell-adhesion properties of cadherins. Three different types of catenins seem to exist: alpha, beta, and gamma. In terms of structure, the most significant differences between alpha-catenin and vinculin are the absence, in alpha-catenin, of the repeated domain and of the proline-rich segment.
In the Escherichia coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl cis-trans isomerase (PPIase) activities in vitro. It is composed of at least three domains: an N-terminal domain that mediates association with the large ribosomal subunit, a central substrate-binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues, renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains.
This entry represents the C-terminal domain superfamily of bacterial trigger factor proteins, which has a multi-helical structure consisting of an irregular array of long and short helices. This domain is structurally similar to the peptide-binding domain of the bacterial porin chaperone SurA.
This entry also matches the N-terminal region of the foldase protein PrsA. PrsA plays an important role in protein secretion by helping the post-translocational extracellular folding of several secreted proteins.
The HhH motif is a stretch of approximately 20 amino acids that is present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins. The HhH motif is similar to, but distinct from, the HtH motif. Both of these motifs have two helices connected by a short turn. In the HtH motif, the second helix binds to DNA with the helix in the major groove. This allows contact between specific bases and residues throughout the protein. In the HhH motif, the second helix does not protrude from the surface of the protein and therefore cannot lie in the major groove of the DNA. Crystallographic studies suggest that the interaction of the HhH domain with DNA is mediated by amino acids located in the strongly conserved loop (L-P-G-V) and at the N-terminal end of the second helix. This interaction could involve the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. The structural difference between the HtH and HhH domains is reflected at the functional level: whereas the HtH domain is found primarily in gene regulatory proteins and binds DNA in a sequence-specific manner, the HhH domain is instead found in proteins involved in enzymatic activities and binds DNA with no sequence specificity.
Listerin, also known as RING finger protein 160 or zinc finger protein 294, is the mammalian homologue of yeast Ltn1. It is widely expressed in all tissues, but motor and sensory neurons and neuronal processes in the brainstem and spinal cord are primarily affected in the mutant. Listerin is required for embryonic development and plays an important role in neurodegeneration. It also functions as a critical E3 ligase involved in quality control of nonstop proteins. It mediates ubiquitylation of aberrant proteins that become stalled on ribosomes during translation. Ltn1 works with several cofactors to form a large ribosomal subunit-associated quality control complex (RQC), which mediates the ubiquitylation and extraction of ribosome-stalled nascent polypeptide chains for proteasomal degradation. It appears to first associate with nascent chain-stalled 60S subunits together with two proteins of unknown function, Tae2 and Rqc1.
Listerin contains a long stretch of HEAT (Huntingtin, Elongation factor 3, PR65/A subunit of protein phosphatase 2A, and TOR) or ARM (Armadillo) repeats in the N terminus and middle region, and a catalytic RING-CH finger, also known as vRING or RINGv, with an unusual arrangement of zinc-coordinating residues in the C terminus. Its cysteines and histidines are arranged in a C4HC3-type sequence, rather than the C3H2C3-type arrangement of the canonical RING-H2 finger.
This entry represents the SH3 domain of ZO-1.
The zonula occludens proteins (ZO-1, ZO-2 and ZO-3) are a family of tight junction (TJ)-associated proteins that function as cross-linkers, anchoring the TJ strand proteins to the actin-based cytoskeleton. Each protein contains three PDZ (postsynaptic density, disc-large, ZO-1) domains, a single SH3 (Src Homology-3) domain and a catalytically inactive GK (guanylate kinase) domain, the presence of which identifies them as members of the membrane-associated guanylate kinase (MAGUK) protein family. The signature PDZ-SH3-GuK tandem of MAGUKs may form a structural supramodule, with the three domains interacting with each other to assemble into an integral structural unit. They also share an acidic domain in the C-terminal region that is not found in other MAGUK proteins. It has been demonstrated that the first PDZ domain is involved in binding the C-terminal -Y-V motif of claudins. By contrast, the occludin-binding domain of ZO-1 has been shown to lie in the GK and acidic domains. Although the precise location of the actin-binding motif has not been elucidated, it appears to be within the C-terminal half of the molecules, since transfection of this region into fibroblasts induces co-localisation of ZO-1 and ZO-2 with actin fibres.
This entry represents the SH3 domain of ZO-2.
The zonula occludens proteins (ZO-1, ZO-2 and ZO-3) are a family of tight junction (TJ)-associated proteins that function as cross-linkers, anchoring the TJ strand proteins to the actin-based cytoskeleton. Each protein contains three PDZ (postsynaptic density, disc-large, ZO-1) domains, a single SH3 (Src Homology-3) domain and a catalytically inactive GK (guanylate kinase) domain, the presence of which identifies them as members of the membrane-associated guanylate kinase (MAGUK) protein family. The signature PDZ-SH3-GuK tandem of MAGUKs may form a structural supramodule, with the three domains interacting with each other to assemble into an integral structural unit. They also share an acidic domain in the C-terminal region that is not found in other MAGUK proteins. It has been demonstrated that the first PDZ domain is involved in binding the C-terminal -Y-V motif of claudins. By contrast, the occludin-binding domain of ZO-1 has been shown to lie in the GK and acidic domains. Although the precise location of the actin-binding motif has not been elucidated, it appears to be within the C-terminal half of the molecules, since transfection of this region into fibroblasts induces co-localisation of ZO-1 and ZO-2 with actin fibres.
PLEKHN1 (also known as CLPABP) is a cardiolipin and phosphatidic acid binding protein that associates with microtubules and accumulates in RNA granules, which contain cytochrome-c mRNA. It binds to Bid (a pro-apoptotic protein), removes Bid from transient Bid-Bax complexes, and promotes apoptosis.
This entry represents the PH domain of PLEKHN1. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Fewer than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases and adaptors, as well as in cytoskeletal associated molecules and lipid associated enzymes.
The HhH motif is a stretch of approximately 20 amino acids that is present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins. The HhH motif is similar to, but distinct from, the HtH motif. Both of these motifs have two helices connected by a short turn. In the HtH motif, the second helix binds to DNA with the helix in the major groove. This allows contact between specific bases and residues throughout the protein. In the HhH motif, the second helix does not protrude from the surface of the protein and therefore cannot lie in the major groove of the DNA. Crystallographic studies suggest that the interaction of the HhH domain with DNA is mediated by amino acids located in the strongly conserved loop (L-P-G-V) and at the N-terminal end of the second helix. This interaction could involve the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. The structural difference between the HtH and HhH domains is reflected at the functional level: whereas the HtH domain is found primarily in gene regulatory proteins and binds DNA in a sequence-specific manner, the HhH domain is instead found in proteins involved in enzymatic activities and binds DNA with no sequence specificity.
Acyl carrier protein (ACP) is an essential cofactor in the synthesis of fatty acids by the fatty acid synthetase systems in bacteria and plants. In addition to fatty acid synthesis, ACP is also involved in many other reactions that require acyl transfer steps, such as the synthesis of polyketide antibiotics, biotin precursor and membrane-derived oligosaccharides, and the activation of toxins, and it functions as an essential cofactor in lipoylation of pyruvate and alpha-ketoglutarate dehydrogenase complexes. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes, where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. Phosphopantetheine is attached to a serine residue in these proteins. The core structure of ACP consists of a four-helical bundle, where helix three is shorter than the others.
Several other proteins share structural homology with ACP, such as the bacterial apo-D-alanyl carrier protein, which facilitates the incorporation of D-alanine into lipoteichoic acid by a ligase, necessary for the growth and development of Gram-positive organisms; and the thioester domain of the bacterial peptide carrier protein (PCP) found within large modular non-ribosomal peptide synthetases, which are responsible for the synthesis of a variety of microbial bioactive peptides.
This entry represents the IIIC subfamily of the Haloacid Dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. Subfamily III, which includes subfamilies IIIA and IIIB, contains sequences that lack both of the insert domains: the one between the first and second conserved catalytic motifs (which defines subfamily I) and the one between the second and third (which defines subfamily II). Subfamily IIIC contains five relatively distantly related clades: a family of viral proteins, a family of eukaryotic proteins called MDP-1, a family of archaeal proteins most closely related to MDP-1, a family of bacterial proteins including the Streptomyces FkbH protein, and a small clade including the Pasteurella multocida BcbF and EcbF proteins. The overall lack of species overlap among these clades may indicate a conserved function, but the degree of divergence between the clades and the differences in architecture outside of the domain in some clades warn against such a conclusion.
No member of this subfamily is characterised with respect to function; however, the MDP-1 protein is a characterised phosphatase. All of the characterised enzymes within subfamily III are phosphatases, and all of the active site residues characteristic of HAD-superfamily phosphatases are present in subfamily IIIC.
All known carboxypeptidases are either metallo carboxypeptidases or serine carboxypeptidases. The catalytic activity of the serine carboxypeptidases, like that of the trypsin family serine proteases, is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which is itself hydrogen-bonded to a serine. Proteins known to be serine carboxypeptidases include:
Barley and wheat serine carboxypeptidases I, II, and III.
Yeast carboxypeptidase Y (YSCY) (gene PRC1), a vacuolar protease involved in degrading small peptides.
Yeast KEX1 protease, involved in killer toxin and alpha-factor precursor processing.
Fission yeast sxa2, a probable carboxypeptidase involved in degrading or processing mating pheromones.
Penicillium janthinellum carboxypeptidase S1.
Aspergillus niger carboxypeptidase pepF.
Aspergillus saitoi carboxypeptidase cpdS.
Vertebrate protective protein / cathepsin A, a lysosomal protein which is not only a carboxypeptidase but also essential for the activity of both beta-galactosidase and neuraminidase.
Mosquito vitellogenic carboxypeptidase (VCP).
Naegleria fowleri virulence-related protein Nf314.
Yeast hypothetical protein YBR139w.
Caenorhabditis elegans hypothetical proteins C08H9.1, F13D12.6, F32A5.3, F41C3.5 and K10B2.2.
In higher plants and fungi, serine carboxypeptidases are found in the cell vacuoles. In animal cells, serine carboxypeptidases are found in lysosomes.
The sequences surrounding the active site histidine residue are highly conserved in all these serine carboxypeptidases.
Correct eggshell formation depends on a complex series of events involving the expression, cleavage and transport of various proteins at appropriate times. In Drosophila, the eggshell framework is laid down between the developing oocyte and the overlying follicle cells during late oogenesis. Five distinct layers are observed in Drosophila eggshells: the oocyte proximal vitelline membrane, a lipid wax layer, the inner chorion layer, and the endochorion and exochorion layers.
The inner chorion layer is continuous and characterised by its periodic structure. Genes encoding chorion proteins are expressed from oocyte development stage 11 onwards. Chorion synthesis occurs during the last 5-6 hours of oogenesis and demands the production of large amounts of protein. Amplification of the two chorion gene clusters meets the demand for large-scale protein production, and expression is precisely regulated through tight transcriptional control of the chorion genes. Chorion proteins may be described according to the times at which they are expressed in the follicular cells: developmentally early (s36, s38), middle (s19, s16) or late (s18, s15).
This family consists of several examples of the Drosophila melanogaster specific chorion protein S16. The chorion genes of Drosophila are amplified in response to developmental signals in the follicle cells of the ovary.
Phosphotyrosyl phosphatase activator, C-terminal lid domain
Type:
Homologous_superfamily
Description:
Phosphotyrosyl phosphatase activator (PTPA, also known as protein phosphatase 2A activator) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognised phosphoserine/threonine protein phosphatase activity. PTPA also reactivates the serine/threonine phosphatase activity of an inactive form of PP2A. The specific biological role of PTPA is unknown. PTPA has been suggested to play a role in the insertion of metals into the PP2A catalytic subunit (PP2Ac) active site, to act as a chaperone, and more recently, to have peptidyl prolyl cis/trans isomerase activity that specifically targets human PP2Ac. Together, PTPA and PP2A constitute an ATPase, and it has been suggested that PTPA alters the relative specificity of PP2A from phosphoserine/phosphothreonine substrates to phosphotyrosine substrates in an ATP-hydrolysis-dependent manner. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumour suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1.
PTPA is an α-helical protein which consists of three distinct domains: the core, the linker and the lid. This superfamily consists of the lid domain of PTPA, usually located at the C terminus.
Membrane contact sites (MCSs), regions where two organelles come in close proximity to one another, act as molecular hubs for the exchange of small molecules (e.g. lipids) and signals (e.g. calcium ions). Synaptotagmin-like Mitochondrial lipid-binding Protein (SMP) domains are exclusively found at MCSs between different organelles, such as endoplasmic reticulum (ER)-Mitochondrion, ER-Plasma membrane (PM) and Nucleus-Vacuole junctions. The SMP domain is able to homo- or heterodimerize, harbors lipids in a hydrophobic cavity and mediates lipid transfer between the two adjacent bilayers independently of membrane fusion and fission reactions. SMP proteins are widespread amongst eukaryotic species, with a particular enrichment in plants and features suggestive of species-specific functional variations. SMP domain-containing proteins have been classified into four broad groups: C2 domain synaptotagmin-like, PH domain-containing HT-008, PDZK8 and mitochondrial protein families.
The SMP domain consists of 6 β-strands and 3 helices arranged to form a barrel whose interior is lined almost exclusively by hydrophobic residues. The resulting elongated barrel-shaped cylindrical structure harbors a lateral opening and a central hydrophobic cavity where phospholipids can bind. It dimerizes in an anti-parallel fashion to form a cylinder traversed by a deep hydrophobic groove. The SMP domain belongs to the TULIP (for TUbular LIPid-binding) protein superfamily of lipid transfer proteins.
This entry includes the alpha-protein kinase 1, a serine/threonine-protein kinase that has no homology to conventional protein kinases and phosphorylates amino acids located within α-helices. The N-terminal region has several conserved secondary α-helix structures with amphipathic properties. The structure contains 18 helices (α1 to α18), forming seven antiparallel pairs. The seven helix pairs pack one by one to form a right-handed solenoid, similar to the tetratricopeptide repeat (TPR) domain structure.
This kinase detects bacterial pathogen-associated molecular pattern metabolites (PAMPs) and initiates an innate immune response, a critical step for pathogen elimination and engagement of adaptive immunity. It specifically recognises and binds ADP-D-glycero-beta-D-manno-heptose (ADP-Heptose), a potent PAMP present in all Gram-negative and some Gram-positive bacteria. ADP-Heptose binds to a narrow pocket on the concave side of the alpha-protein kinase 1 N-terminal region, which stimulates its kinase activity to phosphorylate and activate TIFA, triggering proinflammatory NF-kappa-B signalling. In addition, alpha-protein kinase 1 may play a role in regulating intracellular trafficking processes through the phosphorylation of myosin MYO1A. It has been shown that this kinase is also involved in monosodium urate monohydrate (MSU)-induced inflammation and gout through phosphorylation of a myosin motor protein, which enables the vesicle transport of certain cytokines (like TNF-alpha) to the plasma membrane.
This entry represents the PX domain found in Sorting nexin-3 (SNX3) from vertebrates. In budding yeasts, the Snx3 homologue has been shown to associate with early endosomes through a PX domain-mediated interaction with phosphatidylinositol-3-phosphate (PI3P). It associates with the retromer complex, a membrane coat multimeric complex required for endosomal retrieval of lysosomal hydrolase receptors to the Golgi, and functions as a cargo-specific adaptor for the retromer. SNX3 is required for the formation of multivesicular bodies, which function as transport intermediates to late endosomes. It also promotes cell surface expression of the amiloride-sensitive epithelial Na+ channel (ENaC), which is critical in sodium homeostasis and maintenance of extracellular fluid volume.
The Phox Homology (PX) domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain-containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and in the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.
Type III secretion system substrate exporter, C-terminal
Type:
Homologous_superfamily
Description:
Salmonella and related proteobacteria secrete large amounts of proteins into the culture media. The major secreted proteins are either flagellar proteins or virulence factors, secreted through the flagellar or virulence export structures respectively. Both secretion systems penetrate the inner and outer membranes, and their components bear substantial sequence similarity. The flagellum and the needle-like pilus also look fairly similar to each other. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. It is believed that the family of type III flagellar and pilus inner membrane proteins are used as structural moieties in a complex with several other subunits. One such set of inner membrane proteins, labeled "S" here for nomenclature purposes, includes the Salmonella and Shigella SpaS, Yersinia YscU, Rhizobium Y4YO, Erwinia HrcU, Salmonella FlhB and Escherichia coli EscU proteins.
This superfamily represents the C-terminal domain of the type III secretion system substrate exporters. Many of the proteins containing this domain undergo autocatalytic cleavage promoted by cyclisation of a conserved asparagine.
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Yersinia spp. secrete effector proteins called YopB and YopD that facilitate the spread of other translocated proteins through the type III needle and the host cell cytoplasm. In turn, the transcription of these moieties is thought to be regulated by another gene, lcrV, found on the Yops virulon that encodes the entire type III system. The product of this gene, LcrV protein, also regulates the secretion of YopD through the type III translocon, and itself acts as a protective "V" antigen for Yersinia pestis, the causative agent of plague.
Recently, a homologue of the Y. pestis LcrV protein (PcrV) was found in Pseudomonas aeruginosa, an opportunistic pathogen. In vivo studies using mice found that immunisation with the protein protected burned animals from infection by P. aeruginosa, and enhanced survival. In addition, it is speculated that PcrV determines the size of the needle pore for type III secreted effectors.
This group of proteins belongs to the cysteine peptidase family C1, sub-family C1A (papain family, clan CA). The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, carboxypeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in bacteria, archaea, fungi, and practically all protozoa, plants and mammals, and some viruses such as baculoviruses. The proteins are typically lysosomal or secreted. The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159. Most papain-like cysteine peptidases are irreversibly inhibited by the synthetic inhibitor E64. Leupeptin is a reversible inhibitor but is also an inhibitor of chymotrypsin-like serine peptidases.
A papain-like cysteine proteinase is typically synthesised as an inactive precursor (or zymogen) with an N-terminal propeptide. Activation requires removal of the propeptide. The propeptide is required for the proper folding of the newly synthesised enzyme, for maintaining the peptidase in an inactive state and for stabilising the enzyme against denaturation at neutral to alkaline pH. Amino acid residues within the pro-region mediate membrane association of the proenzyme, and play a role in its transport to lysosomes. A propeptide can exhibit high selectivity for inhibition of the peptidase from which it originates.
The subfamily includes the following well characterised peptidases:
Animal lysosomal peptidases such as cathepsins B (EC 3.4.22.1), L (EC 3.4.22.15), H (EC 3.4.22.16), S (EC 3.4.22.27), K (EC 3.4.22.38), F (EC 3.4.22.41), O (EC 3.4.22.42), V (EC 3.4.22.43) and X (a carboxypeptidase, EC 3.4.18.1).
Plant peptidases such as papain (EC 3.4.22.2), ficin (EC 3.4.22.3), chymopapain (EC 3.4.22.6), asclepain A (EC 3.4.22.7), actinidin (EC 3.4.22.14), glycyl endopeptidase (EC 3.4.22.25), caricain (EC 3.4.22.30), ananain (EC 3.4.22.31), stem bromelain (EC 3.4.22.32) and fruit bromelain (EC 3.4.22.33).
Protozoan peptidases such as histolysain (EC 3.4.22.35) and cruzipain (EC 3.4.22.51).
Viral peptidases such as V-cath (EC 3.4.22.50).
There are also proteins in the family that are not peptidases because one or more of the active site residues is not conserved. These include testin, tubulointerstitial nephritis antigen and silicatein.
A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase), an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family.
Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases.
Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.
Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.
Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad.
Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
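The residue orders given for clans CA, CD, CE and CL can be collected into a small lookup table. The following is a minimal sketch only: the dictionary layout and the helper names `cys_precedes_his` and `nucleophile_position` are illustrative assumptions, not part of any database API, and only the four clans whose residue order is stated in the text are included.

```python
# Catalytic residue order per clan, transcribed from the clan descriptions.
# For clan CA only the triad is listed; the text notes that a fourth residue
# (usually Gln) precedes the active site Cys.
CLAN_ACTIVE_SITE = {
    "CA": {"fold": "papain-like", "order": ("Cys", "His", "Asn/Asp")},
    "CD": {"fold": "caspase-like", "order": ("His", "Cys")},
    "CE": {"fold": "adenain-like", "order": ("His", "Asn", "Gln", "Cys")},
    "CL": {"fold": "sortase B-like", "order": ("His", "Cys")},
}


def nucleophile_position(clan: str) -> int:
    """Return the 0-based index of the catalytic Cys in the residue order."""
    return CLAN_ACTIVE_SITE[clan]["order"].index("Cys")


def cys_precedes_his(clan: str) -> bool:
    """True if the catalytic Cys comes before the His in the sequence order."""
    order = CLAN_ACTIVE_SITE[clan]["order"]
    return order.index("Cys") < order.index("His")


if __name__ == "__main__":
    for clan, info in CLAN_ACTIVE_SITE.items():
        print(clan, info["fold"], "/".join(info["order"]))
```

The reversed position of Cys relative to His in clans CA and CE is what the circular-permutation hypothesis in the text rests on; `cys_precedes_his` makes that contrast explicit.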
Arteriviruses are enveloped, positive-stranded RNA viruses and include pathogens of major economic concern to the swine- and horse-breeding industries: equine arteritis virus (EAV), porcine reproductive and respiratory syndrome virus (PRRSV), mouse lactate dehydrogenase-elevating virus and simian hemorrhagic fever virus.
The arterivirus cysteine protease (AV CP) is the most carboxyl-terminally located member of the array of three cysteine proteinase domains present in the amino-terminal 500 residues of the replicase polyproteins. The AV CP is located in the amino-terminal region of nsp2 and is highly conserved among arteriviruses. The cleavage of the nsp2|3 junction appears to be the single processing step mediated by the AV CP. For EAV, it has been shown that cleaved nsp2 is an essential co-factor for cleavage of the nsp4|6 site by the nsp4 proteinase domain. The AV CP is an unusual Cys protease with amino acid sequence similarities to both papain-like and chymotrypsin-like proteases. The catalytic dyad is composed of Cys and His residues. The AV CP domain forms MEROPS peptidase family C33.
The entire AV CP domain is highly conserved among arteriviruses. Among the conserved residues are a number of cysteines and one aspartate residue.
A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family.
Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
The group of sequences defined by this cysteine peptidase domain belongs to MEROPS peptidase family C39 (clan CA). The domain is found in a wide range of ABC transporters, which are maturation proteases for peptide bacteriocins, the proteolytic domain residing in the N-terminal region of the protein. A number of the proteins are classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
Lantibiotic and non-lantibiotic bacteriocins are synthesised as precursor peptides containing N-terminal extensions (leader peptides) which are cleaved off during maturation. Most non-lantibiotics and also some lantibiotics have leader peptides of the so-called double-glycine type. These leader peptides share consensus sequences and also a common processing site with two conserved glycine residues in positions -1 and -2. The double-glycine-type leader peptides are unrelated to the N-terminal signal sequences which direct proteins across the cytoplasmic membrane via the sec pathway. Their processing sites are also different from typical signal peptidase cleavage sites, suggesting that a different processing enzyme is involved.
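The double-glycine processing site described for bacteriocin leader peptides (two conserved glycines at positions -2 and -1 relative to the cleavage point) can be illustrated with a short sketch. This is not how the C39 peptidase domain recognises its substrate, and real leader peptides share further consensus positions that are ignored here. The function name, the `max_leader` search window and the example precursor sequence are all hypothetical.

```python
from typing import Optional, Tuple


def split_double_glycine_leader(
    precursor: str, max_leader: int = 30
) -> Optional[Tuple[str, str]]:
    """Split a precursor peptide after the first Gly-Gly pair.

    Only the first `max_leader` residues are searched (an assumed window,
    not a value from the source). Returns (leader, mature) or None if no
    Gly-Gly pair is found in that window.
    """
    window = precursor[:max_leader].upper()
    idx = window.find("GG")
    if idx == -1:
        return None
    cut = idx + 2  # the two glycines occupy positions -2 and -1
    return precursor[:cut], precursor[cut:]


if __name__ == "__main__":
    # Hypothetical precursor, made up for illustration only.
    example = "MKELSVEELQNISGGKYYGNGVHCT"
    print(split_double_glycine_leader(example))
```

Taking the first Gly-Gly pair is a simplification; a more careful approach would score the surrounding consensus residues before choosing a cleavage site.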
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ), specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification.
A number of bacterial transport systems have been found to contain integral membrane components that have similar sequences: these systems fit the characteristics of ATP-binding cassette transporters. The proteins form homo- or hetero-oligomeric channels, allowing ATP-mediated transport. Hydropathy analysis of the proteins has revealed the presence of 6 possible transmembrane regions. These proteins belong to family 2 of ABC transporters.
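The conserved motifs named in the ABC module description lend themselves to a simple pattern scan. The sketch below is illustrative only. The signature pattern `LSGGQ` comes from the text; the Walker A pattern `G-x(4)-G-K-[ST]` is the commonly quoted consensus and is an assumption here, since the text does not spell it out. The function name and the test sequence are hypothetical, and real entries are matched with profile-based models rather than plain regular expressions.

```python
import re
from typing import Dict, List

# Signature motif as given in the text.
SIGNATURE = re.compile(r"LSGGQ")
# Commonly quoted Walker A (P-loop) consensus; an assumption, not from the text.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_abc_motifs(seq: str) -> Dict[str, List[int]]:
    """Return 0-based start positions of each motif in a protein sequence."""
    seq = seq.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "signature": [m.start() for m in SIGNATURE.finditer(seq)],
    }


if __name__ == "__main__":
    # Hypothetical 30-residue fragment, made up for illustration only.
    fragment = "MAAGPSGSGKSTLLAAAQQQLSGGQKQRVA"
    print(find_abc_motifs(fragment))
```

Because the Q-motif is described as lying between Walker A and the signature, a follow-up check could require the Walker A hit to precede the signature hit before a sequence is treated as a candidate ABC module.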
This model describes the ATP-binding subunits of nitrate transport in bacteria and archaea. This protein belongs to the ATP-binding cassette (ABC) superfamily. It is thought that the two subunits encoded by ntrC and ntrD form the binding surface for interaction with ATP. This model is restricted to identifying the ATP-binding subunit associated with nitrate transport. Nitrate assimilation is aided by other proteins derived from the operon, which include, among others, the products of ntrA (a regulatory protein), ntrB (a hydrophobic transmembrane permease) and narB (a reductase).
This entry contains the ATP-binding subunit, TagH, of the ABC transporter complex involved in the export of teichoic acids. The Bacillus subtilis complex is composed of two ATP-binding proteins (TagH) and two transmembrane proteins (TagG). Proteins containing this domain also include KpsT, which is involved in the export of capsular polysialic acid in Escherichia coli K1.
This entry represents the ATP-binding domain of MacB, an ABC transporter involved in the export of macrolide antibiotics. It is also found in proteins involved in cell division (FtsE) and release of lipoproteins from the cytoplasmic membrane (LolCDE). FtsE null mutants showed filamentous growth and appeared viable on high salt medium only, indicating a role for FtsE in cell division and/or salt transport. The LolCDE complex catalyzes the release of lipoproteins from the cytoplasmic membrane prior to their targeting to the outer membrane.
LolD is part of the LolCDE complex. LolCDE is an ATP-binding cassette (ABC) transporter that releases lipoproteins from the inner membrane of Escherichia coli, thereby initiating lipoprotein sorting to the outer membrane. The LolCDE complex is composed of two copies of an ATPase subunit, LolD, and one copy each of integral membrane subunits LolC and LolE. LolD hydrolyses ATP on the cytoplasmic side of the inner membrane, while LolC and/or LolE recognise and release lipoproteins anchored to the periplasmic leaflet of the inner membrane.
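The subunit compositions quoted for the TagGH and LolCDE complexes can be checked against the general rule that a functional ABC transporter has four domains, two ABC modules and two TMDs. The sketch below is a toy model. The `Subunit` class and `domain_totals` function are hypothetical names, and treating each integral membrane subunit (TagG, LolC, LolE) as contributing exactly one TMD is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Subunit:
    name: str
    abc_modules: int  # ABC (ATP-binding) modules per polypeptide
    tmds: int         # transmembrane domains per polypeptide (assumed)
    copies: int       # copies of this polypeptide in the complex


def domain_totals(subunits: Iterable[Subunit]) -> Tuple[int, int]:
    """Return (total ABC modules, total TMDs) for a complex."""
    subunits = list(subunits)
    abc = sum(s.abc_modules * s.copies for s in subunits)
    tmd = sum(s.tmds * s.copies for s in subunits)
    return abc, tmd


# Compositions as stated in the entry descriptions.
TAG_COMPLEX = [Subunit("TagH", 1, 0, 2), Subunit("TagG", 0, 1, 2)]
LOL_COMPLEX = [
    Subunit("LolD", 1, 0, 2),
    Subunit("LolC", 0, 1, 1),
    Subunit("LolE", 0, 1, 1),
]

if __name__ == "__main__":
    for label, cx in (("TagGH", TAG_COMPLEX), ("LolCDE", LOL_COMPLEX)):
        print(label, domain_totals(cx))
```

Under these assumptions both complexes come out at two ABC modules and two TMDs. LolCDE reaches that total with two different membrane subunits, whereas the teichoic acid exporter uses two copies of the same one.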
Zinc finger, large T-antigen D1 domain superfamily
Type:
Homologous_superfamily
Description:
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the zinc finger domain superfamily found in the large T-antigen (T-Ag) as the D1 domain. The T-Ag is found in a group of polyomaviruses consisting of the homonymous murine virus (Py) as well as other representative members such as the Simian virus 40 (SV40) and the human BK polyomavirus (BKPyV) and JC polyomavirus (JCPyV) [
]. T-antigen and replication protein E1 share the same domain architecture and functionality despite low sequence similarity []. Their large T antigen (T-Ag) protein binds to and activates DNA replication from the origin of DNA replication (ori). Insofar as is known, the T-Ag binds to the origin first as a monomer to its pentanucleotide recognition element. The monomers are then thought to assemble into hexamers and double hexamers, which constitute the form that is active in initiation of DNA replication. When bound to the ori, T-Ag double hexamers encircle DNA []. T-Ag is a multi-domain protein that contains an N-terminal J domain, a central origin-binding domain (OBD), and a C-terminal superfamily 3 helicase domain []. The helicase domain actually contains three distinct structural domains: D1 (domain 1), D2 and D3. D1 is the Zn domain at the N terminus and contains five α-helices (α1-α5). The Zn atom coordinated by a Zn motif is important in holding α3 (α-helix 3) and α4 together, which in turn provide an anchor for α1 and α2. The beginning of α5 packs with α1 and α2 of D1, but its C terminus extends to α6 of D3. The D2 domain contains three conserved helicase motifs related to SF3 helicases, namely the modified version of Walker A and B motifs and motif C. D2 folds into a core β-sheet consisting of five parallel β-strands sandwiched by α-helices. The third domain, D3, is all α-helical. Its seven α-helices originate from both the N-terminal region (α6-α8) and the C terminus (α13-α16), with D2 inserted between []. The Zn motif of T-Ag was proposed to form a canonical zinc-finger structure for DNA binding. However, the Zn domain (D1) has a globular fold stabilised by the coordination of a Zn atom through the Zn motif, and no classical zinc-finger structure specialised for DNA binding is present. The Zn motif is not directly involved in binding DNA but is instead important for stabilising the Zn-domain structure [
].
This entry represents the cysteine proteinase of hepatitis E virus (HEV), which is a papain-like protease that cleaves the viral polyprotein encoded by ORF1 of the hepatitis E virus [
,
,
,
]. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase), an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [
]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases []. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues []. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel []. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Lysophospholipids (LPs), such as lysophosphatidic acid (LPA), sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine (SPC), have long been known to act as signalling molecules in addition to their roles as intermediates in membrane biosynthesis []. They have roles in the regulation of cell growth, differentiation, apoptosis and development, and have been implicated in a wide range of pathophysiological conditions, including blood clotting, corneal wounding, subarachnoid haemorrhage, inflammation and colitis []. A number of G protein-coupled receptors bind members of the lysophospholipid family. These include: the cannabinoid receptors; platelet activating factor receptor; OGR1, an SPC receptor identified in ovarian cancer cell lines; PSP24, an orphan receptor that has been proposed to bind LPA; and at least 8 closely related receptors, the EDG family, that bind LPA and S1P []. S1P is released from activated platelets and is also produced by a number of other cell types in response to growth factors and cytokines [
]. It is proposed to act both as an extracellular mediator and as an intracellular second messenger. The cellular effects of S1P include growth-related effects, such as proliferation, differentiation, cell survival and apoptosis, and cytoskeletal effects, such as chemotaxis, aggregation, adhesion, morphological change and secretion. The molecule has been implicated in control of angiogenesis, inflammation, heart rate and tumour progression, and may play an important role in a number of disease states, such as atherosclerosis, and breast and ovarian cancer [
]. Recently, five G protein-coupled receptors have been identified that act as high affinity receptors for S1P, and also as low affinity receptors for the related lysophospholipid, SPC []. EDG-1, EDG-3, EDG-5 and EDG-8 share a high degree of similarity, and are also referred to as lpB1, lpB3, lpB2 and lpB4, respectively. EDG-6 is referred to as lpC1, reflecting its more distant relationship to the other S1P receptors. EDG-6 is expressed predominantly in lymphoid and haematopoietic tissues and in lung, a distribution that is quite restricted relative to other EDG family members []. Binding of S1P to the receptor leads to activation of phospholipase C and MAP kinases in a pertussis toxin-sensitive manner, through coupling to proteins of the Gi class. Whether EDG-6 can couple to any other G protein families is currently not known [].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Lysophospholipids (LPs), such as lysophosphatidic acid (LPA), sphingosine 1-phosphate (S1P) and sphingosylphosphorylcholine (SPC), have long been known to act as signalling molecules in addition to their roles as intermediates in membrane biosynthesis []. They have roles in the regulation of cell growth, differentiation, apoptosis and development, and have been implicated in a wide range of pathophysiological conditions, including blood clotting, corneal wounding, subarachnoid haemorrhage, inflammation and colitis []. A number of G protein-coupled receptors bind members of the lysophospholipid family. These include: the cannabinoid receptors; platelet activating factor receptor; OGR1, an SPC receptor identified in ovarian cancer cell lines; PSP24, an orphan receptor that has been proposed to bind LPA; and at least 8 closely related receptors, the EDG family, that bind LPA and S1P []. S1P is released from activated platelets and is also produced by a number of other cell types in response to growth factors and cytokines [
]. It is proposed to act both as an extracellular mediator and as an intracellular second messenger. The cellular effects of S1P include growth-related effects, such as proliferation, differentiation, cell survival and apoptosis, and cytoskeletal effects, such as chemotaxis, aggregation, adhesion, morphological change and secretion. The molecule has been implicated in control of angiogenesis, inflammation, heart rate and tumour progression, and may play an important role in a number of disease states, such as atherosclerosis, and breast and ovarian cancer [
]. Recently, five G protein-coupled receptors have been identified that act as high affinity receptors for S1P, and also as low affinity receptors for the related lysophospholipid, SPC []. EDG-1, EDG-3, EDG-5 and EDG-8 share a high degree of similarity, and are also referred to as lpB1, lpB3, lpB2 and lpB4, respectively. EDG-6 is referred to as lpC1, reflecting its more distant relationship to the other S1P receptors. This entry represents EDG-1 (Sphingosine 1-phosphate receptor 1, also known as Endothelial differentiation G-protein coupled receptor 1) [
]. EDG-1 is expressed widely, with highest levels in the brain, heart, lung, liver and spleen. Moderate levels are also found in the thymus, kidney and muscle []. Within these regions, EDG-1 is expressed in endothelial cells, vascular smooth muscle, fibroblasts, melanocytes and cells of epithelioid origin []. Upon binding of S1P, the receptor can couple to Gi1, Gi2, Gi3, Go and Gz type G proteins, leading to inhibition of adenylyl cyclase, phospholipase C activation and MAP kinase activation [,
].
Potassium channels are the most diverse group of the ion channel family [
,
]. They are important in shaping the action potential, and in neuronal excitability and plasticity []. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups []: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers []. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes []. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis []. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) []. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
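The K+ selectivity sequence quoted above (T/SxxTxGxG) can be read as a simple residue pattern. The following is a minimal sketch, assuming single-letter amino acid sequences and reading "x" as any residue; the function name and the toy fragment are illustrative assumptions, not part of the entry, and a pattern hit alone does not establish that a protein is a K+ channel.

```python
import re

# Regex reading of T/SxxTxGxG: T or S, any two residues, T,
# any residue, G, any residue, G. The lookahead lets overlapping
# occurrences be reported as well.
SELECTIVITY_MOTIF = re.compile(r"(?=([TS]..T.G.G))")


def find_selectivity_motifs(sequence):
    """Return (0-based start, matched residues) for each motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in SELECTIVITY_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical pore-loop-like fragment used purely for illustration.
    toy_fragment = "WWSVETATTVGYGDLYPV"
    for start, residues in find_selectivity_motifs(toy_fragment):
        print(start, residues)
```

A scan like this is only a first-pass filter; curated signatures use profile-based methods rather than a bare pattern.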
Inwardly-rectifying potassium channels (Kir) are the principal class of two-TM domain potassium channels. They are characterised by the property of inward-rectification, which is described as the ability to allow large inward currents and smaller outward currents. Inwardly rectifying potassium channels (Kir) are responsible for regulating diverse processes including: cellular excitability, vascular tone, heart rate, renal salt flow, and insulin release [
]. To date, around twenty members of this superfamily have been cloned, which can be grouped into six families by sequence similarity, and these are designated Kir1.x-6.x [,
]. Cloned Kir channel cDNAs encode proteins of ~370-500 residues; both N- and C-termini are thought to be cytoplasmic, and the N terminus lacks a signal sequence. Kir channel alpha subunits possess only 2TM domains linked with a P-domain. Thus, Kir channels share similarity with the fifth and sixth domains, and P-domain of the other families. It is thought that four Kir subunits assemble to form a tetrameric channel complex, which may be hetero- or homomeric []. The Kir3.x channel family is gated by G-proteins following G-protein coupled receptor (GPCR) activation. They are widely distributed in neuronal, atrial, and endocrine tissues and play key roles in generating late inhibitory postsynaptic potentials, slowing the heart rate and modulating hormone release. They are directly activated by G-protein beta-gamma subunits released from G-protein heterotrimers of the G(i/o) family upon appropriate receptor stimulation. Kir3.1 channels are thought to form heteromers in vivo: in heart, consisting of Kir3.1 and Kir3.4, and in brain, Kir3.1 with Kir3.2.
Potassium channels are the most diverse group of the ion channel family [
,
]. They are important in shaping the action potential, and in neuronal excitability and plasticity []. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups []: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers []. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes []. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis []. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) []. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
Inwardly-rectifying potassium channels (Kir) are the principal class of two-TM domain potassium channels. They are characterised by the property of inward-rectification, which is described as the ability to allow large inward currents and smaller outward currents. Inwardly rectifying potassium channels (Kir) are responsible for regulating diverse processes including: cellular excitability, vascular tone, heart rate, renal salt flow, and insulin release [
]. To date, around twenty members of this superfamily have been cloned, which can be grouped into six families by sequence similarity, and these are designated Kir1.x-6.x [,
]. Cloned Kir channel cDNAs encode proteins of ~370-500 residues; both N- and C-termini are thought to be cytoplasmic, and the N terminus lacks a signal sequence. Kir channel alpha subunits possess only 2TM domains linked with a P-domain. Thus, Kir channels share similarity with the fifth and sixth domains, and P-domain of the other families. It is thought that four Kir subunits assemble to form a tetrameric channel complex, which may be hetero- or homomeric []. The Kir3.x channel family is gated by G-proteins following G-protein coupled receptor (GPCR) activation. They are widely distributed in neuronal, atrial, and endocrine tissues and play key roles in generating late inhibitory postsynaptic potentials, slowing the heart rate and modulating hormone release. They are directly activated by G-protein beta-gamma subunits released from G-protein heterotrimers of the G(i/o) family upon appropriate receptor stimulation. Kir3.3 does not generate receptor-evoked or constitutively active K+ currents when heterologously expressed in Xenopus oocytes. It may therefore contribute to Kir channel diversity by associating with other Kir3.x family members [].
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short β hairpin and an α helix (β/β/α structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2 [
]. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes []. Transcription factors usually contain several Znf's (each with a conserved β/β/α structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short α-helix in the Znf, the Znf spanning 3-4 bases of the DNA []. C2H2 Znf's can also bind to RNA and protein targets []. A specific C2H2 Zn-finger is conserved in matrin and several RNA-binding proteins. The Zn-finger follows the general pattern C-x2-C-x(12,16)-H-x5-H, and is different from the 'classical' DNA-binding C2H2 Zn-finger.
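The general pattern C-x2-C-x(12,16)-H-x5-H can be translated mechanically into a regular expression. The sketch below is a minimal illustration that handles only the elements appearing in that pattern (single residues, `x`, `xN` and `x(N,M)`); the function name is hypothetical and the test sequence is synthetic, not a real matrin sequence.

```python
import re

# One pattern element of the form x, xN, x(N) or x(N,M).
_GAP = re.compile(r"x(?:\((\d+)(?:,(\d+))?\)|(\d+))?")


def pattern_to_regex(pattern):
    """Convert a dash-separated residue pattern into a regex string."""
    parts = []
    for element in pattern.split("-"):
        gap = _GAP.fullmatch(element)
        if gap:
            low, high, bare = gap.groups()
            if bare:                      # e.g. x2
                parts.append(".{%s}" % bare)
            elif low and high:            # e.g. x(12,16)
                parts.append(".{%s,%s}" % (low, high))
            elif low:                     # e.g. x(5)
                parts.append(".{%s}" % low)
            else:                         # plain x
                parts.append(".")
        elif re.fullmatch(r"[A-Z]", element):
            parts.append(element)         # fixed residue such as C or H
        else:
            raise ValueError("unsupported pattern element: %r" % element)
    return "".join(parts)


if __name__ == "__main__":
    regex = pattern_to_regex("C-x2-C-x(12,16)-H-x5-H")
    print(regex)
    # Synthetic sequence with a 12-residue spacer between the second C and first H.
    synthetic = "C" + "AA" + "C" + "A" * 12 + "H" + "A" * 5 + "H"
    print(bool(re.search(regex, synthetic)))
```

The spacer range is what distinguishes this pattern from shorter classical C2H2 spacings, so a sequence with only 11 residues between the second cysteine and first histidine would not match.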
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes [
]. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base [
]. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [,
]. This group of serine peptidases belongs to MEROPS peptidase family S49 (protease IV family, clan S-). The predicted active site serine for members of this family occurs in a transmembrane domain. Signal peptides of secretory proteins seem to serve at least two important biological functions. First, they are required for protein targeting to and translocation across membranes, such as the eubacterial plasma membrane and the endoplasmic reticular membrane of eukaryotes. Second, in addition to their role as determinants for protein targeting and translocation, certain signal peptides have a signalling function. During or shortly after pre-protein translocation, the signal peptide is removed by signal peptidases. The integral membrane protein, SppA (protease IV), of Escherichia coli was shown experimentally to degrade signal peptides. The member of this family from Bacillus subtilis has only been shown to be required for efficient processing of pre-proteins under conditions of hyper-secretion []. These enzymes have a molecular mass around 67 kDa and a duplication such that the N-terminal half shares extensive homology with the C-terminal half; the E. coli enzyme was shown to form homotetramers. E. coli SohB, which is most closely homologous to the C-terminal duplication of SppA, is predicted to perform a similar function of small peptide degradation, but in the periplasm. Many prokaryotes have a single SppA/SohB homologue that may perform the function of either or both.
Low-density lipoprotein (LDL) receptor class A repeat
Type:
Repeat
Description:
The low-density lipoprotein receptor (LDLR) is the receptor for LDL, the major cholesterol-carrying lipoprotein of plasma, and acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface []. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein containing five domains. The first, the ligand-binding domain, contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation []. Similar domains have been found in other extracellular and membrane proteins []. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a β-propeller structure []. This region is critical for ligand release and recycling of the receptor []. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits. LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8/apolipoprotein E receptor 2; these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins []. This entry represents the LDLR class A (cysteine-rich) repeat, which contains 6 disulphide-bound cysteines and a highly conserved cluster of negatively charged amino acids, of which many are clustered on one face of the module []. In LDL receptors, the class A domains form the binding site for LDL and calcium. The acidic residues between the fourth and sixth cysteines are important for high-affinity binding of positively charged sequences in LDLR's ligands. The repeat consists of a β-hairpin structure followed by a series of β-turns. In the absence of calcium, LDL-A domains are unstructured; the bound calcium ion imparts structural integrity. Following these repeats is a 350-residue domain that resembles part of the epidermal growth factor (EGF) precursor. Numerous familial hypercholesterolemia mutations of the LDL receptor alter the calcium-coordinating residues of LDL-A domains or other crucial scaffolding residues.
SETD3 is a protein-histidine N-methyltransferase that specifically mediates methylation of actin at 'His-73' [
]. It was initially reported to have histone methyltransferase activity and to methylate 'Lys-4' and 'Lys-36' of histone H3 (H3K4me and H3K36me). However, this conclusion was based on mass spectrometry data in which the mass shifts were inconsistent with a bona fide methylation event. In vitro, the protein-lysine methyltransferase activity is weak compared to the protein-histidine methyltransferase activity []. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group, most frequently from S-adenosyl-L-methionine (SAM or AdoMet), to a nucleophilic acceptor such as oxygen, yielding S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule [,
,
]. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM []. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids [,
,
]. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [,
,
]. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor. A review published in 2003 [
] divides all methyltransferases into 5 classes based on the structure of their catalytic domain (fold):
class I: Rossmann-like (alpha/beta)
class II: TIM beta/α-barrel (alpha/beta)
class III: tetrapyrrole methylase (alpha/beta)
class IV: SPOUT (alpha/beta)
class V: SET domain (all beta)
A more recent paper [
] based on a study of the Saccharomyces cerevisiae methyltransferome argues for four more folds:
class VI: transmembrane (all alpha)
class VII: DNA/RNA-binding 3-helical bundle (all alpha)
class VIII: SSo0622-like (alpha beta)
class IX: thymidylate synthetase (alpha beta)
Class V proteins contain the SET domain, usually flanked by other domains forming the so-called pre- and post-SET regions. Except for the members of the SETD3 family, which N-methylate histidine in beta-actin (EC 2.1.1.85) [,
], enzymes belonging to this class N-methylate lysine in proteins. Most of them are histone methyltransferases (EC 2.1.1.43), like the histone H3-K9 methyltransferase dim-5 or the histone H3-K4 methyltransferase SETD7 [
,
]. Some others methylate the large subunit of the enzyme ribulose-bisphosphate-carboxylase/oxygenase (RuBisCO) (EC 2.1.1.127) in plants; in these enzymes the SET domain is interrupted by a novel domain [
]. Cytochrome c lysine N-methyltransferases (EC 2.1.1.59) do not possess a SET domain, or at least not one detected by any of the detection methods; however, they do display a SET-like region and for this reason they are also assigned to this class [
].
This entry represents a six transmembrane helix rhomboid domain. This entry also includes derlins, inactive members of the rhomboid family of intramembrane proteases which lack an active site Ser-His dyad but retain the overall rhomboid architecture [
]. This domain is found in serine peptidases belonging to the MEROPS peptidase family S54 (Rhomboid, clan ST). They are integral membrane proteins related to the Drosophila melanogaster (fruit fly) rhomboid protein. Members of this family are found in archaea, bacteria and eukaryotes.
The rhomboid protease cleaves type-1 transmembrane domains using a catalytic dyad composed of serine and histidine. The active site is embedded within the membrane and the active site residues are on different transmembrane regions. From the tertiary structure of the Escherichia coli homologue GlpG [
] it was shown that hydrolysis occurs in a fluid filled cavity within the membrane. Initially, a catalytic triad including a highly conserved asparagine had been proposed, but this residue has been shown not to be essential []. Drosophila rhomboid cleaves the transmembrane proteins Spitz, Gurken and Keren within their transmembrane domains to release a soluble TGFalpha-like growth factor. Cleavage occurs in the Golgi, following translocation of the substrates from the endoplasmic reticulum membrane by Star, another transmembrane protein. The growth factors are then able to activate the epidermal growth factor receptor [
,
]. Few substrates of mammalian rhomboid homologues have been determined, but rhomboid-like protein 2 has been shown to cleave ephrin B3 [
]. Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite. Invasion of host cells first requires their recognition, which is achieved by parasite transmembrane adhesins interacting with host cell receptors. Before the parasite can enter a host cell, the adhesins must be released by cleavage. In Toxoplasma, rhomboid TgROM5 cleaves the adhesins, and in Plasmodium, which lacks a TgROM5 orthologue, PfROMs 1 and 4 cleave the diverse array of malaria parasite adhesins []. This entry also includes catalytically inactive rhomboid protease homologues, iRhom1/2, which are metazoan-specific and play crucial roles within the secretory pathway, including protein degradation, trafficking regulation, and inflammatory signaling [
]. They regulate ADAM17 protease, acting as trafficking factors that escort ADAM17 from the ER to the later secretory pathway. They are required for the cleavage and release of a variety of membrane-associated proteins [,
]. iRhoms have been linked to the development and progression of several autoimmune diseases, including rheumatoid arthritis, lupus nephritis and hemophilic arthropathy [], and to neurological disorders such as Alzheimer's and Parkinson's diseases, as well as inflammation, cancer and skin diseases [].
A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome [
,
]. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties: they are nucleolar; they are able to coimmunoprecipitate with the U3 snoRNA and Mpp10 (a protein specific to the SSU processome); and they are required for 18S rRNA biogenesis. There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA- and rRNA-binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble [
]. From electron microscopy the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA. This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3 [
].
A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome [
,
]. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties: they are nucleolar; they are able to coimmunoprecipitate with the U3 snoRNA and Mpp10 (a protein specific to the SSU processome); and they are required for 18S rRNA biogenesis. There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA- and rRNA-binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble [
]. From electron microscopy the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA. This entry contains Utp14, a large ribonuclear protein associated with snoRNA U3 [
].
Assembly of a robust microtubule-based mitotic spindle is essential for accurate segregation of chromosomes to progeny [
]. Spindle assembly relies on the concerted action of centrosomes, spindle microtubules, molecular motors and non-motor spindle proteins []. A number of novel regulators of spindle assembly have been identified: one of these is HAUS, an 8-subunit protein complex that shares similarity with Drosophila Augmin [,
]. Plants also have an augmin complex consisting of eight subunits. Subunits AUG1 to AUG6 can be aligned with the human HAUS1 to HAUS6 proteins []. HAUS localises to interphase centrosomes and to mitotic spindle microtubules; its disruption induces microtubule-dependent fragmentation of centrosomes, and an increase in centrosome size. HAUS disruption results in the destabilisation of kinetochore microtubules and eventual formation of multipolar spindles. Such severe mitotic defects are alleviated by co-depletion of NuMA, indicating that both factors regulate opposing activities. HAUS disruption alters NuMA localisation, suggesting that mis-localised NuMA activity contributes to the observed spindle and centrosome defects. The Augmin complex (HAUS) is thus a critical, evolutionarily conserved multi-subunit protein complex that regulates centrosome and spindle integrity [
]. The individual HAUS (Homologous to AUgmin Subunits) subunits have been designated HAUS1 to HAUS8 [
]. HAUS augmin-like complex subunit 1 (also known as enhancer of invasion-cluster, HEI-C [], and coiled-coil domain-containing protein 5) is a coiled-coil protein required for passage through mitosis [].
Radical SAM proteins are found in all domains of life and share an unusual Fe-S cluster associated with generation of a free radical by reductive cleavage of SAM and often provide an anaerobic or oxygen-independent mechanism that is found as an aerobic reaction in other proteins. Radical SAM proteins catalyse diverse reactions, including unusual methylations, isomerization, sulphur insertion, ring formation, anaerobic oxidation and protein radical formation. These proteins function in DNA precursor, vitamin, cofactor, antibiotic and herbicide biosynthesis and in biodegradation pathways [
,
]. Radical SAM proteins share several common features, notably three strictly conserved cysteine residues generally included in the CxxxCxxC motif. These critical cysteines coordinate the unusual [4Fe-4S]2+/1+ cluster, while SAM serves as ligand for the fourth iron atom and acts as a cofactor or a cosubstrate []. The radical SAM enzymes biochemically characterised to date have in common the cleavage of the [4Fe-4S]1+-SAM complex to [4Fe-4S]2+-Met and the 5'-deoxyadenosyl radical, which abstracts a hydrogen atom from the substrate to initiate a radical mechanism [,
]. The Radical SAM domain is organised in a fold related to the β-barrel or TIM barrel, in which β-strands are arranged in a barrel-like array, with peripheral helices intervening between β-strands. The [4Fe-4S] clusters and substrates are bound within the barrels, as is typical of TIM barrel enzymes [,
].
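The CxxxCxxC motif described above lends itself to a simple sequence scan. The following is a minimal sketch, assuming that "x" stands for any single residue; the function name `cysteine_triads` and the toy sequence are hypothetical and serve only as illustration. A match is only a candidate site, since the text notes that the cysteines are "generally", not always, found in this arrangement.

```python
import re

# CxxxCxxC: Cys, any three residues, Cys, any two residues, Cys.
# A zero-width lookahead is used so that overlapping candidates are all reported.
RADICAL_SAM_MOTIF = re.compile(r"(?=C.{3}C.{2}C)")


def cysteine_triads(sequence):
    """Return the 1-based positions of the three Cys residues for each CxxxCxxC hit."""
    seq = sequence.upper()
    return [
        (m.start() + 1, m.start() + 5, m.start() + 8)
        for m in RADICAL_SAM_MOTIF.finditer(seq)
    ]


# Hypothetical toy sequence, not a real protein.
toy_sequence = "MKTLACNFRCSYCAGHL"
print(cysteine_triads(toy_sequence))
```

Reporting the positions of the three cysteines, rather than the matched substring, makes it straightforward to compare candidate cluster ligands across aligned sequences.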
Members of the Pumilio family of proteins (Puf) regulate translation and mRNA stability in a wide variety of eukaryotic organisms including mammals, flies, worms, slime mold, and yeast [
]. Pumilio family members are characterised by the presence of eight tandem copies of an imperfectly repeated 36-amino-acid sequence motif, the Pumilio repeat, surrounded by short N- and C-terminal conserved regions. The eight repeats and the N- and C-terminal regions form the Pumilio homology domain (PUM-HD). The PUM-HD is a sequence-specific RNA-binding domain. The Puf family of proteins are mainly post-transcriptional regulators. Several Puf members have been shown to bind specific RNA sequences mainly found in the 3' UTR of mRNA and repress their translation [,
]. Frequently, Puf proteins function asymmetrically to create protein gradients, thus causing asymmetric cell division and regulating cell fate specification []. The crystal structure of the Pumilio repeats has been solved [
]. The PUM repeats, together with the N- and C-terminal regions, pack to form a right-handed superhelix that approximates a half doughnut, structurally similar to the Armadillo (ARM) repeat proteins beta-catenin and karyopherin alpha. The RNA binds the concave surface of the molecule, where each of the protein's eight repeats makes contacts with a different RNA base
via three amino acid side chains at conserved positions []. This entry represents the Pumilio repeat.
VAMPs (and their homologues, the synaptobrevins) define a group of SNARE proteins that contain a C-terminal coiled-coil/SNARE motif, in combination with variable N-terminal domains that are used to classify VAMPs: those containing longin N-terminal domains (~150 aa) are referred to as longins, while those with shorter N-termini are referred to as brevins [
]. Longins are the only type of VAMP protein found in all eukaryotes, suggesting that their longin domain is essential. The longin domain is thought to exert a regulatory function. Longin domains have been shown to share the same structural fold, a profilin-like globular domain consisting of a five-stranded antiparallel β-sheet that is sandwiched by an α-helix on one side, and two α-helices on the other (β(2)-α-β(3)-α(2)). Other families have been shown to contain domains that structurally resemble the VAMP longin domain. An example is the conserved eukaryotic protein SEDL, which is a component of the transport protein particle (TRAPP), critically involved in endoplasmic reticulum-to-Golgi vesicle transport; mutations in the SEDL gene are associated with an X-linked skeletal disorder, spondyloepiphyseal dysplasia tarda []. Another example is the assembly domain of clathrin coat proteins, such as Mu2 adaptin (AP50) and Sigma2 adaptin (AP17), which structurally resembles the longin domain. AP50 and AP17 are two of the proteins that make up the core of AP2, a complex that functions in clathrin-mediated endocytosis [].
Molecular chaperones recognize unfolded or misfolded proteins by binding to hydrophobic surface patches not normally exposed in the native proteins. Members of the Clp/Hsp100 family of chaperones are present in eubacteria and within organelles of all eukaryotes, promoting disaggregation and disassembly of protein complexes and participating in energy-dependent protein degradation. The ClpA, ClpB, and ClpC subfamilies of the Clp/Hsp100 ATPases contain a conserved N-terminal domain of ~150 amino acids, which in turn consists of two repeats of ~75 residues. Although the Clp repeat (R) domain contains two approximate sequence repeats, it behaves as a single cooperatively folded unit. The Clp R domain is thought to provide a means of regulating the specificity of, and enlarging the substrate pool available to, Clp/Hsp100 chaperone or protease complexes. These roles can be assisted through the binding of an adaptor protein. Adaptor proteins bind to the Clp R domain, modulate the target specificity of the Clp/Hsp100 complex to a particular substrate of interest, and may also regulate the activity of the complex [,
,
,
,
,
]. The Clp R domain is monomeric and partially alpha helical. It is a single folding unit with pseudo 2-fold symmetry. The Clp R domain structure consists of two four-helix bundles connected by a flexible loop [
,
,
]. This entry represents the Clp repeat (R) domain [
].
An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity [
]. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinus.
The calculated molecular mass of the protein encoded by bcsA is 84.4 kDa [
]. Sequence analysis suggests that the gene product is an integral membrane protein with several transmembrane (TM) domains []. It is postulated that the protein is anchored in the membrane at the N-terminal end by a single hydrophobic helix. Two potential N-glycosylation sites are predicted from sequence analysis, consistent with earlier observations that BcsA is a glycoprotein. The function of BcsA is unknown. The sequence shares a high degree of similarity with Escherichia coli YhjO. Cellulose synthase catalyzes the beta-1,4 polymerisation of glucose residues in the formation of cellulose. In bacteria, the substrate is UDP-glucose. The synthase consists of two subunits (or domains in the frequent cases where it is encoded as a single polypeptide), the catalytic domain modelled here and the regulatory domain. The regulatory domain binds the allosteric activator cyclic di-GMP [
,
]. The protein is membrane-associated and probably assembles into multimers such that the individual cellulose strands can self-assemble into multi-strand fibrils.
The many bacterial transcription regulation proteins which bind DNA through a 'helix-turn-helix' motif can be classified into subfamilies on the basis of sequence similarities. One such family is the AsnC/Lrp subfamily [
]. The Lrp family of transcriptional regulators appears to be widely distributed among bacteria and archaea, as an important regulatory system of amino acid metabolism and related processes [
]. Members of the Lrp family are small DNA-binding proteins with molecular masses of around 15 kDa. Target promoters often contain a number of binding sites that typically lack obvious inverted repeat elements, and to which binding is usually co-operative. LrpA from Pyrococcus furiosus is the first Lrp-like protein for which a three-dimensional structure has been solved. In the crystal structure LrpA forms an octamer consisting of four dimers. The structure revealed that the N-terminal part of the protein consists of a helix-turn-helix (HTH) domain, a fold generally involved in DNA binding. The C terminus of Lrp-like proteins has a β-fold, where the two α-helices are located at one side of the four-stranded antiparallel β-sheet. LrpA forms a homodimer mainly through interactions between the β-strands of this C-terminal domain, and an octamer through further interactions between the second α-helix and fourth β-strand of the motif. Hence, the C-terminal domain of Lrp-like proteins appears to be involved in ligand-response and activation []. This entry also includes some siroheme decarboxylases, which contain the AsnC-like ligand binding domain.
The Factor for Inversion Stimulation (FIS) protein is a regulator of bacterial functions, and binds specifically to weakly related DNA sequences
[,
]. It activates ribosomal RNA transcription, and is involved in upstream activation of rRNA promoters. The protein has been shown to play a role in the regulation of virulence factors in both Salmonella typhimurium and Escherichia coli []. Some of its functions include inhibition of the initiation of DNA replication from the OriC site, and promotion of Hin-mediated DNA inversion. In its C-terminal extremity, FIS encodes a helix-turn-helix (HTH) DNA-binding motif, which shares a high degree of similarity with other HTH
motifs of more primitive bacterial transcriptional regulators, such as the nitrogen assimilation regulatory proteins (NtrC) from species like Azotobacter, Rhodobacter and Rhizobium. This has led to speculation that both evolved from a single common ancestor []. The 3-dimensional structure of the E. coli FIS DNA-binding protein has been determined by means of X-ray diffraction to 2.0 Å resolution [
,
]. FIS is composed of four α-helices tightly intertwined to form a globular dimer with two protruding HTH motifs. The 24 N-terminal amino acids are poorly defined, indicating that they might act as 'feelers' suitable for DNA or protein (invertase) recognition []. Other proteins belonging to this subfamily include:
E. coli: atoC, hydG, ntrC, fhlA, tyrR
Rhizobium spp.: ntrC, nifA, dctD
HUWE1 (also known as HECT, UBA and WWE domain-containing protein 1, or Mcl-1 ubiquitin ligase E3, amongst other names) may function as a ubiquitin-protein ligase involved in the ubiquitination cascade that targets specific substrate proteins for proteolysis. It can ubiquitylate DNA polymerase beta (Pol beta), the major BER DNA polymerase, and modulates base excision repair (BER) [
]. HUWE1 also acts as a critical mediator of both the p53-independent and p53-dependent tumor suppressor functions of ARF tumor suppressor in p53 regulation []. Moreover, HUWE1 is both required and sufficient for the polyubiquitination of Mcl-1, an anti-apoptotic Bcl-2 family member involved in DNA damage-induced apoptosis []. Furthermore, HUWE1 plays an important role in the regulation of Cdc6 stability after DNA damage. In addition, HUWE1 works as a partner of N-Myc oncoprotein in neural cells. It ubiquitinates N-Myc and primes it for proteasome-mediated degradation []. HUWE1 contains a ubiquitin-associated (UBA) domain, a WWE domain, and a Bcl-2 homology region 3 (BH3) domain at the N terminus and a HECT domain at the C terminus. The WWE domain plays a role in the regulation of specific protein-protein interactions in a ubiquitin conjugation system. The BH3 domain is responsible for the specific binding to Mcl-1. The HECT domain is involved in the inhibition of the transcriptional activity of p53 via a ubiquitin-dependent degradation pathway [
].
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [
,
]. The PTP superfamily can be divided into four subfamilies []:
(1) pTyr-specific phosphatases
(2) dual specificity phosphatases (dTyr and dSer/dThr)
(3) Cdc25 phosphatases (dTyr and/or dThr)
(4) LMW (low molecular weight) phosphatases
Based on their cellular localisation, PTPases are also classified as:
Receptor-like, which are transmembrane receptors that contain PTPase domains [
]
Non-receptor (intracellular) PTPases [
]
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [
]. Functional diversity between PTPases is endowed by regulatory domains and subunits. This entry represents a tyrosine-protein phosphatase, non-receptor types 14 and 21.
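The C(X)5R signature motif mentioned above can be located in a sequence with a short pattern search. This is a minimal sketch, assuming X denotes any residue and that the motif is read as one Cys, five arbitrary residues, and one Arg; the function name `find_ptp_signature` and the example fragment are hypothetical. A hit marks only a candidate active site and does not by itself establish phosphatase activity.

```python
import re

# C(X)5R: a Cys, any five residues, then an Arg (seven residues in total).
# The lookahead lets overlapping candidates be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based position of the Cys, matched segment) for each C(X)5R hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


# Illustrative fragment only; not taken from a specific database record.
fragment = "AAVHCSAGVGRTG"
print(find_ptp_signature(fragment))  # [(5, 'CSAGVGR')]
```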
Rab4 is a member of the large Rab GTPase family. It has been implicated in numerous functions within the cell. It helps regulate endocytosis through the sorting, recycling, and degradation of early endosomes. Mammalian Rab4 is involved in the regulation of many surface proteins including G-protein-coupled receptors, transferrin receptor, integrins, and surfactant protein A. Experimental data implicate Rab4 in regulation of the recycling of internalized receptors back to the plasma membrane. It is also believed to play a role in receptor-mediated antigen processing in B-lymphocytes, calcium-dependent exocytosis in platelets, alpha-amylase secretion in pancreatic cells, and insulin-induced translocation of Glut4 from internal vesicles to the cell surface [
,
]. Rab4 is known to share effector proteins with Rab5 and Rab11 []. Rabs are regulated by GTPase activating proteins (GAPs), which interact with GTP-bound Rab and accelerate the hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) interact with GDP-bound Rabs to promote the formation of the GTP-bound state. Rabs are further regulated by guanine nucleotide dissociation inhibitors (GDIs), which facilitate Rab recycling by masking C-terminal lipid binding and promoting cytosolic localization. Most Rab GTPases contain a lipid modification site at the C terminus, with sequence motifs CC, CXC, or CCX. Lipid binding is essential for membrane attachment, a key feature of most Rab proteins [
,
].
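The C-terminal lipid modification motifs listed above (CC, CXC, or CCX) can be checked by inspecting the last residues of a sequence. The sketch below assumes that X means any non-cysteine residue and that the motif sits at the very end of the sequence; the function name `rab_c_terminal_motif` and the test tails are hypothetical.

```python
def rab_c_terminal_motif(sequence):
    """Classify the C-terminal tail as 'CC', 'CXC', 'CCX', or None.

    Assumption: X is any residue other than Cys, and the motif
    occupies the final two or three positions of the sequence.
    """
    seq = sequence.upper()
    if len(seq) < 3:
        return None
    tail = seq[-3:]
    if tail[0] == "C" and tail[1] != "C" and tail[2] == "C":
        return "CXC"
    if tail[1:] == "CC":
        return "CC"
    if tail[:2] == "CC":
        return "CCX"
    return None


# Hypothetical C-terminal tails, for illustration only.
for tail in ("MAAGCC", "MAACSC", "MAACCS", "MAAGGS"):
    print(tail, rab_c_terminal_motif(tail))
```

Checking CXC before CC keeps the two classes distinct, because a tail such as "CSC" would otherwise never be tested against the three-residue pattern.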
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [
,
]. The PTP superfamily can be divided into four subfamilies []:
(1) pTyr-specific phosphatases
(2) dual specificity phosphatases (dTyr and dSer/dThr)
(3) Cdc25 phosphatases (dTyr and/or dThr)
(4) LMW (low molecular weight) phosphatases
Based on their cellular localisation, PTPases are also classified as:
Receptor-like, which are transmembrane receptors that contain PTPase domains [
]
Non-receptor (intracellular) PTPases [
]
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [
]. Functional diversity between PTPases is endowed by regulatory domains and subunits. This entry represents a protein-tyrosine phosphatase, non-receptor types 3 and 4.
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [
] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Yersinia spp. secrete effector proteins called YopB and YopD that facilitate the spread of other translocated proteins through the type III needle and the host cell cytoplasm []. In turn, the transcription of these moieties is thought to be regulated by another gene, lcrV, found on the Yops virulon that encodes the entire type III system []. The product of this gene, LcrV protein, also regulates the secretion of YopD through the type III translocon [], and itself acts as a protective "V" antigen for Yersinia pestis, the causative agent of plague [
]. Recently, a homologue of the Y. pestis LcrV protein (PcrV) was found in Pseudomonas aeruginosa, an opportunistic pathogen. In vivo studies using mice found that immunisation with the protein protected burned animals from infection by P. aeruginosa, and enhanced survival. In addition, it is speculated that PcrV determines the size of the needle pore for type III secreted effectors [
]. The structure of the virulence-associated V antigen consists of an all-alpha domain and an alpha+beta domain connected by an antiparallel coiled coil.
The PAS fold corresponds to the structural domain that has previously been defined as PAS and PAC motifs [
]. The PAS fold appears in archaea, eubacteria and eukarya, and is involved in a variety of functions within sensory proteins, where it promotes protein-protein interactions, mediates signal transfer or acts as a stimulus sensor []. The PAS domain contains a sensory box, or S-box domain that occupies the central portion of the PAS domain but is more widely distributed. It is often tandemly repeated. Known prosthetic groups bound in the S-box domain include haem in the oxygen sensor FixL [], FAD in the redox potential sensor NifL [], and a 4-hydroxycinnamyl chromophore in photoactive yellow protein []. Proteins containing the domain often contain other regulatory domains such as response regulator or sensor histidine kinase domains. Other S-box proteins include phytochromes and the aryl hydrocarbon receptor nuclear translocator. This domain has been found in the product of the madA gene of the filamentous zygomycete fungus Phycomyces blakesleeanus. It has been shown that madA encodes a blue-light photoreceptor for phototropism and other light responses. The gene is involved in the phototropic responses associated with sporangiophore growth; the sporangiophores exhibit phototropism by bending toward near-UV and blue wavelengths and away from far-UV wavelengths in a manner that is physiologically similar to plant phototropic responses [
].
This entry represents Tyrosine-protein phosphatase non-receptor type 12 (PTN12). Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase;
) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [
,
]. The PTP superfamily can be divided into four subfamilies []:
(1) pTyr-specific phosphatases
(2) dual specificity phosphatases (dTyr and dSer/dThr)
(3) Cdc25 phosphatases (dTyr and/or dThr)
(4) LMW (low molecular weight) phosphatases
Based on their cellular localisation, PTPases are also classified as:
Receptor-like, which are transmembrane receptors that contain PTPase domains [
]
Non-receptor (intracellular) PTPases [
]
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [
]. Functional diversity between PTPases is endowed by regulatory domains and subunits.
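The C(X)5R signature described above (a cysteine, any five residues, then an arginine) can be written as a simple pattern search. The following is a minimal sketch only: the function name find_ptp_signature and the toy sequence are hypothetical and introduced for illustration, and the pattern is deliberately the bare motif from the text, not a curated family signature.

```python
import re

# C(X)5R: a cysteine, any five residues, then an arginine.
# The lookahead wrapper lets overlapping matches be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C[A-Z]{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based start, matched 7-residue segment) for each C(X)5R hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


# Hypothetical toy sequence, not taken from a real PTPase.
toy_sequence = "MKVHCSAGIGRSGTF"
print(find_ptp_signature(toy_sequence))  # [(5, 'CSAGIGR')]
```

Because the motif is only seven residues long, it is likely to occur by chance in unrelated proteins, so a hit from this sketch should be treated as a candidate site and not as evidence of PTPase membership.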
In Escherichia coli the TonB protein interacts with outer membrane receptor proteins that carry out high-affinity binding and energy-dependent uptake of specific substrates into the periplasmic space [
]. These substrates are either poorly permeable through the porin channels or are encountered at very low concentrations. In the absence of TonB, these receptors bind their substrates but do not carry out active transport. TonB-dependent regulatory systems consist of six components: a specialised outer membrane-localised TonB-dependent receptor (TonB-dependent transducer) that interacts with its energising TonB-ExbBD protein complex, a cytoplasmic membrane-localised anti-sigma factor and an extracytoplasmic function (ECF)-subfamily sigma factor []. The TonB complex senses signals from outside the bacterial cell and transmits them via two membranes into the cytoplasm, leading to transcriptional activation of target genes. The proteins that are currently known or presumed to interact with TonB include BtuB [], CirA, FatA, FcuT, FecA [], FhuA [], FhuE, FepA [], FptA, HemR, IrgA, IutA, PfeA, PupA and Tbp1. The TonB protein also interacts with some colicins. Most of these proteins contain a short conserved region at their N terminus []. This entry covers the conserved part of the β-barrel domain structure at the C terminus. This β-barrel domain is also found in vitamin B12 transporter BtuB [
] and ferric citrate outer membrane transporter FecA [] among others.
Intermediate filaments (IF) are primordial components of the cytoskeleton and the nuclear envelope [
]. They generally form filamentous structures 8 to 14 nm wide. IF proteins are members of a very large multigene family of proteins which has been subdivided into five major subgroups, type I: acidic cytokeratins, type II: basic cytokeratins, type III: vimentin, desmin, glial fibrillary acidic protein (GFAP), peripherin, and plasticin, type IV: neurofilaments L, H and M, alpha-internexin and nestin, and type V: nuclear lamins A, B1, B2 and C. The lamins are components of the nuclear lamina, a fibrous layer on the nucleoplasmic side of the inner nuclear membrane that may provide a framework for the nuclear envelope and may interact with chromatin. All IF proteins are structurally similar in that they consist of a central rod domain arranged in coiled-coil α-helices, with at least two short characteristic interruptions; an N-terminal non-helical domain (head) of variable length; and a C-terminal domain (tail) which is also non-helical, and which shows extreme length variation between different IF proteins. The C-terminal domain has been characterised for the lamins [
]. The lamin-tail domain (LTD), which has an immunoglobulin (Ig) fold, is found in nuclear lamins and in several bacterial proteins, where it occurs with membrane-associated hydrolases of the metallo-beta-lactamase, synaptojanin, and calcineurin-like phosphoesterase superfamilies [
].
This family includes the neurogenic mastermind-like proteins 1-3 (MAML1-3) from chordates, which act as critical transcriptional co-activators for Notch signaling [
, ]. Notch receptors are cleaved upon ligand engagement and the intracellular domain of Notch shuttles to the nucleus. MAMLs form a functional DNA-binding complex with the cleaved Notch receptor and the transcription factor CSL, thereby regulating transcriptional events that are specific to the Notch pathway. MAML proteins may also act as key transcriptional co-activators in other signal transduction pathways, including: muscle differentiation and myopathies (MEF2C) [], tumour suppressor pathway (p53) [] and colon carcinoma survival (beta-catenin) []. MAML proteins could mediate cross-talk among the various signaling pathways, and their diverse activities converge to impact normal biological processes and human diseases, including cancers. MAML proteins contain an N-terminal domain which adopts an elongated kinked helix that wraps around ANK and CSL, forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors [
,
]. This N-terminal domain is responsible for the interaction of MAML with the ankyrin repeat region of the Notch proteins NOTCH1 [], NOTCH2 [], NOTCH3 [] and NOTCH4. It forms a DNA-binding complex with Notch proteins and RBPSUH/RBP-J kappa/CBF1, and also binds CREBBP/CBP [] and CDK8 []. The C-terminal region is required for transcriptional activation.