This entry consists of the C-terminal domain of eukaryotic Aar2 and Aar2-like proteins. This domain consists of 9 alpha helices, 1 pi helix and 1 3(10)-helix. Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor that associates with Prp8, forming a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8 [
,
,
,
,
]. Aar2p binds directly to the RNaseH-like domain in the C-terminal region of Prp8p []. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth [].
This entry represents the C-terminal domain found in the Rab3 GTPase-activating protein catalytic subunit (Rab3GAP1) predominantly in animals.
Small G proteins of the Rab family are regulators of intracellular vesicle traffic. Their rate of GTP hydrolysis is enhanced by specific GTPase-activating proteins (GAPs) that switch G proteins to their inactive form [
]. Rab3GAP1 (catalytic subunit) has been shown to form a heterodimeric complex with Rab3GAP2 (the regulatory subunit), and this complex acts as a GTPase-activating protein for the Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). The Rab3GAP complex may participate in neurodevelopmental processes such as proliferation, migration and differentiation before synapse formation, and non-synaptic vesicular release of neurotransmitters [,
]. It also activates Rab18 and promotes autolysosome maturation through the Vps34 Complex I []. Mutations in the Rab3GAP1/2 genes cause Warburg micro syndrome (WMS), a hereditary autosomal neuromuscular disorder [
].
Proteins with this domain are DNA primases, ubiquitous bacterial proteins. Most DNA primases contain nearly two hundred additional residues C-terminal to the region represented here, but conservation between species is poor. DNA primase synthesises the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. Escherichia coli family member DnaG has been shown to interact with the replicative DnaB helicase, single-stranded DNA binding protein (SSB), and DNA polymerase III holoenzyme [
,
,
,
]. Although DnaG is capable of synthesizing 60-nucleotide-long primers in vitro [], this primer length is restrained to 11 (+/-1) nucleotides in the context of the replisome []. On the basis of sequence analysis, these proteins appear structurally distinct from primases known to act in archaeal and eukaryotic replication [
], and from either of the two subunits, p50 and p60, of the heterodimeric eukaryotic DNA primase.
This domain is found at the N terminus of a group of cell wall synthesis proteins, such as Knh1 and Kre9 from budding yeast, involved in cell wall beta 1,6-glucan synthesis. Kre9 is also involved in cell wall beta-glucan assembly [
]. In Saccharomyces cerevisiae, a kre9 null mutation leads to severe growth defects, aberrant multi-budded morphology, and mating defects [,
]. Over-expression of Knh1 suppresses growth defects of a kre9 null mutant []. Knh1 is required for propionic acid resistance []. In Aspergillus fumigatus, proteins containing this domain play a role in fungal cell wall organisation []. This entry includes proteins from Lentinula edodes that are involved in fruiting body formation and may have a more general role in signalling in other organisms, as they interact with MAPK []. This entry also includes uncharacterised proteins from fission yeast, archaea and bacteria.
Tectonins I and II are two dominant proteins in the nuclei and nuclear matrix from plasmodia of Physarum polycephalum (Slime mold), which comprise 217 and 353 amino acids, respectively. Tectonin I is homologous to the C-terminal two-thirds of tectonin II. Both proteins contain six tandem repeats that are each 33-37 amino acids in length and define a new consensus sequence. Homologous repeats are found in L-6, a bacterial lipopolysaccharide-binding lectin from horseshoe crab hemocytes. The repetitive sequences of the tectonins and L-6 are reminiscent of the WD repeats of the beta-subunit of G proteins, suggesting that they form β-propeller domains. The tectonins may be lectins that function as part of a transmembrane signalling complex during phagocytosis [
]. It has been demonstrated that tectonin β-propeller repeat-containing protein 1 (TECPR1) has a critical function during autophagosome maturation and autophagosome-lysosome fusion [].
This family includes insulin-cleaving membrane protease (imelysin, ICMP)-like protein (IPPA from Psychrobacter arcticus), the Pseudomonas aeruginosa PA4372 and Vibrio cholerae VC1266 Fur-regulated imelysin-like protein. They share the overall fold and a similar functional site as the insulin-cleaving membrane protease (ICMP) [
]. However, IPPA adopts a structure distinctive from the known HxxE metallopeptidases or iron-binding proteins, suggesting this protein may not be a peptidase; the histidine in the GxHxxE motif region is no longer conserved (GxxxxE), indicating a possible loss of enzymatic function or a change in substrate preference. A putative functional site for this non-peptidase homologue is located at the domain interface. The tertiary structure shows a fold consisting of two domains, each of which consists of a bundle of four helices that are similar to each other, implying an ancient gene duplication and fusion event [].
The T4 bacteriophage of E. coli protects its DNA via two glycosyltransferases which glucosylate 5-hydroxymethyl cytosines (5-HMC) using UDP-glucose. These two proteins are the retaining alpha-glucosyltransferase (AGT) and the inverting beta-glucosyltransferase (BGT) [
]. This entry represents AGT and similar proteins from Uroviricota. AGT catalyses the transfer of glucose from uridine diphosphoglucose to 5-hydroxymethyl cytosine of T4 DNA to yield glucosyl 5-hydroxymethyl cytosine [
]. This protein adopts the GT-B fold and binds both the sugar donor and acceptor to the C-terminal domain. There is evidence for a role of AGT in the base-flipping mechanism and for its specific recognition of the acceptor base []. AGT interacts with the clamp protein gp45 []. The modification performed by this enzyme protects the phage genome against its own nucleases, the host restriction endonuclease system and against the host CRISPR-Cas9 defence system [].
Tellurite resistance protein TehB is encoded by the tellurite-reducing operon tehAB [
]. Members of this entry are two-domain proteins with a C-terminal S-adenosyl-L-methionine (SAM)-dependent methyltransferase domain and an N-terminal domain of unknown function. In Escherichia coli, Salmonella and some other organisms, the two domains exist in stand-alone form: the YeaR protein and the single-domain TehB methyltransferase.
When upregulated or present in high copy number, TehB is responsible for potassium tellurite resistance, which is probably caused by increasing the reduction rate of tellurite to metallic tellurium within the bacterium. TehB is a cytoplasmic protein that possesses three conserved motifs (I, II, and III) found in SAM-dependent non-nucleic acid methyltransferases [
]. Conformational changes in TehB are observed upon binding of both tellurite and SAM, suggesting that TehB utilises a methyltransferase activity in the detoxification of tellurite [].
The leucine zipper tumor suppressor (LZTS) family members are thought to play roles in cell growth modulation, mitosis and cancer cell growth suppression [
,
]. This entry includes the protein families LZTS 1-3 from animals. LZTS1 (also known as FEZ1) plays a role in mitosis [
]. It is associated with assembled microtubules and is involved in the stabilization of active p34CDC2 (Cdk1) []. LZTS2 (LAPSER1) interacts with p80 katanin and regulates katanin-mediated microtubule severing. It alters cell proliferation by regulating cytokinesis. It is localised to the centrosome and the midbody in mitotic cells and modulates beta-catenin signalling and localisation []. LZTS3 (ProSAP-interacting protein 1, ProSAPiP1) interacts with the PDZ domain of ProSAP2/Shank3, a scaffolding protein for components of the postsynaptic density (PSD) of excitatory brain synapses []. It also interacts with the postsynaptic protein Sipa1l3/SPAR3 [].
The calx-beta motif is present as a tandem repeat in the cytoplasmic domains of Calx Na-Ca exchangers, which are used to expel calcium from cells. This motif overlaps domains used for calcium binding and regulation. The calx-beta motif is also present in the cytoplasmic tail of mammalian integrin-beta4, which mediates the bi-directional transfer of signals across the plasma membrane, as well as in some cyanobacterial proteins. This motif is also found in the Fras1/Frem family of extracellular proteins (extracellular matrix organizing protein FRAS1 and FRAS1-related extracellular matrix proteins FREM1, 2 and 3), which are required for proper organogenesis during embryonic development and whose mutations lead to Fraser Syndrome, a rare congenital disorder characterised by multisystem malformation usually comprising abnormal brain formation, cryptophthalmos, syndactyly and renal defects [
]. This motif contains a series of β-strands and turns that form a self-contained β-sheet [,
].
This entry represents Solute carrier family 22 members 4/5 (SLC22A4/5) and similar proteins from chordates. This protein family belongs to the organic cation transporter family and to the major facilitator superfamily (MFS) of membrane transport proteins. MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement [
,
,
]. SLC22A4/5 are sodium-ion dependent transporters. SLC22A4 is a low affinity carnitine transporter, highly specific for the uptake of ergothioneine (ET), a thiolated derivative of histidine with antioxidant properties. ET is a natural compound produced only by certain fungi and bacteria and must be absorbed from the diet by humans and other vertebrates [
,
]. SLC22A5 (also known as Organic cation/carnitine transporter 2) is a high affinity carnitine transporter involved in the active cellular uptake of carnitine [,
].
This family belongs to the Glucose transporter-like (GLUT-like) family of the Major Facilitator Superfamily (MFS) of membrane transport proteins and consists of sugar transporters including plant early dehydration-induced gene ERD6-like proteins [
], and similar insect proteins such as facilitated trehalose transporter Tret1-1. ERD6-like transporters can be induced under abiotic stress conditions. ERD6 might transport sucrose, while ESL1 transports monosaccharides across the vacuolar membrane and functions together with the vacuolar invertase to regulate osmotic pressure by affecting the accumulation of sugar in the cells under abiotic stress conditions []. Insect Tret1-1 is a low-capacity facilitative transporter for trehalose that mediates the transport of trehalose synthesised in the fat body and the incorporation of trehalose into other tissues that require a carbon source [,
]. Proteins in this family show similarities with mammalian glucose transporters GLUT6/GLUT8 and myo-inositol transporters (HMIT) [,
].
This family includes the N terminus of the Rab3 GTPase-activating protein (GAP) non-catalytic subunit. Small G proteins of the Rab family are regulators of intracellular vesicle traffic. Their rate of GTP hydrolysis is enhanced by specific GTPase-activating proteins (GAPs) that switch G proteins to their inactive form [
]. Rab3GAP1 (catalytic subunit) has been shown to form a heterodimeric complex with Rab3GAP2 (the regulatory subunit), and this complex acts as a GTPase-activating protein for the Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). The Rab3GAP complex may participate in neurodevelopmental processes such as proliferation, migration and differentiation before synapse formation, and non-synaptic vesicular release of neurotransmitters [,
]. It also activates Rab18 and promotes autolysosome maturation through the Vps34 Complex I []. Mutations in the Rab3GAP1/2 genes cause Warburg micro syndrome (WMS), a hereditary autosomal neuromuscular disorder [
].
This bacterial family includes the uncharacterized Escherichia coli YegD [
]. It belongs to the heat shock protein 70 (HSP70) family of chaperones that assist in protein folding and assembly and can direct incompetent 'client' proteins towards degradation. Typically, HSP70s have a nucleotide-binding domain (NBD) and a substrate-binding domain (SBD). The nucleotide sits in a deep cleft formed between the two lobes of the NBD. The two subdomains of each lobe change conformation between ATP-bound, ADP-bound, and nucleotide-free states. ATP binding opens up the substrate-binding site; substrate binding increases the rate of ATP hydrolysis. YegD lacks the SBD. HSP70 chaperone activity is regulated by various co-chaperones: J-domain proteins and nucleotide exchange factors (NEFs). Some family members are not chaperones but instead function as NEFs for their Hsp70 partners; other family members function as both chaperones and NEFs [,
].
Involucrin [
,
] is a highly reactive, soluble, transglutaminase substrate protein present in keratinocytes of epidermis and other stratified squamous epithelia. Involucrin first appears in the cell cytosol, but ultimately becomes cross-linked to membrane proteins by transglutaminase, thus helping in the formation of an insoluble envelope beneath the plasma membrane [] and functioning as a glutamyl donor during assembly of the cornified envelope.
Structurally, involucrin consists of a conserved region of about 75 amino acid residues followed by two extremely variable length segments that contain glutamine-rich tandem repeats. The glutamine residues in the tandem repeats are the substrate for the transglutaminase in the cross-linking reaction. The total size of the protein varies from 285 residues (in dog) to 835 residues (in orangutan). This entry represents the signature pattern for involucrin, which is located at the N-terminal extremity of these proteins.
This family represents one small subunit, C1a-32, of the C1a projection (the seventh projection of the flagellum in Chlamydomonas) [
]. Numerous studies have indicated that each of the seven projections associated with the central pair of microtubules in the flagellum plays a distinct role in regulating eukaryotic ciliary/flagellar motility. The C1a projection is a complex of proteins including PF6, C1a-86, C1a-34, C1a-32, C1a-18, and calmodulin. The C1a projection is involved in modulating flagellar beat frequency, and this is mediated via the C1a-34, C1a-32, and C1a-18 sub-complex by modulating the activity of both the inner and outer dynein arms []. This entry also includes Cilia- and flagella-associated protein 119, whose function is unknown, and ciliary-associated calcium-binding coiled-coil protein 1 (Cabcoco1) from vertebrates, a calcium-binding protein which may have a role in control of sperm flagellar movement [
].
This entry includes Hsk3, Hos3 and Golgin-45. Hsk3 is a subunit of the DASH complex, which is a ~10 subunit microtubule-binding complex that is transferred to the kinetochore prior to mitosis [
]. In Saccharomyces cerevisiae DASH forms both rings and spiral structures on microtubules in vitro [
,
]. This family also includes several higher eukaryotic proteins. However, other DASH subunits do not appear to be conserved in higher eukaryotes. Golgin-45 is required for normal Golgi structure and for protein transport from the endoplasmic reticulum (ER) through the Golgi apparatus to the cell surface. It interacts with GORASP2 and with the GTP-bound form of RAB2, but not with other Golgi Rab proteins [
]. Hos3 (High osmolarity sensitivity protein 3) may play a role in the progression of mitosis in an environment of high osmotic stress [
].
Era (E. coli Ras-like protein) is a small G-protein widely conserved in eubacteria and eukaryotes. It is essential for bacterial cell viability and is required for the maturation of 16S rRNA and assembly of the 30S ribosomal subunit [
]. Era couples cell growth with cytokinesis and plays a role in cell division and energy metabolism. Era contains an N-terminal GTPase domain and a C-terminal distinct derivative of the type-II RNA-binding KH domain [,
,
,
]. Both domains are important for Era function. Era is functionally able to compensate for deletion of RbfA, a cold-shock adaptation protein that is required for efficient processing of the 16S rRNA. The Era-type GTPase domain consists of a central six-stranded β-sheet flanked by five α-helices, in which the GTP-binding site is located. Guanine nucleotide molecules interact with highly conserved G protein regions G1-G5 [].
This entry represents the C terminus of the Rab3 GTPase-activating protein non-catalytic subunit. Small G proteins of the Rab family are regulators of intracellular vesicle traffic. Their rate of GTP hydrolysis is enhanced by specific GTPase-activating proteins (GAPs) that switch G proteins to their inactive form [
]. Rab3GAP1 (catalytic subunit) has been shown to form a heterodimeric complex with Rab3GAP2 (the regulatory subunit), and this complex acts as a GTPase-activating protein for the Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). The Rab3GAP complex may participate in neurodevelopmental processes such as proliferation, migration and differentiation before synapse formation, and non-synaptic vesicular release of neurotransmitters [,
]. It also activates Rab18 and promotes autolysosome maturation through the Vps34 Complex I [
]. Mutations in the Rab3GAP1/2 genes cause Warburg micro syndrome (WMS), a hereditary autosomal neuromuscular disorder [
].
FYN-binding protein 1 (FYB1), also known as ADAP or SLAP, is an adapter protein in beta 1 integrin signalling and T lymphocyte migration [
]. It has been found to co-localise with F-actin in membrane ruffles, adhesion plaques/podosomes and phagocytic cups [,
]. In activated T cells, Fyb/SLAP associates with Ena/VASP family proteins and may link T cell signalling to actin cytoskeleton remodelling []. FYB1 also interacts with mammalian actin binding protein 1 (mAbp1) that affects F-actin dynamics []. The SH3 domain of FYB adopts an altered fold referred to as a helically extended SH3 (hSH3) domain characterised by clusters of positive charges. The hSH3 domain can no longer bind conventional proline-rich peptides; instead, it functions as a novel lipid interaction domain and can bind acidic lipids such as phosphatidylserine, phosphatidylinositol, phosphatidic acid, and polyphosphoinositides [
,
,
].
This superfamily includes toluene tolerance protein Ttg2 and intermembrane phospholipid transport system binding protein MlaC. MlaC is a component of the Mla pathway, an ABC transport system that functions to maintain the asymmetry of the outer membrane. These proteins are involved in toluene tolerance, which is mediated by increased cell membrane rigidity resulting from changes in fatty acid and phospholipid compositions, exclusion of toluene from the cell membrane, and removal of intracellular toluene by degradation [
]. Ttg2 is one of the many proteins involved in these processes []. MlaC actively prevents phospholipid accumulation at the cell surface. It probably maintains lipid asymmetry in the outer membrane by retrograde trafficking of phospholipids from the outer membrane to the inner membrane. It may transfer phospholipids across the periplasmic space and deliver them to the MlaFEDB complex at the inner membrane [
].
These proteins are members of the ATP:ADP Antiporter (AAA) family, which consists of nucleotide transporters that have 12 GES predicted transmembrane regions. One protein from Rickettsia prowazekii functions to take up ATP from the eukaryotic cell cytoplasm into the bacterium in exchange for ADP. Five AAA family paralogues are encoded within the genome of R. prowazekii. This organism transports UMP and GMP but not CMP, and it seems likely that one or more of the AAA family paralogues are responsible. The genome of Chlamydia trachomatis encodes two AAA family members, Npt1 and Npt2, which catalyse ATP/ADP exchange and GTP, CTP, ATP and UTP uptake probably employing a proton symport mechanism. Two homologous adenylate translocators of Arabidopsis thaliana are postulated to be localized to the intracellular plastid membrane where they function as ATP importers. This family contains bacterial proteins as well as chloroplastic proteins found in plants.
This is the AAA ATPase domain found at the C terminus of plant senescence-associated proteins and spartin. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program [
]. This domain is also found at the C terminus of Spartin, a protein from higher vertebrates associated with endosomal trafficking and microtubule dynamics []. Spartin functions presynaptically with endocytic adaptor Eps15 to regulate synaptic growth and function. Mutations in the human spartin gene cause Troyer syndrome, a hereditary spastic paraplegia []. This AAA ATPase domain, like those of other AAA proteins, contains an α/β nucleotide-binding domain (NBD) and a smaller four-helix bundle domain (HBD) []. Uniquely among AAA structures, spastin has two helices (N-terminal α1 and C-terminal α11) that embrace the NBD [].
The DP proteins function as binding partners for E2F transcription factors. The association of DP with E2F can either enhance or repress E2F-dependent transcriptional activity. The activities of both DP and E2F proteins are under cell cycle control, being influenced by the level of phosphorylation imparted through the cell cycle regulated activity of cyclin-dependent kinases. Molecules that function to positively regulate the G1/S transition, such as cyclin-cdk complexes, and negatively regulate it, such as the cdk inhibitors, converge on the Rb pathway. A principal role of pRb is in the regulation of the E2F/DP transcription factors, activity of which determines cell-cycle progression [
,
]. Both DP and E2F proteins are endowed with proto-oncogenic activity and, conversely, have been implicated in regulating apoptosis [
]. The DP proteins have been widely evolutionarily conserved and can be found in organisms ranging from Arabidopsis thaliana (Mouse-ear cress) to Homo sapiens (Human).
This group of enzymes utilises NADP or NAD, and is known as the GFO/IDH/MOCA family (GFO: glucose--fructose oxidoreductase, IDH: inositol 2-dehydrogenase and MOCA which catalyses a dehydrogenase reaction involved in rhizopine catabolism) in UniProtKB/Swiss-Prot, which includes enzymes that catalyse different chemical reactions such as oxidation and reduction of carbohydrates, oxidation of trans-dihydrodiols, reduction of biliverdin and hydrolysis of glycosidic bonds [
]. Other proteins belonging to this family include Gal80, a negative regulator for the expression of lactose and galactose metabolic genes, although it does not have enzymatic activity; and several hypothetical proteins from yeast, Escherichia coli and Bacillus subtilis. The Gfo/Idh/MocA protein family members have very low sequence identity but the 3D structures of the proteins are very similar, consisting of two main domains: an N-terminal dinucleotide-binding domain containing a typical Rossmann fold and a C-terminal α/β-domain participating in substrate binding and oligomerisation. This entry represents the N-terminal domain [].
Protein folding is thought to be the sole result of properties inherent in polypeptide primary sequences. Sometimes, however, additional proteins are required to mediate correct folding and subsequent oligomer assembly [
]. These 'helpers', or chaperones, bind to specific protein surfaces, preventing incorrect folding and formation of non-functional structures []. The tailless complex polypeptide 1 (TCP-1) is a highly structurally conserved molecular chaperone located in the cytosol [
]. The protein has also been shown to bind to Golgi membranes and to microtubules, this latter property suggesting a role in mitotic spindle formation in dividing cells (especially in sperm, where it is highly abundant) []. TCP-1 forms a double ring structure, similar to the 10kDa and 60kDa chaperonins, with 6-8 subunits per ring. The amino acid sequence is significantly similar to the 60kDa chaperonin, and to TF55, a chaperone from the archaebacterium Sulfolobus shibatae [].
Desulfoferrodoxin is a non-haem iron protein which contains two types of iron atoms per molecule: a desulforedoxin-like FeS(4) site, and an octahedrally coordinated high-spin ferrous site with nitrogen/oxygen-containing ligands. The short N-terminal domain contains four conserved Cys for binding of the ferric iron atom, and is homologous to the small protein desulforedoxin. The remainder of the molecule binds the ferrous iron atom and is similar to neelaredoxin, a monomeric blue non-haem iron protein. The homologue from Treponema pallidum, although essentially a full length homologue, lacks three of the four Cys residues in the N-terminal domain; the domain may have lost ferric binding ability but may have some conserved structural role such as dimerisation, or some new function. This protein is described in some articles as rubredoxin oxidoreductase (rbo), and its gene shares an operon with the rubredoxin gene in Desulfovibrio vulgaris (strain Hildenborough / ATCC 29579 / NCIMB 8303).
This is a superfamily of cognate antitoxins to the CbtA toxins, which act by inhibiting the polymerisation of cytoskeletal proteins. These are classified as a type IV toxin-antitoxin system [
]. The superfamily includes three proteins from E. coli, YagB, YeeU and YfjZ, which act not by forming a complex with CbtA but as antagonists of CbtA toxicity, by stabilising the CbtA target proteins. For example, YeeU binds directly to both MreB and FtsZ and enhances the bundling of their filaments in vitro. YeeU is also able to neutralise the toxicity caused by other MreB and FtsZ inhibitors, such as A22 [S-(3,4-dichlorobenzyl)isothiourea] for MreB, and SulA and DicB for FtsZ [
]. Thus CbeA, for cytoskeleton bundling-enhancing factor A, is proposed as a general name for all of these antitoxin proteins. The molecular structure has a β(2)-α-β(3) fold arranged in two layers (alpha/beta).
This macro domain is found in proteins similar to human GDAP2, the ganglioside induced differentiation associated protein 2, whose gene is expressed at a higher level in differentiated Neuro2a cells compared with non-differentiated cells [
]. GDAP2 contains an N-terminal macro domain and a C-terminal Sec14p-like lipid binding domain. It is specifically expressed in brain and testis. The macro domain is a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes [].
The KASH (Klarsicht/ANC-1/Syne-1 homology), or KLS domain is a highly hydrophobic nuclear envelope localization domain of approximately 60 amino acids comprising a 20-amino-acid transmembrane region and a 30-35-residue C-terminal region that lies between the inner and the outer nuclear membranes. The KASH domain is found in association with other domains, such as spectrin repeats and CH, at the C terminus of proteins tethered to the nuclear membrane in diverse cell types [
,
,
,
,
]. Proteins known to contain a KASH domain include: Caenorhabditis elegans nuclear anchorage protein 1 (ANC-1); Drosophila Klarsicht (Klar), a protein associated with nuclei and required for a subset of nuclear migrations; Drosophila MSP-300; and vertebrate nesprin-1, -2, -3 and -4 (also known as Syne1-4), which are components of the linker of the nucleoskeleton and cytoskeleton (LINC) complex, which plays critical roles in nuclear positioning, cell polarisation and cellular stiffness [].
Nitrogenase [
] is the enzyme system responsible for biological nitrogen fixation. Nitrogenase is an oligomeric complex which consists of two components: component 2 is a homodimer of an iron-sulphur protein, while component 1, which contains the active site for the reduction of nitrogen to ammonia, exists in three different forms: the molybdenum-iron containing protein (MoFe) is a hetero-tetramer consisting of two pairs of alpha (nifD) and beta (nifK) subunits; the vanadium-iron containing protein (VFe) is a hexamer of two pairs each of alpha (vnfD), beta (vnfK), and delta (vnfG) subunits; the third form seems to only contain iron and is a hexamer composed of alpha (anfD), beta (anfK), and delta (anfG) subunits. The alpha and beta chains of the three types of component 1 are evolutionarily related and they are also related to proteins nifE and nifN, which are most probably involved in the iron-molybdenum cofactor biosynthesis [
].
This group of plasma glycoproteins includes coagulation factors VII, IX, and X, and proteins C and Z, which belong to MEROPS peptidase family S1, subfamily S1A (chymotrypsin, clan PA(S)). All but protein Z are peptidases and are involved in blood coagulation. The precursors contain a signal sequence, propeptide, Gla domain, two EGF domains (although sometimes only one is detected by Pfam), and a trypsin domain. Except for protein Z, they are further cleaved between the second EGF domain and the trypsin domain into light and heavy chains, which are connected by a disulphide bond. Glutamic acid residues in the Gla domain undergo vitamin K-dependent carboxylation, allowing this region to bind calcium and membrane phospholipid [
]. The propeptide region is important in providing a recognition site for the gamma-carboxylase []. Typically one aspartic acid residue in the light chain is post-translationally modified to erythro-beta-hydroxyaspartic acid [,
].
The link domain [
] is a hyaluronan (HA)-binding region found in proteins of vertebrates that are involved in the assembly of extracellular matrix, cell adhesion, and migration. The structure has been shown [] to consist of two alpha helices and two antiparallel beta sheets arranged around a large hydrophobic core similar to that of C-type lectin. This domain contains four conserved cysteines involved in two disulphide bonds. The link domain has also been termed HABM [
] (HA binding module) and PTR [] (proteoglycan tandem repeat). Proteins with such a domain include the proteoglycans aggrecan, brevican, neurocan and versican, which are expressed in the CNS; the cartilage link protein (LP), a proteoglycan that together with HA and aggrecan forms multimolecular aggregates; Tumour necrosis factor-inducible protein TSG-6, which may be involved in cell-cell and cell-matrix interactions during inflammation and tumourigenesis; and CD44 antigen, the main cell surface receptor for HA.
This entry represents the RNA recognition motif (RRM) of eIF-3G (Eukaryotic translation initiation factor 3 subunit G) and similar proteins. eIF-3G is the RNA-binding subunit of eIF3, a large multisubunit complex that plays a central role in the initiation of translation by binding to the 40S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA [
,
]. eIF-3G binds 18S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein []. eIF-3G is one of the cytosolic targets of, and interacts with, mature apoptosis-inducing factor (AIF) []. eIF-3G contains one RNA recognition motif (RRM). Proteins containing this motif include yeast eIF3-p33 (also known as Tif35) []. eIF3-p33 is a homologue of vertebrate eIF-3G, and plays an important role in the initiation phase of protein synthesis. It binds both mRNA and rRNA fragments due to an RRM near its C terminus [].
This entry represents the RNA recognition motif 2 (RRM2) of heterogeneous nuclear ribonucleoprotein R (hnRNP R). hnRNP R is a ubiquitously expressed RNA-binding protein. hnRNP R has been implicated in transcription regulation [] and AANAT mRNA translation []. It is predominantly located in axons of motor neurons and to a much lower degree in sensory axons []. In axons of motor neurons, it also functions as a cytosolic protein and interacts directly with wild-type survival motor neuron (SMN) proteins, further providing a molecular link between SMN and the spliceosome []. Moreover, hnRNP R plays an important role in neural differentiation and development, and in retinal development and light-elicited cellular activities []. hnRNP R contains an acidic auxiliary N-terminal region, followed by two well defined and one degenerate RNA recognition motifs (RRMs), and a C-terminal RGG motif; it binds RNA through its RRM domains.
Guanylate-binding protein is a GTPase that is induced by interferon (IFN)-gamma. GTPases induced by IFN-gamma are key to the protective immunity against microbial and viral pathogens. These GTPases are classified into three groups: the small 47-kDa GTPases, the Mx proteins, and the large 65- to 67-kDa GTPases. Guanylate-binding proteins (GBP) fall into the last class. In humans, there are seven GBPs (hGBP1-7) []. Structurally, hGBP1 consists of two domains: a compact globular N-terminal domain harbouring the GTPase function, and an α-helical finger-like C-terminal domain. Human GBP1 is secreted from cells without the need of a leader peptide, and has been shown to exhibit antiviral activity against Vesicular stomatitis virus and Encephalomyocarditis virus, as well as being able to regulate the inhibition of proliferation and invasion of endothelial cells in response to IFN-gamma [
]. This entry represents the C-terminal domain superfamily of the guanylate-binding protein.
The crystal proteins of Bacillus thuringiensis have been extensively studied because of their pesticidal properties and their high natural levels of production [
]. When an insect ingests these proteins, they are activated by proteolytic cleavage. The N terminus is cleaved in all of the proteins and a C-terminal extension is cleaved in some members. Once activated, the endotoxin binds to the gut epithelium and causes cell lysis by the formation of cation-selective channels, which leads to death. The activated region of the toxin is composed of three distinct structural domains: an N-terminal helical bundle domain involved in membrane insertion and pore formation; a β-sheet central domain involved in receptor binding; and a C-terminal β-sandwich domain
that interacts with the N-terminal domain to form a channel [
,
]. This entry represents the conserved N-terminal domain superfamily of the pesticidal crystal protein.
Disulphide bond isomerase DsbC/G, N-terminal domain superfamily
Type:
Homologous_superfamily
Description:
This superfamily represents the N-terminal domain of the disulphide bond isomerases DsbC and DsbG. The disulphide bond isomerase (DsbC) is one of five Escherichia coli proteins required for disulphide bond formation, and functions to rearrange incorrect disulphide bonds during oxidative protein folding in the periplasm. DsbC acts as a homodimer with both disulphide isomerase and chaperone activity. It is selectively activated by the transmembrane electron transporter DsbDalpha, which functions as a thiol oxidoreductase [
]. Like other Dsb proteins, DsbC contains active site Cys-X-X-Cys sequences that form disulphide bonds, characteristic of thioredoxin proteins. DsbC consists of thioredoxin-like domains joined by a flexible hinge region to an N-terminal dimerisation domain. The crystal structure of the N-terminal domain reveals an α-β(4) core, where the helix packs against the coiled antiparallel β-sheet []. It has been suggested that DsbG functions as a disulphide isomerase with a narrower substrate specificity than DsbC.
This entry represents E3 ubiquitin-protein ligases MARCHF4, 9 and 11 predominantly found in chordates. E3 ubiquitin ligases accept ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester and then directly transfer the ubiquitin to targeted substrates. Members of this small protein family contain a RING-CH-type zinc finger domain and are integrated into the cellular membrane [
,
,
]. MARCHF4 and 9 are closely related. These proteins reduce the surface expression of major histocompatibility complex class I (MHC I), and downregulate CD4. They promote their subsequent endocytosis and sorting to lysosomes via multivesicular bodies. MARCHF4 appears to be mainly expressed in the brain. MARCHF9 controls the expression of the critical cell adhesion molecule ICAM-1 [
,
]. MARCHF11, identified later, is also involved in the regulation of the poly-ubiquitination of CD4. This protein seems to be specifically expressed in the early developmental stages of spermatids [
].
Lantibiotic protection ABC transporter permease subunit, MutG family
Type:
Family
Description:
This entry includes the lantibiotic ABC transporter permease subunit MutG, a highly hydrophobic integral membrane protein. It is part of the bacitracin ABC transport system that confers resistance, specifically to the lantibiotic mutacin, on the Gram-positive bacteria in which this system operates. This protein transports mutacin to the surface and expels it from the membrane. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis [
,
,
]. This protein family is largely restricted to gallidermin-family lantibiotic cassettes, but also includes orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes [,
]. Members of this group of proteins are predominantly found in Firmicutes and some species of Actinobacteria.
This entry represents the RNA recognition motif 1 (RRM1) of heterogeneous nuclear ribonucleoprotein R (hnRNP R). hnRNP R is a ubiquitously expressed RNA-binding protein. hnRNP R has been implicated in transcription regulation [
] and AANAT mRNA translation []. It is predominantly located in axons of motor neurons and to a much lower degree in sensory axons []. In axons of motor neurons, it also functions as a cytosolic protein and interacts directly with wild-type survival motor neuron (SMN) proteins, further providing a molecular link between SMN and the spliceosome []. Moreover, hnRNP R plays an important role in neural differentiation and development, and in retinal development and light-elicited cellular activities []. hnRNP R contains an acidic auxiliary N-terminal region, followed by two well defined and one degenerate RNA recognition motifs (RRMs), and a C-terminal RGG motif; it binds RNA through its RRM domains.
Xeroderma pigmentosum (XP) [
] is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. Skin cells of individuals with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-A is the most severe form of the disease and is due to defects in a 30kDa nuclear protein called XPA (or XPAC) [
]. The sequence of the XPA protein is conserved from higher eukaryotes [
] to yeast (gene RAD14) [
]. XPA is a hydrophilic protein of 247 to 296 amino-acid residues which has a C4-type zinc finger motif in its central section.
This entry corresponds to the second conserved site in the XPA protein. It is a highly conserved region located some 12 residues after the zinc finger region.
This entry represents a bacterial repeated motif of around 30 residues in length. These repeats are often found in multiple copies in the curlin proteins CsgA and CsgB. Curli fibres are thin aggregative surface fibres, connected with adhesion, which bind laminin, fibronectin, plasminogen, human contact phase proteins, and major histocompatibility complex (MHC) class I molecules. Curli fibres are coded for by the csg gene cluster, which is comprised of two divergently transcribed operons. One operon encodes the csgB, csgA, and csgC genes, while the other encodes csgD, csgE, csgF, and csgG. The assembly of the fibres is unique and involves extracellular self-assembly of the curlin subunit (CsgA), dependent on a specific nucleator protein (CsgB). CsgD is a transcriptional activator essential for expression of the two curli fibre operons, and CsgG is an outer membrane lipoprotein involved in extracellular stabilisation of CsgA and CsgB [
].
Tom40 forms a channel in the mitochondrial outer membrane with a pore about 1.5 to 2.5 nanometers wide. It functions as a transport channel for unfolded protein chains and forms a complex with Tom5, Tom6, Tom7, and Tom22. The primary receptors Tom20 and Tom70 recruit the unfolded precursor protein from the mitochondrial-import stimulating factor (MSF) or cytosolic Hsc70. The precursor passes through the Tom40 channel and through another channel in the inner membrane, formed by Tim23, to be finally translocated into the mitochondrial matrix. The process depends on a proton motive force across the inner membrane and requires a contact site where the outer and inner membranes come close [
]. Tom40 is also involved in inserting outer membrane proteins into the membrane, most likely not via a lateral opening in the pore, but by transferring precursor proteins to an outer membrane sorting and assembly machinery [
].
The zinc finger protein (ZPR1) is a eukaryotic protein that comprises tandem ZPR1 domains and which, in response to growth stimuli, binds to eukaryotic translation elongation factor 1A (eEF1A), assembles into multiprotein complexes with the survival motor neurons (SMN) protein, and accumulates in subnuclear structures, such as gems and Cajal bodies. ZPR1 has a conserved tandem architecture consisting of a duplicated module, the ZPR1 domain, comprised of two apparently modular domains: an elongation initiation factor 2-like zinc finger (Znf) and a double-stranded beta helix with a helical hairpin insertion (A/B domain). In consequence, the N- and C-terminal ZPR1 domains are referred to as the Znf1-A domain and Znf2-B domain modules, respectively. The Znf2-B domain module is required for viability, whilst the Znf1-A domain module is required for normal cell growth and proliferation [
]. This superfamily represents the zinc finger domain (Znf1/2) found in ZPR1.
Grainyhead/CP2 are highly conserved transcription factors in metazoa that function as key regulators of epithelial differentiation and organ development. The family comprises two distinct groups, CP2 (CCAAT box-binding protein 2) and Grh (grainyhead). All family members share a common domain architecture [
,
]. Grainyhead/Elf1 was first identified in Drosophila [
]. Three grainyhead-like (Grhl) homologues constitute the Grh subfamily in mammals []. Grhl proteins are predominantly expressed in epithelial tissues and are essential regulators of epithelial development and extracellular barrier repair after tissue damage [,
,
]. TFCP2 (LSF, CP2 transcription factor) is a critical regulator of erythroid gene expression [
,
]. It also regulates the expression of the male-dominant sex-determining gene SRY []. TFCP2 can interact with NF-E4 proteins to form the heteromeric stage selector protein (SSP) complex; this complex is able to bind the stage selector element (SSE) and regulate embryonic globin expression in fetal-erythroid cells [].
By homology, members of this family are Zn metallohydrolases in the same family as the SoxH protein associated with sulfate metabolism, Bacillus cereus beta-lactamase II (see PDB:1bc2), and, more distantly, hydroxyacylglutathione hydrolase (glyoxalase II). All members occur in genomes with both PQQ biosynthesis and a PQQ-dependent (quinoprotein) dehydrogenase that has a motif of two consecutive Cys residues. The Cys-Cys motif is associated with electron transfer by specialised cytochromes such as c551. All these genomes also include a fusion protein whose domains resemble SoxY and SoxZ from thiosulfate oxidation. A conserved Cys in this fusion protein aligns to the Cys residue in SoxY that carries sulfur cycle intermediates. In many genomes, the genes for PQQ biosynthesis enzymes, PQQ-dependent enzymes, their associated cytochromes, and members of this family are clustered. Note that one to three closely related Zn metallohydrolases may occur; this family represents a specific clade among them.
This OB-fold domain folds into a five-stranded β-barrel [
]. Proteins containing this domain are found in various staphylococcal toxins described as staphylococcal superantigen-like (SSL) proteins, which are related to the staphylococcal enterotoxins (SEs) or superantigens. These SSL proteins, of which 11 have so far been characterised, have a typical SE tertiary structure consisting of a distinct oligonucleotide/oligosaccharide-binding fold (OB-fold; this domain) linked to a β-grasp domain (family Stap_Strp_tox_C). SSLs do not bind to T-cell receptors or major histocompatibility complex class II molecules and do not stimulate T cells. SSLs target components of innate immunity, such as complement, Fc receptors, and myeloid cells [,
,
,
,
,
,
]. SSL protein 7 (SSL7) is the best characterised of the SSLs; it binds complement factor C5 and IgA with high affinity, and inhibits the end stage of complement activation and IgA binding to FcalphaRI [].
Protein piccolo, also known as aczonin, is a neuron-specific presynaptic active zone scaffolding protein that mainly interacts with a detergent-resistant cytoskeletal-like subcellular fraction and is involved in the organization of the interplay between neurotransmitter vesicles, the cytoskeleton, and the plasma membrane at synaptic active zones [
]. It binds profilin, an actin-binding protein implicated in actin cytoskeletal dynamics []. It also functions as a presynaptic low-affinity Ca2+ sensor and has been implicated in Ca2+ regulation of neurotransmitter release []. Piccolo is a multi-domain protein containing two N-terminal FYVE zinc fingers, a polyproline tract, a PDZ domain, and two C-terminal C2 domains. This entry represents the first FYVE domain, which resembles a FYVE-related domain that is structurally similar to the canonical FYVE domains but lacks the three signature sequences: an N-terminal WxxD motif (x for any residue), the central basic R(R/K)HHCRxCG patch, and a C-terminal RVC motif.
Three different forms of catenin (designated alpha, beta and gamma) comprise the cytoplasmic domain of the cadherin cell-cell adhesion complex [
]. Beta-catenin forms a cadherin/beta-catenin/alphaE-catenin complex that can tether the tripartite adhesion complex and regulate actin dynamics []. This entry represents the beta catenins and homologues, plakoglobin and the Drosophila Armadillo protein [
], which are implicated in cell adhesion and Wnt signalling []. They were originally identified as downstream elements (armadillo phenotype) of the Wnt signalling pathway (wingless phenotype) [,
]. The beta-catenin structure has been determined [,
]. Beta catenin family proteins contain several ARM repeats, sequences of approximately 50 amino acids involved in protein-protein interactions. Each repeat consists of three helices [,
,
], with helices 1 and 3 antiparallel to each other and perpendicular to helix 2 [,
,
]. A conserved glycine residue allows the sharp turn between helices 1 and 2 [,
,
].
Nebulette is a cardiac-specific protein that localizes to the Z-disc [
]. It interacts with tropomyosin and is important in stabilizing actin thin filaments in cardiac muscles []. Polymorphisms in the nebulette gene are associated with dilated cardiomyopathy, with some mutations resulting in severe heart failure []. Nebulette is a 107kDa protein that contains an N-terminal acidic region, multiple nebulin repeats, and a C-terminal SH3 domain. LIM-nebulette, also called Lasp2 (LIM and SH3 domain protein 2), is an alternatively spliced variant of nebulette. Although it shares a gene with nebulette, Lasp2 is not transcribed from a muscle-specific promoter, giving rise to its multiple tissue expression pattern with highest amounts in the brain [,
]. It can crosslink actin filaments and it affects cell spreading []. Lasp2 is a 34kDa protein containing an N-terminal LIM domain, three nebulin repeats, and a C-terminal SH3 domain. This entry represents the SH3 domain.
This entry represents the SOCS box domain of SOCS6. Suppressor of cytokine signaling (SOCS) family proteins form part of a classical negative feedback system that regulates cytokine signal transduction. However, SOCS6 does not interact with cytokine signaling intermediate molecules or inhibit cytokine receptor signaling. SOCS6 can interact with c-KIT, a receptor tyrosine kinase that mediates the cellular response to stem cell factor (SCF). SOCS6 has ubiquitin ligase activity toward c-KIT and regulates c-KIT protein turnover in cells, suppressing c-KIT-dependent pathways [
]. SOCS6 negatively regulates receptor tyrosine kinase Flt3 activation, the downstream Erk signaling pathway, and cell proliferation []. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions [
,
].
FLO and LFY proteins are floral meristem identity proteins [
,
]. Mutations in the sequences of these proteins affect flower and leaf development. LFY has been shown to bind semi-palindromic 19-bp DNA elements through its highly conserved C-terminal DNA-binding domain (DBD). In addition to its well-characterized DBD, LFY possesses a second conserved domain at its amino terminus (LFY-N). Crystallographic structure determination shows that LFY-N is a sterile alpha motif (SAM) domain that mediates LFY oligomerization. It allows LFY to bind to regions lacking high-affinity LFY-binding sites and to access closed chromatin regions. Experiments revealed that altering the capacity of LFY to oligomerize compromised floral function. It has been suggested that the biochemical properties of the SAM domain are evolutionarily conserved in all plant species []. This entry represents a SAM domain found in various plant proteins which are homologues of floricaula (FLO) and Leafy (LFY).
Putative auto-transporter adhesin, head GIN domain
Type:
Domain
Description:
Proteins containing this domain show structural similarity to other pectin lyase families. Although these proteins align with acetyl-transferases, the catalytic residues are not conserved. It is likely that the function is one of cell-adhesion. In PDB:3jx8, it is interesting to note that the sequence contains several well defined sequence repeats, centred around GSG motifs defining the tight beta turn between the two sheets of the super-helix; there are 8 such repeats in the C-terminal half of the protein, which could be grouped into 4 repeats of two. It seems likely that proteins in this entry belong to the superfamily of trimeric auto-transporter adhesins (TAAs), which are important virulence factors in Gram-negative pathogens [
,
]. In the case of Parabacteroides distasonis, which is a component of the normal distal human gut microbiota, TAA-like complexes probably modulate adherence to the host (information derived from TOPSAN).
DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases [
]. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF activity; this inhibition can be relieved when the domain binds the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane [,
]. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), DOCK-D subfamily (DOCK9, 10, 11) []. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1 (CED-5, Dock180, and MBC-zizimin homology 1), and DHR-2 (also called CZH2 or Docker). This entry represents the N-terminal domain of the DOCK-C subfamily (DOCK 6, 7, 8) and DOCK-D subfamily (DOCK 9, 10, 11).
Post-transcriptional RNA editing in Trypanosomatids (pathogenic protozoa) is catalyzed by a large multiprotein complex, the editosome. A key editosome enzyme, RNA editing terminal uridylyl transferase 2 (TUTase 2; RET2), catalyzes the uridylate addition reaction. The RET2 structure consists of three domains: the N-terminal domain (NTD), the middle domain (MD) and the C-terminal domain (CTD). The MD is mainly composed of six helices and a four-stranded antiparallel β-sheet. Structural comparison reveals that the fold of the MD is topologically similar to the binding domains of several RNA-binding proteins, such as the RNA-binding domain of the U1A spliceosomal protein, the RRM domain of the human La protein and the CTD of an archaeal CCA-adding enzyme. The CTD of the archaeal CCA-adding enzyme has been shown to bind double-stranded tRNA stem substrate through its α-helical regions. Hence it is suggested that this domain might be an RNA-binding domain [].
This entry represents the C-terminal domain found in the tetracycline transcriptional repressor TetR, which binds to the Tet(A) gene to repress its expression in the absence of tetracycline [
]. Tet(A) is a membrane-associated efflux protein that exports tetracycline from the cell before it can attach to ribosomes and inhibit polypeptide chain growth. TetR occurs as a homodimer and uses two helix-turn-helix (HTH) motifs to bind tandem DNA operators, thereby blocking the expression of the associated genes, TetA and TetR. The structure of the class D TetR repressor protein [] involves 10 α-helices, with connecting turns and loops. The three N-terminal helices constitute the DNA-binding HTH domain, which has an inverse orientation compared with HTH motifs in other DNA-binding proteins. The core of the protein, formed by helices 5-10, is responsible for dimerisation and contains, for each monomer, a binding pocket that accommodates tetracycline in the presence of a divalent cation.
ClpX is a member of the HSP (heat-shock protein) 100 family. Gel filtration and electron microscopy showed that ClpX subunits associate to form a six-membered ring that is stabilised by binding of ATP or nonhydrolysable analogs of ATP [
]. It functions as an ATP-dependent [] molecular chaperone and is the regulatory subunit of the ClpXP protease []. ClpXP is involved in DNA damage repair, stationary-phase gene expression, and ssrA-mediated protein quality control. To date, more than 50 proteins, including transcription factors, metabolic enzymes, and proteins involved in the starvation and oxidative stress responses, have been identified as substrates []. The N-terminal domain of ClpX is a C4-type zinc binding domain (ZBD) involved in substrate recognition. ZBD forms a very stable dimer that is essential for promoting the degradation of some typical ClpXP substrates such as λO and MuA [
]. This entry represents the ClpX subunit from bacteria.
This entry includes Ubiquitin fusion degradation protein Ufd1 from fungi and Ufd1-like proteins from animals and plants. Ufd1 is part of the Ufd1-Npl4 complex that functions as the substrate-recruiting cofactor for Cdc48 segregase. The Cdc48-Ufd1-Npl4 complex is involved in degradation of misfolded ER proteins [
]. The Ufd1-Npl4 complex has been found to recruit Cdc48 to ubiquitylated CMG (Cdc45-MCM-GINS) helicase at the end of chromosome replication, thereby driving the disassembly reaction []. In humans, Npl4-Ufd1 acts as a cofactor in reducing antiviral innate immune responses by facilitating proteasomal degradation of RIG-I (a viral RNA sensor) [
]. The Ufd1 N-terminal fragment is composed of two readily identifiable subdomains designated as Nn and Nc subdomains. The Nn subdomain adopts a double-psi beta barrel fold, and the Nc subdomain has a mixed alpha/beta roll structure [
]. This superfamily represents the Nn domain found in Ufd1 proteins.
Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far, which are hetero-oligomers composed of a small and a large subunit, do not have a significant level of sequence homology. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site [
]. The small terminase protein is essential for the initial recognition of viral DNA, regulates the motor's ATPase and nuclease activities during DNA translocation [], and is required for switching between viral DNA replication and packaging. DNA packaging in tailed bacteriophages and in evolutionarily related herpesviruses is controlled by a viral-encoded terminase. The terminase complex characterised in Bacillus subtilis bacteriophages SF6 and SPP1 consists of two proteins: G1P and G2P [,
].
Plants are attacked by a range of phytopathogenic organisms, including viruses, mycoplasma, bacteria, fungi, nematodes, protozoa and parasites. Resistance to a pathogen is manifested in several ways and is often correlated with a hypersensitive response (HR), localised induced cell death in the host plant at the site of infection [
,
]. The induction of the plant defence response that leads to HR is initiated by the plant's recognition of specific signal molecules (elicitors) produced by the pathogen; R genes are thought to encode receptors for these elicitors. RPS2, N and L6 genes confer resistance to bacterial, viral and fungal pathogens. Sequence analysis has shown that they contain C-terminal leucine-rich repeats, which are characteristic of plant and animal proteins involved in protein-protein interactions [
]. In addition, the sequences contain a conserved nucleotide-binding site towards their N terminus. This entry represents a group of plant disease resistance proteins.
The BED finger, which was named after the Drosophila proteins BEAF and DREF, is found in one or more copies in cellular regulatory factors and transposases from plants, animals and fungi. The BED finger is an about 50 to 60 amino acid residues domain that contains a characteristic motif with two highly conserved aromatic positions, as well as a shared pattern of cysteines and histidines that is predicted to form a zinc finger. As diverse BED fingers are able to bind DNA, it has been suggested that DNA-binding is the general function of this domain [
]. Some proteins known to contain a BED domain are listed below:
Animal, fungal and plant AC1 and Hobo-like transposases.
Caenorhabditis elegans protein dpy-20, a predicted cuticular-gene transcriptional regulator.
Drosophila BEAF (boundary element-associated factor), which is thought to be involved in chromatin insulation.
Drosophila DREF, a transcriptional regulator for S-phase genes.
Tobacco 3AF1 and tomato E4/E8-BP1, which are light- and ethylene-regulated DNA binding proteins that contain two BED fingers [
,
].
Some oxygen-dependent oxidoreductases are flavoproteins that contain a covalently bound FAD group which is attached to a histidine via an 8-alpha-(N3-histidyl)-riboflavin linkage. These proteins include:
(R)-6-hydroxynicotine oxidase (EC 1.5.3.6) (6-HDNO) [
], a bacterial enzyme that catalyzes the oxygen-dependent degradation of 6-hydroxynicotine into 6-hydroxypyrid-N-methylosmine.
Plant reticuline oxidase (EC 1.21.3.3) [
] (berberine-bridge-forming enzyme), an enzyme that catalyzes the oxidation of (S)-reticuline into (S)-scoulerine in the pathway leading to benzophenanthridine alkaloids.
L-gulonolactone oxidase (EC 1.1.3.8) (L-gulono-gamma-lactone oxidase) [
], a mammalian enzyme which catalyzes the oxidation of L-gulono-1,4-lactone to L-xylo-hexulonolactone which spontaneously isomerizes to L-ascorbate.
D-arabinono-1,4-lactone oxidase (EC 1.1.3.24) (L-galactonolactone oxidase), a yeast enzyme involved in the biosynthesis of D-erythroascorbic acid [].
Mitomycin radical oxidase [
], a bacterial protein involved in mitomycin resistance and that probably oxidizes the reduced form of mitomycins.
Cytokinin oxidase (EC 1.4.3.18), a plant enzyme.
Rhodococcus fascians fasciation locus protein fas5.
This entry represents the conserved region around the histidine that binds the FAD group.
Some bacterial regulatory proteins activate the expression of genes from promoters recognised by core RNA polymerase associated with the alternative sigma-54 factor. These have a conserved domain of about 230 residues involved in the ATP-dependent [,
] interaction with sigma-54. About half of the proteins in which this domain is found (algB, dctD, flbD, hoxA, hupR1, hydG, ntrC, pgtA and pilR) belong to signal transduction two-component systems [] and possess a domain that can be phosphorylated by a sensor-kinase protein in their N-terminal section. Almost all of these proteins possess a helix-turn-helix DNA-binding domain in their C-terminal section. The domain involved in interaction with the sigma-54 factor has an ATPase activity. This may be required to promote a conformational change necessary for the interaction [
]. The domain contains an atypical ATP-binding motif A (P-loop) as well as a form of motif B. This entry represents a conserved site corresponding to the first ATP-binding motif located in the N-terminal section of the sigma-54 interaction domain.
This entry represents an N-terminal domain found in metallopeptidases and non-peptidase homologues belonging to MEROPS peptidase family M16 (clan ME), subfamilies M16A, M16B and M16C. Members of this family include:
Insulinase, insulin-degrading enzyme
Mitochondrial processing peptidase alpha subunit (Alpha-MPP)
Pitrilysin, Protease III precursor
Nardilysin
Ubiquinol-cytochrome C reductase complex core protein I, mitochondrial precursor
Coenzyme PQQ synthesis protein F
These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown [
] that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The proteins classified as non-peptidase homologues either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
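The H-x-x-E-H motif described for this family can be located in a sequence with a simple pattern search. The following is only a minimal sketch: the helper name find_hxxeh and the toy sequence in the usage note are made up for illustration and are not taken from any M16 family member, and a motif hit by itself does not establish peptidase activity.

```python
import re

# H, any two residues, then E and H (the H-x-x-E-H motif).
# A lookahead is used so that overlapping occurrences are also reported.
HXXEH = re.compile(r"(?=(H..EH))")

def find_hxxeh(sequence):
    """Return (1-based start position, matched residues) for each H-x-x-E-H hit.

    Hypothetical helper for illustration; expects a one-letter amino acid string.
    """
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HXXEH.finditer(seq)]
```

For example, find_hxxeh("MKAHFLEHLL") on this invented sequence reports a single hit, HFLEH, starting at residue 4.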
This entry represents the dimerisation domain found in the transferrin receptor, as well as in a number of other proteins including glutamate carboxypeptidase II and N-acetylated-alpha-linked acidic dipeptidase like protein. The transferrin receptor (TfR) assists iron uptake into vertebrate cells through a cycle of endo- and exocytosis of the iron transport protein transferrin (Tf). TfR binds iron-loaded (diferric) Tf at the cell surface and carries it to the endosome, where the iron dissociates from Tf. The apo-Tf remains bound to TfR until it reaches the cell surface, where apo-Tf is replaced by diferric Tf from the serum to begin the cycle again. Human TfR is a homodimeric type II transmembrane protein. The crystal structure of a TfR monomer reveals a 3-domain structure: a protease-like domain that closely resembles carboxy- and amino-peptidases; an apical domain consisting of a β-sandwich; and a helical dimerisation domain. The dimerisation domain consists of a 4-helical bundle that makes contact with each of the three domains in the dimer partner [
].
The scavenger receptor cysteine-rich (SRCR) domain is an ancient and highly conserved domain of about 110 residues which is found in diverse secreted and cell-surface proteins, like the type I scavenger receptor, the speract receptor, CD5/Ly-1, CD6, or complement factor I. Tandem repeats of SRCR domains are common in the membrane-bound proteins. Most SRCR domains have six to eight cysteines that participate in intradomain disulfide bonds. SRCR domains have been subdivided into two groups, A and B, primarily on the basis of differences in the spacing pattern between the cysteine residues. Although the biochemical functions of SRCR domains have not been established with certainty, they are likely to mediate protein-protein interactions and ligand binding. Determination of the crystal structure of the SRCR domain of M2BP reveals that the M2BP SRCR adopts a compact fold of approximate dimensions 22 x 26 x 30 Angstrom, organised around a curved six-stranded β-sheet cradling an alpha-helix.
Ero1 and PDI form the disulfide relay system of the ER that supports correct disulfide bond formation of secretory proteins. This entry represents Ero1 (endoplasmic oxidoreductin-1) from yeasts and its homologues from mammals, Ero1-alpha and Ero1-beta. Ero1 is a flavoprotein that directly transfers disulfide bonds to the disulfide isomerase PDI. Ero1 acts as a thiol oxidoreductase responsible for catalyzing disulfide bond formation in nascent polypeptide substrates via electron transfer through protein disulfide isomerase (PDI) with oxygen acting as the final electron acceptor. Newly generated disulfides are transferred from a FAD (flavin adenine dinucleotide)-associated active site via a "shuttle disulfide" cysteine pair in Ero1 to PDI and from there on to substrate proteins. The activity of Ero1 is regulated by PDI (also known as Pdi1). This regulation of Ero1 through reduction and oxidation of regulatory bonds within Ero1 is essential for maintaining the proper redox balance in the ER.
Methyltransferases (Mtases) are responsible for the transfer of methyl groups between two molecules. The transfer of the methyl group from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms. The reaction is catalysed by Mtases and modifies DNA, RNA, proteins or small molecules, such as catechol, for regulatory purposes. Proteins in this entry belong to the RsmE family of Mtases; this is supported by crystal structural studies, which show a close structural homology to other known methyltransferases. This group of proteins includes Ribosomal RNA small subunit methyltransferase E (RsmE) from Escherichia coli, which specifically methylates the uridine in position 1498 of 16S rRNA in the fully assembled 30S ribosomal subunit. This enzyme has two distinct but structurally related domains: the N-terminal PUA domain and the conserved MTase domain at the C-terminal end. This protein adopts a dimeric configuration that is functionally critical for substrate binding and catalysis.
Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also known as oligosaccharyltransferase, transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in a consensus Asn-X-Ser/Thr motif. In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48kDa subunit, ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a 45kDa polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48kDa subunit and the other orthologues. This family includes the 48kDa-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1. Dolichyl-diphosphooligosaccharide--protein glycosyltransferase subunit Wbp1 is the beta subunit of the OST complex, one of the original six subunits purified. Wbp1 is essential, and conditional mutants have decreased transferase activity.
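The Asn-X-Ser/Thr acceptor motif mentioned above lends itself to a simple sequence scan. The sketch below is illustrative only: the function name `find_sequons`, the `exclude_proline` flag and the example sequence are hypothetical. The flag reflects the commonly cited observation that proline at the X position is disfavoured, which is an assumption added here and is not stated in this entry. A sequence match does not imply that a site is actually glycosylated.

```python
import re


def find_sequons(sequence, exclude_proline=False):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs.

    exclude_proline is an optional assumption (not from the entry text):
    when True, motifs with Pro at the X position are skipped.
    """
    x = "[A-OQ-Z]" if exclude_proline else "[A-Z]"
    # Lookahead so that overlapping motifs (e.g. N-N-S-T) are all found.
    pattern = re.compile(rf"(?=N{x}[ST])")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence for illustration only (assumption, not a real protein).
    example = "MNGSANPTLNKT"
    print(find_sequons(example))
    print(find_sequons(example, exclude_proline=True))
```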
Presenilin 1 (PSN1) and presenilin 2 (PSN2) are membrane proteins, whose genes are mutated in some individuals with Alzheimer's disease. They undergo tightly regulated endoproteolytic processing to generate stable PSN C-terminal and N-terminal fragments that form the catalytic core of the gamma-secretase complex, an endoprotease complex that catalyses the intramembrane cleavage of integral membrane proteins such as Notch receptors. Presenilins are related to the signal peptide peptidase (SPP) family of aspartic proteases that promote intramembrane proteolysis to release biologically important peptides. However, the SPPs work as single polypeptides. SPP catalyses intramembrane proteolysis of some signal peptides after they have been cleaved from a preprotein. In humans, SPP activity is required to generate signal sequence-derived human lymphocyte antigen-E epitopes that are recognised by the immune system, and is required in the processing of the hepatitis C virus core protein. This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family, clan AD).
The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it. The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a β-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.
The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. In myb, one of the most conserved regions consisting of three tandem repeats has been shown to be involved in DNA-binding. The SANT domain is present in nuclear receptor co-repressors and in the subunits of many chromatin-remodelling complexes. It has a strong structural similarity to the DNA-binding domain of Myb-related proteins. Both consist of tandem repeats of three α-helices that are arranged in a helix-turn-helix motif, each α-helix containing a bulky aromatic residue. Despite the overall similarity there are differences that indicate that the SANT domain is functionally divergent from the canonical Myb DNA-binding domain. The myb/SANT domains can be classified into three groups: the myb-type HTH domain, which binds DNA, the SANT domain, which is a protein-protein interaction module, and the myb-like domain that can be involved in either of these functions. This entry represents a myb-like domain.
This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A. Methionine aminopeptidase (MAP) catalyses the hydrolytic cleavage of the N-terminal methionine from newly synthesised polypeptides if the penultimate amino acid is small, with different tolerance to Val and Thr at this position. All MAP studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. While being evolutionarily related, they only share a limited amount of sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt-binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2 and includes proteins which do not seem to be MAP, but that are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.
UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded β-sheet of ubiquitin. This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.
Casein kinase, a ubiquitous well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta subunit contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.
Proteins containing the ancient conserved domain protein/cyclin M (CNNM) are integral membrane proteins that are conserved from bacteria to humans. CNNM family members influence metal ion homeostasis through mechanisms that may not involve direct membrane transport of the ions. Structurally, CNNMs are complex proteins that contain an extracellular N-terminal domain preceding a transmembrane domain, a "Bateman module", which consists of two cystathionine-beta-synthase (CBS) domains, and a C-terminal cNMP (cyclic nucleotide monophosphate) binding domain. The CNNM transmembrane domain contains four hydrophobic regions and forms a dimer through hydrophobic contacts between TM2 and TM3, in which each chain is composed of three transmembrane helices (TM1-3), a pair of short helices exposed on the intracellular side, and a juxtamembrane (JM) helix that forms a belt-like structure. The homodimer adopts an inward-facing conformation with a negatively charged cavity containing a conserved pi-helical turn in TM3 that coordinates a Mg2+ ion.
The Crp-type HTH domain is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 70-75 amino acids present in transcription regulators of the Crp-Fnr family, involved in the control of virulence factors, enzymes of aromatic ring degradation, nitrogen fixation, photosynthesis, and various types of respiration. The Crp-Fnr family is named after the first members identified in Escherichia coli: the well characterised cyclic AMP receptor protein CRP or CAP (catabolite activator protein) and the fumarate and nitrate reductase regulator Fnr. Crp-type HTH domain proteins occur in most bacteria and in chloroplasts of red algae. The DNA-binding HTH domain is located in the C-terminal part; the N-terminal part of the proteins of the Crp-Fnr family contains a nucleotide-binding domain and a dimerization/linker helix occurs in between. The Crp-Fnr regulators predominantly act as transcription activators, but can also be important repressors, and respond to diverse intracellular and exogenous signals, such as cAMP, anoxia, redox state, oxidative and nitrosative stress, carbon monoxide, nitric oxide or temperature.
Some bacterial regulatory proteins activate the expression of genes from promoters recognised by core RNA polymerase associated with the alternative sigma-54 factor. These have a conserved domain of about 230 residues involved in the ATP-dependent interaction with sigma-54. About half of the proteins in which this domain is found (algB, dctD, flbD, hoxA, hupR1, hydG, ntrC, pgtA and pilR) belong to signal transduction two-component systems and possess a domain that can be phosphorylated by a sensor-kinase protein in their N-terminal section. Almost all of these proteins possess a helix-turn-helix DNA-binding domain in their C-terminal section. The domain involved in interaction with the sigma-54 factor has an ATPase activity. This may be required to promote a conformational change necessary for the interaction. The domain contains an atypical ATP-binding motif A (P-loop) as well as a form of motif B. This entry represents a conserved site corresponding to the second ATP-binding motif located in the N-terminal section of the sigma-54 interaction domain.
Some bacterial regulatory proteins activate the expression of genes from promoters recognised by core RNA polymerase associated with the alternative sigma-54 factor. These have a conserved domain of about 230 residues involved in the ATP-dependent interaction with sigma-54. About half of the proteins in which this domain is found (algB, dctD, flbD, hoxA, hupR1, hydG, ntrC, pgtA and pilR) belong to signal transduction two-component systems and possess a domain that can be phosphorylated by a sensor-kinase protein in their N-terminal section. Almost all of these proteins possess a helix-turn-helix DNA-binding domain in their C-terminal section. The domain which interacts with the sigma-54 factor has an ATPase activity. This may be required to promote a conformational change necessary for the interaction. The domain contains an atypical ATP-binding motif A (P-loop) as well as a form of motif B. The two ATP-binding motifs are located in the N-terminal section of the domain.
Some bacterial regulatory proteins activate the expression of genes from promoters recognised by core RNA polymerase associated with the alternative sigma-54 factor. These have a conserved domain of about 230 residues involved in the ATP-dependent interaction with sigma-54. About half of the proteins in which this domain is found (algB, dctD, flbD, hoxA, hupR1, hydG, ntrC, pgtA and pilR) belong to signal transduction two-component systems and possess a domain that can be phosphorylated by a sensor-kinase protein in their N-terminal section. Almost all of these proteins possess a helix-turn-helix DNA-binding domain in their C-terminal section. The domain involved in interaction with the sigma-54 factor has an ATPase activity. This may be required to promote a conformational change necessary for the interaction. The domain contains an atypical ATP-binding motif A (P-loop) as well as a form of motif B. This entry represents a conserved site found in the C-terminal section of the sigma-54 interaction domain.
The SPRE (also known as SPRED) proteins have the following domains: an N-terminal EVH1 domain, a unique KBD (c-Kit kinase binding) domain which is phosphorylated by the stem cell factor receptor c-Kit, and a C-terminal cysteine-rich SPR (Sprouty-related) domain which is involved in membrane localization. This entry represents the EVH1 domain of SPRE. Proteins containing this domain include Spred1 which interacts with both Ras and Raf through its SPR domain; Spred2 which is the most abundant isoform; and Spred3 which has a non-functional KBD and maintains the inhibitory action on Raf. Legius syndrome is caused by heterozygous mutations in Spred1. Both EVH1 and SPR domains are involved in the inhibition of the MAP kinase pathway by SPRE proteins. The specific function of the EVH1 domain is unknown and there are no known interacting proteins to date. EVH1 has a PH-like fold, despite having minimal sequence similarity to PH or PTB domains.
The GW domain or cell wall targeting (CWT) signal is a module of about 80-90 amino acids named for a conserved Gly-Trp (GW) dipeptide. GW domains have only been identified in Gram-positive bacteria and form a small protein family. They are divergent members of the SH3 family. However, GW domains are unlikely to mimic SH3 domains functionally, as their potential peptide-binding sites are destroyed or blocked. GW domains may constitute a motif for cell-surface anchoring in Listeria and other Gram-positive bacteria. The GW domain is composed of seven β-strands, five of which are organized into an open barrel conformation like that of SH3 domains. The eponymous GW dipeptide, located in the fourth β-strand, is more conserved in GW domains than in SH3 domains. Both the glycine and tryptophan are buried in GW proteins, while the equivalent residues in SH3 proteins are surface accessible, perhaps explaining the greater conservation in GW proteins.
Signal recognition particles (SRPs) are ribonucleoprotein complexes that target particular nascent pre-secretory proteins to the endoplasmic reticulum. The SRP complex targets the ribosome-nascent chain complex to the SRP receptor (SR), which is anchored in the ER, where SR compaction and GTPase rearrangement drive cotranslational protein translocation into the ER. SRP68 is one of the two largest proteins found in SRPs (the other being SRP72), and it forms a heterodimer with SRP72. Heterodimer formation is essential for SRP function. SRP68 binds to SRP RNA directly, while SRP72 binds the SRP RNA largely via nonspecific electrostatic interaction. The binding of SRP72 with SRP RNA enhances the affinity of SRP68 for the RNA. This entry describes the N-terminal RNA-binding domain (RBD) of SRP68, a tetratricopeptide-like module. Interactions between SRP68-RBD and SRP RNA (7SL RNA) are thought to facilitate a conformation of SRP RNA that is required for interactions with ribosomal RNA.
This entry represents the substrate binding domain found in the putative ABC transporter substrate-binding lipoprotein YvgL. It is a ModA-like protein that belongs to the PBP2 superfamily of periplasmic binding proteins, which differ in size and ligand specificity but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. ModA proteins serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO4(2-)) and tungstate (WO4(2-)). After binding molybdate and tungstate with high affinity, ModA proteins interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis.
This entry represents enzymes that perform ADP-ribosylations, including ADP-ribosylhydrolase ARH1/3 from animals, which preferentially hydrolyses the scissile alpha-O-linkage attached to the anomeric C1'' position of ADP-ribose and acts on different substrates, such as proteins ADP-ribosylated on serine and threonine, free poly(ADP-ribose) and O-acetyl-ADP-D-ribose. The family also includes ADP-ribosylarginine hydrolase Tri1 from Serratia proteamaculans. This protein neutralises Tre1-Sp both by occluding its active site via its N-terminal extension and by hydrolysing the ADP-ribosyl moiety from FtsZ. It functions as an immunity component of a contact-dependent interbacterial competition system (also called effector-immunity systems). ADP-ribosyl-[dinitrogen reductase] glycohydrolase is involved in the regulation of nitrogen fixation activity by the reversible ADP-ribosylation of one subunit of the homodimeric dinitrogenase reductase component of the nitrogenase enzyme complex in Rhodospirillum rubrum. Jellyfish crystallin proteins are also included in this group, although these proteins appear to have lost the presumed active site residues.
Myosin-IXb, also termed myosin-9b (Myo9b), is a motor protein with a Rho GTPase activating domain (RhoGAP); it is an actin-dependent motor protein of the unconventional myosin IX class. It is expressed abundantly in tissues of the immune system, like lymph nodes, thymus, and spleen, and in several immune cells including dendritic cells, macrophages and CD4+ T cells. Myosin-IXb contains a Ras-associating (RA) domain, a motor domain, a protein kinase C conserved region 1 (C1), and a RhoGAP domain. This entry represents the RA domain, which is located at its head domain and has the β-grasp ubiquitin-like fold; its function is unknown. Myosin-IXb acts as a motorized signalling molecule that links Rho signalling to the dynamic actin cytoskeleton. It regulates leukocyte migration by controlling RhoA signalling. Myosin-IXb is also involved in the development of autoimmune diseases, including rheumatoid arthritis, systemic lupus erythematosus and type 1 diabetes. Moreover, Myosin-IXb is a ROBO-interacting protein that suppresses RhoA activity in lung cancer cells.
Receptor-type tyrosine-protein phosphatase delta (R-PTP-delta, also known as PTPRD) belongs to the LAR (leukocyte common antigen-related) family of receptor-type tyrosine-protein phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs catalyze the dephosphorylation of phosphotyrosine peptides. LAR-RPTPs are synaptic adhesion molecules that play roles in various aspects of neuronal development, including axon guidance, neurite extension, and synapse formation and function. PTPRD is involved in pre-synaptic differentiation through interaction with SLITRK2. It contains an extracellular region with three immunoglobulin-like (Ig) domains and four to eight fibronectin type III (FN3) repeats (determined by alternative splicing), a single transmembrane domain, followed by an intracellular region with a membrane-proximal catalytic PTP domain (repeat 1, also called D1) and a membrane-distal non-catalytic PTP-like domain (repeat 2, also called D2). This entry represents the catalytic PTP domain (repeat 1) found in the protein PTPRD from chordates.
This entry represents the RNA recognition motif (RRM) of MCM3-associated protein (MCM3AP, also known as GANP), a nuclear protein with multiple domains, which have different functions. GANP serves as the scaffold of the mammalian TREX-2 complex that links transcription with nuclear messenger RNA export. GANP can act as an RNA export factor that interacts with activation-induced cytidine deaminase (AID) and shepherds it from the cytoplasm to the nucleus, and toward the IgV region loci in B cells. GANP contains the Sac3-homology domain, a putative RNA recognition motif, the Nup-homology domain, and the C-terminal histone-acetyltransferase (HAT) domain. Its HAT domain has been shown to regulate minichromosome maintenance protein 3 (MCM3). An alternatively spliced variant of GANP mRNA has been reported in humans. A shorter isoform of GANP (likely to encode an 80kDa protein) is associated with MCM3 of the DNA helicase MCM-complex (composed of MCM2-MCM7) and possesses HAT activity.
This entry represents the RNA recognition motif 1 (RRM1) of hnRNP A2/B1. Heterogeneous nuclear ribonucleoprotein (hnRNP) A2/B1 is an RNA trafficking response element-binding protein that interacts with the hnRNP A2 response element (A2RE), a cis-acting signal present in certain trafficked mRNAs, including those encoding myelin basic protein (MBP), CaMKII, neurogranin, and Arc. Besides RNA trafficking, hnRNP A2/B1 is also involved in many aspects of mRNA processing, including packaging of nascent transcripts, splicing of pre-mRNAs, and translational regulation. For instance, it functions as a splicing factor that regulates alternative splicing of tumour suppressors, such as BIN1, WWOX, the antiapoptotic proteins c-FLIP and caspase-9B, the insulin receptor (IR), and the RON proto-oncogene among others. The overexpression of hnRNP A2/B1 has been linked to many cancers and may play a role in tumor cell differentiation. hnRNP A2/B1 contains two RNA recognition motifs (RRMs), followed by a long glycine-rich region at the C terminus.
This entry represents the RNA recognition motif 1 (RRM1) of SF3B4, also termed pre-mRNA-splicing factor SF3b 49kDa (SF3b50), or spliceosome-associated protein 49 (SAP 49). SF3B4 is a component of the multiprotein complex splicing factor 3b (SF3B), an integral part of the U2 small nuclear ribonucleoprotein (snRNP) and the U11/U12 di-snRNP. SF3B is essential for the accurate excision of introns from pre-messenger RNA, and is involved in the recognition of the pre-mRNA's branch site within the major and minor spliceosomes. SF3B4 functions to tether U2 snRNP with pre-mRNA at the branch site during spliceosome assembly. It is an evolutionarily highly conserved protein with orthologues across diverse species. SF3B4 contains two closely adjacent N-terminal RNA recognition motifs (RRMs). It binds directly to pre-mRNA and also interacts directly and highly specifically with another SF3B subunit called SAP 145. Mutations in the SF3B4 gene cause Nager syndrome, a form of acrofacial dysostosis which affects the development of the face, hands, and arms.
This entry represents the death domain (DD) of IRAK3 (also known as IRAK-M). IRAKs are essential components of innate immunity and inflammation in mammals and other vertebrates. They are involved in signal transduction pathways involving IL-1 and IL-18 receptors, Toll-like receptors (TLRs), nuclear factor-kappaB (NF-kB), and mitogen-activated protein kinases (MAPKs). IRAKs contain an N-terminal DD and a C-terminal kinase domain. IRAK3 is an inactive kinase present only in macrophages in an inducible manner. It is a negative regulator of TLR signalling and it contributes to the attenuation of NF-kB activation. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signalling pathways and can recruit other proteins into signalling complexes.
This death domain (DD) is found in tumor necrosis factor receptor superfamily member 21 (TNFRSF21), also called death receptor-6, or DR6. DR6 is an orphan receptor that is expressed ubiquitously, but shows high expression in lymphoid organs, heart, brain and pancreas. Results from DR6(-/-) mice indicate that DR6 plays an important regulatory role for the generation of adaptive immunity. It may also be involved in tumor cell survival and immune evasion. In neuronal cells, it binds beta-amyloid precursor protein (APP) and activates caspase-dependent cell death. In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.
Serine/threonine kinases (STKs) catalyze the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. LKB1, also called STK11, was first identified as a tumour suppressor responsible for Peutz-Jeghers syndrome, a disorder that leads to an increased risk of spontaneous epithelial cancer. It serves as a master upstream kinase that activates AMP-activated protein kinase (AMPK) and most AMPK-like kinases. LKB1 and AMPK are part of an energy-sensing pathway that links cell energy to metabolism and cell growth. They play critical roles in the establishment and maintenance of cell polarity, cell proliferation, cytoskeletal organization, as well as T-cell metabolism, including T-cell development and function. Loss of LKB1 function in the liver results in hyperglycemia with increased gluconeogenic and lipogenic gene expression, indicating a role as a mediator of glucose homeostasis. To be activated, LKB1 requires the adaptor proteins STe20-Related ADaptor (STRAD) and mouse protein 25 (MO25).
PLK3 (polo-like kinase 3) is a serine/threonine-protein kinase involved in cell cycle regulation, response to stress and Golgi disassembly. It regulates angiogenesis and responses to DNA damage. Activated PLK3 mediates Chk2 phosphorylation by ATM and the resulting checkpoint activation. PLK3 phosphorylates DNA polymerase delta and may be involved in DNA repair. It also inhibits Cdc25c, thereby regulating the onset of mitosis. STKs (serine/threonine-protein kinases) catalyse the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. PLKs (polo-like kinases) play important roles in cell cycle progression and in DNA damage responses. They regulate mitotic entry, mitotic exit, and cytokinesis. In general, PLKs contain an N-terminal catalytic kinase domain and a C-terminal regulatory polo box domain (PBD), which is composed of two bipartite polo-box motifs (or polo boxes) and is involved in protein interactions. There are five mammalian PLKs (PLK1-5) from distinct genes.
This dimerisation domain can be found in the transferrin receptor, as well as in a number of other proteins including glutamate carboxypeptidase II and N-acetylated-alpha-linked acidic dipeptidase like protein. The transferrin receptor (TfR) assists iron uptake into vertebrate cells through a cycle of endo- and exocytosis of the iron transport protein transferrin (Tf). TfR binds iron-loaded (diferric) Tf at the cell surface and carries it to the endosome, where the iron dissociates from Tf. The apo-Tf remains bound to TfR until it reaches the cell surface, where apo-Tf is replaced by diferric Tf from the serum to begin the cycle again. Human TfR is a homodimeric type II transmembrane protein. The crystal structure of a TfR monomer reveals a 3-domain structure: a protease-like domain that closely resembles carboxy- and amino-peptidases; an apical domain consisting of a β-sandwich; and a helical dimerisation domain. The dimerisation domain consists of a 4-helical bundle that makes contact with each of the three domains in the dimer partner.
The alpha-2-macroglobulin receptor-associated protein (RAP) is a glycoprotein that binds to the alpha-2-macroglobulin receptor, as well as to other members of the low density lipoprotein receptor family. RAP inhibits the binding of all known ligands for these receptors, and may prevent receptor aggregation and degradation in the endoplasmic reticulum, thereby acting as a molecular chaperone. RAP may be under the regulatory control of calmodulin, since it is able to bind calmodulin and be phosphorylated by calmodulin-dependent kinase II.
RAP comprises three domains, each representing about one-third of the protein, which originated from an apparent triplication of an approximately 100-residue sequence. The second and third domains interact weakly with one another, whereas the first domain is entirely independent. Structural studies have revealed that the first RAP domain comprises a partly opened bundle of three helices, the first one being shorter than the other two.
This entry includes the peptidase domains of retropepsin-like aspartic endopeptidases from retrotransposons with long terminal repeats (LTR), including Saci-1, -2 and -3 of Schistosoma mansoni. Retropepsins are related to fungal and mammalian pepsins. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts. The monomers are structurally related to one lobe of the pepsin molecule, and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. Retrotransposon aspartic endopeptidase is synthesised as part of a polyprotein that also contains a reverse transcriptase and an integrase. The polyprotein is presumed to undergo specific enzymatic cleavage to yield the mature proteins. This group of aspartic endopeptidases is classified by MEROPS as the peptidase family A28, subfamily A28B.
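As an illustration only, the Asp-Thr/Ser-Gly motif described above can be written in one-letter amino acid code as D[TS]G and located in a protein sequence with a regular expression. The following is a minimal Python sketch; the function name `find_motif_positions` and the toy sequence are hypothetical and are not taken from any real protein or from MEROPS. A motif match on its own does not establish that a sequence is an aspartic endopeptidase.

```python
import re

# Asp-Thr/Ser-Gly in one-letter code: D, then T or S, then G.
ACTIVE_SITE_MOTIF = re.compile(r"D[TS]G")


def find_motif_positions(sequence):
    """Return the 1-based start positions of every D[TS]G match.

    The name and interface are hypothetical, chosen for this example.
    """
    seq = sequence.upper()  # accept lower-case input as well
    return [match.start() + 1 for match in ACTIVE_SITE_MOTIF.finditer(seq)]


# Hypothetical toy sequence, not a real retropepsin.
toy_sequence = "MKLVDTGADISIIAADSGSTV"
print(find_motif_positions(toy_sequence))  # prints [5, 16]
```

Positions are reported 1-based to match the usual residue numbering of protein sequences.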
This entry consists of Hcp-like proteins. Hcp appears to be part of the type VI secretion system of Gram-negative bacteria. Hcp is not only a secreted effector protein, but might also act as a component of the secretion machinery. Several bacterial pathogens mediate interactions with their hosts through protein secretion, often involving Hcp-like virulence loci, which are widely distributed among pathogenic bacteria. Homologues of Hcp are found in various bacteria, of which most, but not all, are known pathogens. Many bacteria have two copies of hcp genes. In Pseudomonas syringae, Hcp1 is a virulence protein, while Hcp2 seems to be required for survival in competition with enterobacteria and yeasts, and its function is associated with the suppression of the growth of these competitors. Hcp1 monomers form a hexameric ring with a large internal diameter. Assembly of this particle is likely to occur following secretion, and could have a role in building a channel for the transport of other macromolecules.
Vav proteins are involved in several processes that require cytoskeletal reorganization, such as the formation of the immunological synapse (IS), phagocytosis, platelet aggregation, spreading, and transformation. Vavs function as guanine nucleotide exchange factors (GEFs) for the Rho/Rac family of GTPases. Vav family members have several conserved motifs/domains including: a leucine-rich region, a leucine-zipper, a calponin homology (CH) domain, an acidic domain, a Dbl-homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich domain, two SH3 domains, a proline-rich region, and an SH2 domain. Vavs are the only known Rho GEFs that have both the DH/PH motifs and SH2/SH3 domains in the same protein. This entry represents the SH2 domain, which mediates a high-affinity interaction with tyrosine-phosphorylated proteins. In general, SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine-phosphorylated sites.