The SERTA (for SEI-1, RBT-1, and TARA) domain is a motif of ~47 residues
corresponding to the largest conserved region among TRIP-Br (transcriptional regulator interacting with the PHD-bromodomain) proteins, an evolutionarily
conserved family restricted to higher eukaryotes. In proteins of the TRIP-Br family, the SERTA domain is found in association with a cyclin A-binding
domain and a PHD-bromo binding domain. The SERTA domain is also found in some other proteins with no conservation with TRIP-Br proteins outside of the SERTA
motif. The cyclin-dependent kinase CDK4-interacting segment of TRIP-Br1 includes most of the SERTA domain [
].
Exportins bind cargo molecules in the nuclei and transport them through nuclear pores to the cytoplasm, a process that requires RanGTP. This entry includes Exportin 4 and 7 (also known as RanBP16), proteins that mediate the nuclear export of proteins with broad substrate specificity. They bind cooperatively to their cargo and to the GTPase Ran in its active GTP-bound form. Exportin 4 transports the eukaryotic translation initiation factor 5A (eIF-5A) and Smad3, controlling protein synthesis and Smad signalling [
,
]. This entry also includes Ran-binding protein 17 from humans.
Bacteria dimerize pairs of 70S ribosomes into inactive 100S ribosomes during the stationary phase and when exposed to certain stresses, a process that is called ribosome hibernation. Hibernating ribosomes are formed by the activity of one or more highly conserved proteins. In Gammaproteobacteria two proteins are involved in this process, ribosome modulation factor (RMF) and hibernation promoting factor (HPF), while most Gram-positive bacteria produce a single, longer HPF protein [
,
]. This family includes the long form of the HPF proteins and the functional homologue PSRP1, a ribosome-binding factor from chloroplasts [
].
This entry represents the atypical Rib (aRib) domain found in a variety of bacterial cell surface proteins. These proteins share a conserved motif with the Rib domain (YPDXXD). The structure of the aRib domain has been solved from two proteins, the SrpA adhesin [
] and the GspB adhesin []. In these proteins this domain has been termed the unique domain due to its lack of similarity to any other known structures at the time. The aRib domain from SrpA has been shown to mediate a dimer interaction [].
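The YPDXXD motif mentioned above can be expressed as a simple pattern search. The following is a minimal sketch, assuming that X stands for any single residue and that sequences are plain one-letter amino-acid strings; the function name `find_ypdxxd` and the example sequence are hypothetical and are not taken from any real protein.

```python
import re

# Assumption: "X" in YPDXXD means "any single residue".
# A lookahead is used so that overlapping occurrences are also reported.
YPDXXD = re.compile(r"(?=(YPD..D))")

def find_ypdxxd(seq):
    """Return (1-based start, matched text) for every YPDXXD occurrence."""
    return [(m.start() + 1, m.group(1)) for m in YPDXXD.finditer(seq.upper())]

# Made-up sequence, for illustration only.
example = "MKTAYPDGSDLLKVYPDAADQ"
hits = find_ypdxxd(example)
```

A pattern hit of this kind only indicates that the motif string is present; it does not by itself show that a region is a Rib or aRib domain.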
Holins are a diverse family of proteins that cause bacterial membrane lysis during late-protein synthesis. It is thought that the temporal precision of holin-mediated lysis may occur through the buildup of a holin oligomer which causes the lysis []. An alternative isoform exists that is thought to function as an antiholin by counteracting the aggregation of the holin molecules. The isoforms differ by only 2 residues at the N terminus [
]. This entry represents the Bacteriophage A118-like, holin/antiholin protein. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Fungal proteins FMP27 (also known as Hob1) and Hob2 (YPR117W) are tube-forming lipid transport proteins which bind to phosphatidylinositols and affect phosphatidylinositol-4,5-bisphosphate (PtdIns-4,5-P2) distribution [
,
,
]. They belong to the repeating β-groove (RBG) superfamily together with VPS13, ATG2, SHIP164, Csf1/BLTP1 proteins, which are all conserved lipid transfer proteins containing long hydrophobic grooves [,
]. They all share the same structure consisting of multiple repeating modules comprising five β-sheets followed by a loop. This entry represents a conserved region within an RBG unit of FMP27 that contains the SW and GKG sequence motifs.
ASZ1 (Ankyrin, SAM, leucine Zipper), also known as GASZ (Germ cell-specific Ankyrin, SAM, leucine Zipper), is a potential protein-protein interaction domain [
]. Proteins containing this domain are involved in the repression of transposable elements during spermatogenesis, oogenesis, and preimplantation embryogenesis. They support synthesis of PIWI-interacting RNA via association with some PIWI proteins, such as MILI and MIWI. This association is required for initiation and maintenance of retrotransposon repression during meiosis. In mice lacking ASZ1, DNA damage and delayed germ cell maturation were observed due to the release of retrotransposons from their repressed state [,
].
Proteins in the HIN-200 family are induced by type I and II interferons (IFN), and they are involved in inflammation and immune responses [
]. In addition to this role in interferon biology, they function as regulators of cell proliferation and differentiation [,
]. This family includes human IFI16, myeloid cell nuclear differentiation antigen, absent in melanoma 2 (AIM2), pyrin and HIN domain-containing protein 1 (IFIX), and murine proteins Ifi202, Ifi203, Ifi204, Ifi205 and Ifi206. All HIN-200 proteins, with the exception of p202, have an N-terminal α-helical PAAD/DAPIN/Pyrin domain [].
Proteins containing this domain include reductive activator of CoFeSP (RACo) proteins. Structural analysis of RACo indicates that it contains four regions: an N-terminal region (residues 3-94) that binds the [2Fe-2S] cluster, a linker region (residues 95-125), the middle region (residues 126-206), and the large C-terminal domain (residues 207-630). This entry pertains to the linker region. The linker region is only present in RACE (reductive activases for corrinoid enzymes) protein sequences with the N-terminal [2Fe-2S] cluster and is absent in the RamA-like RACE proteins, suggesting that the linker domain and the N-terminal domain form one functional unit [
].
Lasp1 (LIM and SH3 domain protein 1) is a cytoplasmic protein that binds focal adhesion proteins and is involved in cell signaling, migration, and proliferation [
]. It is overexpressed in several cancer cells, including breast, ovarian, bladder, and liver [,
,
]. In some cancer cells, it can be found in the nucleus; its degree of nuclear localization correlates with tumour size and poor prognosis []. Lasp1 is a 36 kDa protein containing an N-terminal LIM domain, two nebulin repeats, and a C-terminal SH3 domain []. This entry represents the SH3 domain.
Proteins in this entry are type III secretion system effectors, named differently in different species and designated YopR in Yersinia. Yersinia employs a type III secretion system (T3SS) to secrete and translocate virulence factors into the cytoplasm of mammalian host cells. One of the secreted virulence factors is YopR (Yersinia outer protein R), encoded by the YscH (Yersinia secretion H) gene. This Yop protein is unusual in that it is released to the extracellular environment rather than injected directly into the target cell as are most Yop proteins [
,
].
This domain is predominantly found in venomous neurotoxins and cytotoxins from snakes, but also in structurally similar (non-snake) toxin-like proteins (TOLIPs) such as Lymphocyte antigen 6D and Ly6/PLAUR domain-containing protein. Snake toxins are short proteins with a compact, disulphide-rich structure. TOLIPs have similar structural features (abundance of spaced cysteine residues, a high frequency of charged residues, a signal peptide for secretion and a compact structure) but are not associated with a venom gland or poisonous function. They are endogenous animal proteins that are not restricted to poisonous animals [
].
TRIM proteins are E3 ubiquitin ligases defined by the presence of the tripartite motif RING/B-box/coiled-coil region, and are also known as RBCC proteins. While the tripartite motif is restricted to this protein family, the C-terminal domains can vary and are also present in otherwise unrelated proteins [
]. This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C terminus of TRIM46. TRIM46 is required for neuronal polarization and axon formation. It localizes to the newly specified axon and organizes uniform microtubule orientation in axons [
].
The CDC48 N-terminal domain is a protein domain found in AAA ATPases including cell division protein 48 (CDC48), VCP-like ATPase (VAT) and N-ethylmaleimide sensitive fusion protein. It is a substrate recognition domain which binds polypeptides, prevents protein aggregation, and catalyses refolding of permissive substrates. It is composed of two equally sized subdomains. The amino-terminal subdomain forms a double-psi β-barrel whose pseudo-twofold symmetry is mirrored by an internal sequence repeat of 42 residues. The carboxy-terminal subdomain forms a novel six-stranded β-clamp fold []. Together these subdomains form a kidney-shaped structure. This entry represents the amino-terminal subdomain.
The XH (rice gene X Homology) domain is found in a family of plant proteins including Oryza sativa (Rice)
and Arabidopsis FDM1-5/IDN2. These proteins usually contain an XS domain (
) that is also found in the PTGS protein SGS3. As the XS and XH domains are fused in most of these proteins, these two domains may interact. The XH domain is between 124 and 145 residues in length and contains a conserved glutamate residue that may be functionally important [
]. FDM1-5 and IDN2 are components of the RNA-directed DNA methylation pathway (RdDM) [
,
].
This entry represents a family that includes methylenetetrahydrofolate reductase. The enzyme activities methylenetetrahydrofolate reductase (
) and 5,10-methylenetetrahydrofolate reductase (FADH) (
) differ in that the former (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while the latter (assigned in many bacteria) is flexible with respect to the acceptor. Both convert 5-methyltetrahydrofolate to 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as one or the other, this family describes the subset of proteins found in eukaryotes, and currently designated methylenetetrahydrofolate reductase (
). This protein is an FAD-containing flavoprotein.
This domain superfamily is found in proteins of unknown function related to HP0242. The crystal structure of HP0242, a hypothetical protein from Helicobacter pylori (Campylobacter pylori), has been determined. It reveals an acid-adaptive protein possibly of physiological significance when H. pylori colonises the human stomach. The protein adopts a unique four α-helical triangular conformation. The biologically active form is thought to be a tetramer. The gene exists in an operon along with 6 other genes where the gene products appear to be related to iron storage and haem biosynthesis [
].
This group contains two-domain proteins that are fusions of spore maturation protein A (SpmA) and spore maturation protein B (SpmB). SpmA and SpmB are thought to be involved in spore core dehydration in Bacillus subtilis. Spore dehydration is important for heat resistance, and for processing the spore germination protease GPR into an active form [
]. SpmA and SpmB might be involved in import or export from the forespore, or in modification of the cortex peptidoglycan structure []. SpmA and SpmB are predicted to be integral membrane proteins. For additional information please see [].
This entry represents the C-terminal domain found in the terminase large subunits. Terminase is a component of the molecular motor that translocates genomic DNA into empty capsids during DNA packaging [
]. The large subunit heterodimerises with the small terminase protein, which is docked on the capsid portal protein. The latter forms a ring through which genomic DNA is translocated into the capsid. The terminase protein has endonuclease activity to cleave DNA after encapsidation [,
]. Proteins containing this domain include gp17 from bacteriophage T4 (
)[
,
,
].
This domain is found in several cell surface proteins, such as extracellular matrix-binding protein ebh [
]. Some members are involved in antibiotic resistance (e.g. and
) [
] and/or cellular adhesion (e.g. ) [
]. In some proteins it is repeated more than fifteen times, being the most repeated domain in streptococci []. This is a predominantly α-helical domain that forms a long, thin, fibre-like structure, and it has been proposed to function as a stalk that helps the adhesive non-repeat region (NRR) of proteins protrude beyond the cell surface [,
].
R-bodies are highly insoluble protein ribbons which coil into cylindrical structures in the cell; the genes for their synthesis and assembly are encoded on a plasmid. RebB, which this entry represents, is one of three proteins necessary for the production of R-bodies, refractile inclusion bodies produced by a small number of bacterial species. R-bodies are essential for the expression of the killing trait of the endosymbiont bacteria that produce them for attack upon the host Paramecium. Note that many members are uncharacterised proteins [
,
].
This short conserved region is a putative destruction-box, with its RxxLxxI sequence motif, though the homology is not absolute [
]. The domain occurs on a number of tumourigenic proteins, on some RNA-binding proteins and serine-threonine regulatory proteins []. The second less well-conserved motif, WITPS, is a potential WW domain ligand-binding motif for recruiting proteins to their substrates. WW domains bind tightly to short proline-containing peptides that are typically in regions of natively disordered polypeptide, as this domain is, lying as it does between a PIN domain and a zinc-binding domain [].
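Because the match to the RxxLxxI destruction-box motif is described as not absolute, a strict pattern search may miss relevant sites. The sketch below scores each 7-residue window by how many of the three fixed positions (R, L, I) agree with the motif. The function name `dbox_scores`, the `min_matches` threshold of 2 and the example sequence are illustrative assumptions, not values taken from the entry.

```python
def dbox_scores(seq, min_matches=2):
    """Score every 7-residue window against RxxLxxI.

    The score is the number of fixed positions (R at 1, L at 4, I at 7)
    that match. Windows scoring at least min_matches are returned as
    (1-based start, window, score). The threshold is an assumption.
    """
    fixed = {0: "R", 3: "L", 6: "I"}
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6):
        window = seq[i:i + 7]
        score = sum(window[pos] == aa for pos, aa in fixed.items())
        if score >= min_matches:
            hits.append((i + 1, window, score))
    return hits

# Made-up sequence: one exact match and one partial match.
example = "AARKELGTIPQRAALSSVW"
hits = dbox_scores(example)
```

Raising `min_matches` to 3 restricts the output to exact RxxLxxI matches; lower values report more candidates at the cost of more chance hits.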
This family of proteins represents the C-terminal domain of the protein Rap-1, which plays a distinct role in silencing at the silent mating-type loci and telomeres [
]. The Rap-1 C terminus adopts an all-helical fold. Rap1 carries out its function by recruiting the Sir3 and Sir4 proteins to chromatin via its C-terminal domain []. Rap1 is otherwise known as TRF2-interacting protein, as it is one of the six subunit components of the Shelterin complex. Shelterin protects telomere ends from attack by DNA-repair mechanisms [,
,
,
].
This superfamily represents the N-terminal domain of Saccharomyces cerevisiae's Hri1 (Hrr25-interacting protein 1, YLR301w), a non-essential protein named for its interaction with the yeast protein kinase Hrr25. It has also been shown to interact with Sec72p, although it does not seem to be involved in protein translocation into the ER. This N-terminal domain of Hri1p contains a tandem repeat of a structural unit that forms a β-barrel with structural similarity to nitrobindin. It is involved in homodimerization and it may also contain a ligand binding site [,
,
].
Proteins in this Gram-positive bacteria family consist of an N-terminal signal peptide, a central region of unknown function, and a Cys-rich C-terminal region. In both the overall architecture and the apparent weak homology of the C-terminal region itself, these proteins resemble archaeal proteins such as the halocin C8 precursor. In that precursor, the C-terminal region is a bacteriocin but the N-terminal region functions as the immunity protein. The related family of halocin C8-like bacteriocins and their bacterial homologues can be found in the halocin C8-like bacteriocin domain entry (
).
This N-terminal domain of various bacterial protein families is crucial for the targeting of periplasmic or extracellular proteins to specific regions of the bacterial envelope. AMIN is derived from the N-terminal domain of AmiC, an N-acetylmuramoyl-L-alanine amidase of Escherichia coli which localises to the septal ring during division and plays a key role in the separation of daughter cells. The AMIN domain is present in several protein families besides amidases, suggesting that AMIN may represent a general targeting determinant involved in the localisation of periplasmic protein complexes [].
This motif is found in the C-terminal region of the Yaf2 and RYBP proteins, which are homologous parts of the PRC1 complex [
]. RYBP is a zinc finger protein with an essential role during embryonic development, which binds transcription factors, Polycomb products, and mediators of apoptosis [
]. RYBP also binds ubiquitin and Cbx proteins via the C-terminal docking module [,
]. RYBP is natively unstructured until it binds to the C-terminal region of the Polycomb protein Ring1B or to DNA []. Yaf2 binds to MYC and inhibits MYC-mediated transactivation [].
This entry represents the seven-transmembrane α-helices (TM) domain found in the yeast Yro2 protein and its closely related proteins. Although the exact function of these proteins is unknown, they show strong sequence homology to the family of microbial rhodopsins, also known as type I rhodopsins, including the light-driven inward chloride pump halorhodopsin (HR) and the outward proton pump bacteriorhodopsin (BR) [
]. This entry includes the Opsin-like protein carO, which is involved in the biosynthesis of neurosporaxanthin, a carboxylic apocarotenoid acting as an essential protective pigment and leading to orange pigmentation [].
Ephrins are a family of proteins [
] that are ligands of class V (EPH-related) receptor protein-tyrosine kinases. These receptors and their ligands have been implicated in regulating neuronal axon guidance and in patterning of the developing nervous system, and may also serve a patterning and compartmentalisation role outside of the nervous system. Ephrins are membrane-attached proteins of 205 to 340 residues. Attachment appears to be crucial for their normal function. Type-A ephrins are linked to the membrane via a glycosylphosphatidylinositol (GPI)-linkage, while type-B ephrins are type-I membrane proteins.
This entry represents a domain found near the C terminus of bacteriophage tape measure proteins. Long-tailed bacteriophages possess a large gene encoding a tape measure protein (TMP) [
]. TMP is important for assembly of phage tails and involved in tail length determination [,
]. Mutated forms of TMP cause tail fibres to be shortened []. This protein is also found in bacteria, particularly Enterobacteriaceae, suggesting prophage matches occur in addition to the phage matches. This entry includes a domain found near the C terminus of Bacteriophage lambda tail measure protein, GpH.
Packaging of double-stranded viral DNA concatemers requires interaction of the prohead with virus DNA. This process is mediated by a phage-encoded DNA recognition and terminase protein. The terminase enzymes described so far are hetero-oligomers composed of a small and a large subunit. The small terminase subunit is thought to form a nucleoprotein structure that helps to position the terminase large subunit at the packaging initiation site [
]. This family of small, highly conserved proteins comes from viruses and a subset of Firmicute species. They share protein sequence similarity with the phage terminase small subunit.
This entry represents receptor tyrosine-protein phosphatase (PTP) alpha (
). PTP catalyses the dephosphorylation of protein tyrosine phosphate to protein tyrosine, and appears to play a pivotal role in insulin receptor signalling. It can exist as a single-pass membrane protein or in the cytoplasm. PTP-alpha acts as a positive regulator of Src and Src family kinases, dephosphorylating and activating Src. As such, PTP-alpha affects transformation and tumourigenesis, inhibition of proliferation and cell cycle arrest, mitotic activation of Src, integrin signalling, neuronal differentiation and outgrowth, and ion channel activity [
].
This entry represents a conserved site in the C-terminal region of the CAP protein. Structurally, CAP is a protein of 474 to 551 residues, which consists of two domains separated by a proline-rich hinge. In budding and fission yeasts the CAP protein is a bifunctional protein whose N-terminal domain binds to adenylyl cyclase, thereby enabling that enzyme to be activated by upstream regulatory signals, such as Ras. The N terminus also catalyses cofilin-mediated severing of actin filaments [
]. The C-terminal domain plays a role in recycling cofilin-bound, ADP-actin monomers [].
This entry represents a group of nitrate transporters that belong to the Major Facilitator Superfamily (MFS) of membrane transport proteins. Proteins in this entry include high-affinity nitrate transporters NRT2 from plants, and nitrate/nitrite transporters NarK2/NarT/NasA/NarU from bacteria [
,
,
,
]. NRT2 family proteins are involved in the uptake of nitrate by plant roots from the soil through the high-affinity transport system (HATS). There are seven Arabidopsis thaliana NRT2 proteins, called AtNRT2:1 to AtNRT2:7 [
,
]. This entry also includes fungal nitrate transporters such as CRNA from Emericella nidulans [].
Proteins in this entry are mostly known or predicted polyol transporters. The best characterised of these proteins are DalT and RbtT from Klebsiella pneumoniae, which transport D-arabinitol and ribitol respectively. Like other members of the Major Facilitator Superfamily (MFS), DalT and RbtT appear to be secondary transporters capable only of transporting solutes in response to chemiosmotic ion gradients [
]. They contain a total of twelve predicted transmembrane helices; six in the N-terminal portion of the protein, and six in the C-terminal portion. Residues defining substrate specificity lie in the N-terminal region of these proteins.
This family includes the hamartin protein which is thought to function as a tumour suppressor. The hamartin protein interacts with the tuberin protein. Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterised by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation in either TSC1 or TSC2 tumour suppressor genes. TSC1 encodes a protein, hamartin, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. The TSC2 gene codes for tuberin.
Fungal proteins FMP27 (also known as Hob1) and Hob2 (YPR117W) are tube-forming lipid transport proteins which bind to phosphatidylinositols and affect phosphatidylinositol-4,5-bisphosphate (PtdIns-4,5-P2) distribution [
,
,
]. They belong to the repeating β-groove (RBG) superfamily together with VPS13, ATG2, SHIP164, Csf1/BLTP1 proteins, which are all conserved lipid transfer proteins containing long hydrophobic grooves [,
]. They all share the same structure consisting of multiple repeating modules comprising five β-sheets followed by a loop. This entry represents an RBG module within FMP27 that contains the conserved HQR and WPPW sequence motifs.
DREB2A-Interacting Proteins (DRIPs) are E3 ubiquitin-protein ligases that act as negative regulators of the response to water stress and mediate the ubiquitination and subsequent proteasomal degradation of the drought-induced transcriptional activator Dehydration-Responsive Element-Binding Protein 2A (DREB2A). DREB2A regulates the expression of stress-inducible genes via the dehydration-responsive elements and requires posttranslational modification for its activation. DRIPs contain a RING finger, and a RING finger- and WD40-associated ubiquitin-like (RAWUL) domain [
,
]. This entry represents the RAWUL domain found in E3 ubiquitin protein ligases DRIP1, DRIP2 and the DRIP homologue (DRIPH, At3g23060) from Arabidopsis.
This superfamily represents the second domain found in structural maintenance of chromosomes (SMC) proteins, which function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression [
]. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics []. This domain is also found in the RecF protein (recombination mediator), which is involved in DNA metabolism and recombination.
This group represents a molecular chaperone regulator BAG-1 [
]. BAG-1 contains a BAG domain that is formed by two antiparallel helices, while the third helix is extended away and stabilized by crystal-packing interactions []. BAG-family proteins contain a single BAG domain, except for human BAG-5 which has four BAG repeats. The BAG domain is a conserved region located at the C terminus of the BAG-family proteins that binds the ATPase domain of Hsc70/Hsp70. BAG family proteins regulate chaperone protein activities through their interaction with Hsc70/Hsp70 [
].
This entry represents the SUPPRESSOR OF PHYTOCHROME B (SOB) five-Like (SOFL) family of novel plant-specific proteins which have high sequence similarity and similar protein structures [
,
]. These proteins are involved in cytokinin-mediated development and contain two highly conserved domains SOFL-A and SOFL-B, which are necessary for their function []. These proteins are positive modulators of the endogenous content of specific cytokinin levels derived from the biosynthetic intermediates trans-zeatin riboside monophosphate (tZRMP) and N6-(Delta2-isopentenyl)adenosine monophosphate (iPRMP) such as N-glucosides trans-zeatin 7-glucoside (tZ7G), cis-zeatin 7-glucoside (cZ7G) and N6-(Delta2-isopentenyl)adenine 7-glucoside (iP7G) [
,
].
This group represents a predicted acetyltransferase, GNAT family type. Bacillus subtilis has a putative posttranslational modification system encoded by the acuABC genes. Posttranslational modification is an efficient mechanism for controlling the activity of structural proteins, gene expression regulators and enzymes in response to rapidly changing physiological conditions [
]. The AcuA protein is a member of the Gcn5-related N-acetyltransferase (GNAT) superfamily and the AcuC protein is a class I histone deacetylase. The role of the AcuB protein remains unknown. AcuA controls the activity of acetyl-CoA synthetase (
) by acetylating residue Lys549 [
,
].
The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. It is normally about 70 amino acids in length. It is thought to be an intracellular protein-binding or lipid-binding signalling domain, which has an important function in membrane-associated processes. The structure of the GRAM domain is similar to that found in PH domains [
]. Mutations in the GRAM domain of myotubularins cause a muscle disease, which suggests that the domain is essential for the full function of the enzyme []. Myotubularin-related proteins are a large subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids [].
This entry describes an N-terminal domain found regularly in proteins encoded near a variant form of signal peptidase I such as the SipW protein of Bacillus subtilis. Many though not all members are homologues of camelysin (a casein-cleaving metalloprotease) and TasA (CotN), a metalloprotease that is secreted, along with extracellular polysaccharide (EPS), to be the major protein constituent of the Bacillus subtilis biofilm matrix. Sequencing from several known TasA/CotN proteins shows the cleavage location to be near the centre of the alignment and typical of type I signal peptidases, with small residues at -3 and -1.
This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes [
]. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs []. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal [] and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family [].
The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site [
]. The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein [
].
This protein family is one of a number of homologous small, well-conserved GTP-binding proteins with pleiotropic effects. Bacterial family members are designated HflX, following the naming convention in Escherichia coli, where HflX is encoded immediately downstream of the RNA chaperone Hfq, and immediately upstream of HflKC, a membrane-associated protease pair with an important housekeeping function [
]. HflX has been shown to associate with the 50S ribosomal subunit and may have a role during protein synthesis or ribosome biogenesis [,
,
]. Homo sapiens family members are named PGPL (pseudoautosomal GTP-binding protein-like), and are ubiquitously expressed [
].
The signal recognition particle (SRP) is a large ribonucleoprotein complex that targets secretory and membrane proteins to the endoplasmic reticulum membrane [
,
]. The mammalian SRP contains a 303-nucleotide SRP RNA and six proteins, named SRP9, SRP14, SRP19, SRP54, SRP68, and SRP72. Among them, the two largest, SRP68 and SRP72, form a stable SRP68/72 heterodimer of unknown structure, which is required for sorting secretory proteins []. SRP68 binds to SRP RNA directly, while SRP72 binds the SRP RNA largely via non-specific electrostatic interaction. The binding of SRP72 with SRP RNA enhances the affinity of SRP68 for the RNA.
The R2 protein of ribonucleotide reductase catalyses the reduction of all four ribonucleotides to deoxyribonucleotides for use in DNA synthesis. This catalysis involves generating and storing a tyrosyl radical, which is essential for ribonucleotide reduction. The crystal structure consists of a core of four helices in a closed bundle with a left-handed twist and one crossover connection, and a bimetal-ion centre in the middle of the bundle [
]. This entry represents proteins that are structurally related to the R2 protein of class I ribonucleotide reductase, including the alpha and beta subunits of methane monooxygenase, and delta 9-stearoyl-acyl carrier protein desaturase [
].
This domain occurs in proteins that have been annotated as fibronectin/fibrinogen binding protein by similarity. This annotation comes from
where the N-terminal region is involved in this activity [
]. This entry represents an RNA binding domain in the NFACT (NEMF, FbpA, Caliban, and Tae2) proteins. This domain is found in two eukaryotic gene contexts: fused to the NFACT-N and NFACT-C domains in the NFACT protein involved in the ribosomal quality control pathway which contributes to CAT-tailing, and as a standalone domain [,
,
]. Additionally this domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important.
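The conserved motif D/E-X-W/Y-X-H can be turned into a searchable pattern mechanically. The sketch below assumes the usual reading of this notation: dashes separate positions, a slash separates alternative residues at one position, and X means any residue. The helper name `motif_to_regex` and the test sequence are hypothetical.

```python
import re

def motif_to_regex(motif):
    """Convert a dash-separated motif such as 'D/E-X-W/Y-X-H' to a regex.

    Assumed notation: '-' separates positions, '/' separates alternative
    residues at a position, and 'X' matches any single residue.
    """
    parts = []
    for element in motif.split("-"):
        if element.upper() == "X":
            parts.append(".")
        elif "/" in element:
            parts.append("[" + "".join(element.split("/")) + "]")
        else:
            parts.append(element)
    return "".join(parts)

pattern = motif_to_regex("D/E-X-W/Y-X-H")

# Made-up sequence, for illustration only.
match = re.search(pattern, "AAEGWKHAA")
```

Here `pattern` is the string `[DE].[WY].H`, which can be passed to any regular-expression search over one-letter protein sequences.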
The bacterial DnaA protein [
,
,
] plays an important role in initiating and regulating chromosomal replication. DnaA is an ATP- and DNA-binding protein. It binds specifically to 9 bp nucleotide repeats known as dnaA boxes which are found in the chromosome origin of replication (oriC). DnaA contains two conserved regions: the first is located in the N-terminal half and corresponds to the ATP-binding domain, the second is located in the C-terminal half and could be involved in DNA-binding. The protein may also bind the RNA polymerase beta subunit, the dnaB and dnaZ proteins, and the groE gene products (chaperonins) [
].
Proteins in this entry belong to the Formate-Nitrite Transporter (FNT) family (TC 2.A.44), including the nitrite transport protein NirC and formate channel FocA. The prokaryotic proteins of the FNT family function in the transport of the structurally related compounds, formate and nitrite [
,
]. Structures from NirC and FocA showed that they have a pentameric architecture with structural similarity to aquaporins and glyceroporins, comprising a right-handed bundle of 6 transmembrane (TM) α-helices []. The homologous yeast protein may function as a short chain aliphatic carboxylate H+ symporter, transporting formate, acetate and propionate, and functioning primarily as an acetate uptake permease.
In a number of species, including Escherichia coli, the histidine biosynthetic enzymes imidazole glycerol phosphate dehydratase and histidinol phosphatase are found together in the bifunctional protein HisB. This family represents a protein closely related to the histidinol phosphatase domain of HisB. The protein is found both in Helicobacter pylori, for which the histidine biosynthetic pathway appears to be absent, and in species that also have a bifunctional HisB protein. Members of this family have been characterised as D,D-heptose 1,7-bisphosphate phosphatase, which converts the D-glycero-beta-D-manno-heptose 1,7-bisphosphate intermediate into D-glycero-beta-D-manno-heptose 1-phosphate by removing the phosphate group at the C-7 position.
Members of this entry represent a set of proteins related to, yet architecturally different from, the activating protein for the glycine radical-containing, oxygen-sensitive ribonucleoside-triphosphate reductase (RNR, see
). Members of this entry are found paired with members of a similarly divergent set of anaerobic ribonucleoside-triphosphate reductases. Identification of these proteins as RNR activating proteins is partly from pairing with the candidate RNR and further supported by the finding that upstream of these operons are examples of a conserved regulatory element that is found in nearly all bacteria and that occurs specifically upstream of operons for all three classes of RNR genes [
].
This entry represents a family that includes bacterial 5,10-methylenetetrahydrofolate reductase (FADH). The enzyme activities methylenetetrahydrofolate reductase (
) and 5,10-methylenetetrahydrofolate reductase (FADH) (
) differ in that the former (assigned in many eukaryotes) is defined to use NADP+ as an acceptor, while the latter (assigned in many bacteria) is flexible with respect to the acceptor. Both generate 5-methyltetrahydrofolate from 5,10-methylenetetrahydrofolate. From a larger set of proteins assigned as one or the other, this family describes the subset of proteins found in prokaryotes, and currently designated 5,10-methylenetetrahydrofolate reductase (FADH) (
). This protein is an FAD-containing flavoprotein [
].
The CKK domain occurs at the C terminus of a family of proteins collectively defined as calmodulin-regulated spectrin-associated (or CAMSAP) proteins. CAMSAP proteins carry an N-terminal region that includes a CH domain, a central region including a predicted coiled-coil, and this C-terminal CKK domain which is involved in binding CAMSAP proteins to microtubules [
]. The structure of the CKK domain is a β-barrel with an associated α-helical hairpin. Characteristically, the CKK domain has a single invariant tryptophan residue within the core of the predicted β-barrel. Residues that interact with this Trp to form part of this core are also highly conserved.
These radical SAM domain proteins are predicted peptide maturases, similar to PqqE, AlbA, the mycofactocin radical SAM maturase, and many others that share the peptide modification radical SAM protein C-terminal additional 4Fe4S-binding domain. Members co-occur with a protein of unknown function that may be a chaperone or immunity protein and with a peptide that may have twelve or more cysteines occurring regularly spaced every fourth residue. These Cys residues tend to be flanked by residues with small side chains that provide minimal steric hindrance to crosslink formation by the radical SAM enzyme as in the subtilosin A system [
].
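The regular cysteine spacing described above (a Cys at every fourth position, twelve or more times) can be checked on a candidate precursor peptide. The following is a minimal sketch; the function name and the example peptide are hypothetical and were made up for illustration, not taken from a characterised system.

```python
def longest_cys_run(seq, spacing=4):
    """Length of the longest series of Cys residues spaced exactly `spacing` apart."""
    run = {}   # position -> length of the periodic Cys series ending at that position
    best = 0
    for i, aa in enumerate(seq.upper()):
        if aa == "C":
            # Extend the series that ended `spacing` residues earlier, if any.
            run[i] = run.get(i - spacing, 0) + 1
            best = max(best, run[i])
    return best

# Made-up peptide: twelve Cys residues, each followed by three small residues.
peptide = "MS" + "CGAS" * 12 + "K"
print(longest_cys_run(peptide))
```

A value of twelve or more would be consistent with the peptide pattern described for this family, although the spacing in real precursor peptides may be less strict than this check assumes.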
Proteins with this domain are periplasmic binding proteins involved in siderophore-mediated iron uptake in some eubacterial species, such as FatB in Vibrio anguillarum [
,
] and CeuE in Campylobacter coli []. FatB is encoded in Vibrio anguillarum plasmid pJM1, which harbours the genes for the biosynthesis of the siderophore anguibactin and the ferric anguibactin transport proteins FatD, C, B and A [,
]. FatB is a periplasmic lipoprotein and may not be essential for ferric anguibactin transport [] as Vibrio anguillarum plasmid pJM1 seems to carry two ABC transporter systems participating in siderophore transport [].
This entry represents the UPF0234 family of uncharacterised proteins, which includes YajQ. In Pseudomonas syringae, YajQ functions as a host protein involved in the temporal control of bacteriophage Phi6 gene transcription. It has been shown to bind to the phage's major structural core protein P1, most likely activating transcription by acting indirectly on the RNA polymerase. YajQ may remain bound to the phage particles throughout the infection period [
,
]. Earlier, YajQ was characterized as a putative nucleic acid-binding protein based on the similarity of its (ferredoxin-like) three-dimensional topology with that of RNP-like RNA-binding domains [,
].
Cren7 is a chromatin protein found in Crenarchaeota and has a higher affinity for double-stranded DNA than for single-stranded DNA. The protein constrains negative DNA supercoils and is associated with genomic DNA in vivo. Cren7 interacts with duplex DNA through a β-sheet and a long flexible loop. It binds double-stranded DNA without sequence specificity [
]. There is approximately 1 Cren7 molecule for 12 bp of DNA. The function of Cren7 has not been completely determined but it is thought that the protein may have a role similar to that of archaeal proteins in Euryarchaea [].
This entry represents outer surface proteins (Osp) from the Borrelia spp. spirochete [
]. The superfamily includes OspE, OspF, and OspEF-related proteins (Erp) []. These proteins are encoded on different circular plasmids in the Borrelia genome. Borrelia burgdorferi spirochetes, which cause Lyme borreliosis, survive for a long time in human serum due to their ability to successfully evade the complement system, an important arm of innate immunity. The outer surface protein E (OspE) of B. burgdorferi is required for this, since it recruits complement regulator factor H (FH) onto the bacterial surface to evade complement-mediated cell lysis [].
This entry represents the RNA recognition motif 1 (RRM1) of Mei2-like proteins. Mei2-like proteins represent an ancient family of eukaryotic RNA-binding proteins [
]. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage []. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis [,
]. Members of the Mei2-like family contain three RNA recognition motifs (RRMs). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and it is highly conserved between plants and fungi [].
This entry represents the RNA recognition motif 1 (RRM1) of RBM15. RNA-binding motif protein 15 (RBM15), also termed one-twenty two protein 1 (OTT1), is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE [
]. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC) []. RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), and a C-terminal SPOC (Spen paralogue and orthologue C-terminal) domain.
This entry represents the RNA recognition motif 2 (RRM2) of RBM15. RNA-binding motif protein 15 (RBM15), also termed one-twenty two protein 1 (OTT1), is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE [
]. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC) []. RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), and a C-terminal SPOC (Spen paralogue and orthologue C-terminal) domain.
This entry represents the RNA recognition motif 3 (RRM3) of RBM15. RNA-binding motif protein 15 (RBM15), also termed one-twenty two protein 1 (OTT1), is a novel mRNA export factor and component of the NXF1 pathway. It binds to NXF1 and serves as receptor for the RNA export element RTE [
]. It also possesses mRNA export activity and can facilitate the access of DEAD-box protein DBP5 to mRNA at the nuclear pore complex (NPC) []. RBM15 belongs to the Spen (split end) protein family, which contains three N-terminal RNA recognition motifs (RRMs), and a C-terminal SPOC (Spen paralogue and orthologue C-terminal) domain.
This entry represents the N-terminal HTH (helix-turn-helix) domain of YfmP and related proteins. YfmP is a transcription regulator that regulates the multidrug efflux protein, YfmO, and indirectly regulates the expression of the Bacillus subtilis copZA operon encoding a metallochaperone, CopZ, and a CPx-type ATPase efflux protein, CopA [
]. These proteins belong to the MerR superfamily of transcription regulators that promote expression of several stress regulon genes by reconfiguring the spacer between the -35 and -10 promoter elements. Their conserved N-terminal domains contain predicted winged HTH motifs that mediate DNA binding, while the dissimilar C-terminal domains bind specific coactivator molecules [].
The region featured in this family is found repeated in a number of plant proteins, some of which are expressed specifically in nodules formed during symbiotic interactions with certain bacterial species []. Some of these proteins are also termed glycine-rich proteins (GRPs), due to the presence of a glycine-rich C-terminal region in their structures []. Bacterial infection is required for the induction of nodule-specific GRP genes, and it is thought that nodule-specific GRPs may play non-redundant roles required at specific stages of nodule development []. Some members of this group of proteins may be cytosolic, whereas others are thought to be membrane-associated [].
STKs (serine/threonine-protein kinases) catalyse the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TTBK is a neuron-specific kinase that phosphorylates the microtubule-associated protein tau and promotes its aggregation. Higher vertebrates contain two TTBK proteins, TTBK1 and TTBK2, both of which have been implicated in neurodegeneration. Genetic variations in TTBK1 are linked to Alzheimer's disease (AD). Hyperphosphorylated tau is a major component of paired helical filaments that accumulate in the brain of AD patients [
]. Studies in transgenic mice show that TTBK1 is involved in the phosphorylation-dependent pathogenic aggregation of tau [
,
].
The major outer membrane protein of Chlamydia contains four symmetrically spaced variable domains (VDs I to IV). This protein maintains the structural rigidity of the outer membrane and facilitates porin formation, permitting diffusion of solutes through the intracellular reticulate body membrane. It is believed to play a role in pathogenesis and possibly adhesion. Along with the lipopolysaccharide, the major outer membrane protein (MOMP) makes up the surface of the elementary body cell. Disulphide bond interactions within and between MOMP molecules and other components form high molecular weight oligomers. The MOMP is the protein used to determine the different serotypes.
Vinculin is a eukaryotic protein that appears to be involved in the attachment of actin-based microfilaments to the plasma membrane []. It also interacts with other structural proteins, such as talin [] and alpha-actinin. The protein is located on the cytoplasmic side of focal contacts or adhesion plaques. Vinculin is a large protein (~1000 residues) that contains an acidic N-terminal domain, separated from a smaller, basic C-terminal domain by a 50-residue proline-rich region []. The central part of the N-terminal domain consists of a variable number of repeats of a 110-residue domain, one of which is lacking in nematode vinculin [].
A small domain of the E2 subunit of 2-oxo-acid dehydrogenases that is responsible for the binding of the E3 subunit. Proteins containing this domain include the branched-chain alpha-keto acid dehydrogenase complex of bacteria, which catalyses the overall conversion of alpha-keto acids to acyl-CoA and carbon dioxide; and the E3 binding protein of eukaryotic pyruvate dehydrogenase [
]. The structure of the E3-binding domain has three helices arranged in a closed, up-and-down bundle with a right-handed twist [
]. This structural arrangement is also found at the C terminus of nucleoid-associated protein Lsr2 and protein Ku, in which it is a DNA-binding domain [].
Heterogeneous nuclear ribonucleoproteins (hnRNPs) bind directly to nascent RNA polymerase II transcripts and play an important role in both transcript-specific packaging and alternative splicing of pre-mRNAs [
]. hnRNP M proteins are an abundant group of hnRNPs that have been shown to bind avidly to poly(G) and poly(U) RNA homopolymers []. hnRNP M family members are able to induce exon skipping and promote exon inclusion, suggesting that the proteins may broadly contribute to the fidelity of splice site recognition and alternative splicing regulation [
]. This entry represents the N-terminal PY nuclear localisation signal of heterogeneous nuclear ribonucleoprotein M [
].
CpG-binding protein is a transcriptional activator that exhibits a DNA binding specificity for unmethylated CpG motifs [
]. CpG-binding protein comprises PHD1, acidic, basic, coiled-coil, and PHD2 domains [,
]. This protein contains three cysteine-rich domains identified at the N-terminal, central region, and C-terminal of the protein, where the central cysteine-rich domain is located within the DNA-binding domain, found between PHD1 and the acidic domains [,
]. This domain is found in eukaryotes, and is approximately 240 amino acids in length. It is found at the C terminus of the CpG-binding protein, in the region containing the coiled-coil and PHD2 domains.
Tyrosine-protein kinase Fgr belongs to the SRC family of the Tyr protein kinases. It is a non-receptor tyrosine-protein kinase that transmits signals from cell surface receptors devoid of kinase activity and contributes to the regulation of immune responses, including neutrophil, monocyte, macrophage and mast cell functions, cytoskeleton remodeling in response to extracellular stimuli, phagocytosis, cell adhesion and migration [
,
,
]. It contains a protein kinase domain, an SH2 domain and an SH3 domain. Fgr interacts with tyrosine phosphorylated SYK, FLT3 and HCLS1 via its SH2 domain []. This entry represents the SH2 domain of Fgr.
NusA, or N utilisation substance protein A, is a bacterial transcription termination factor. It binds to RNA polymerase alpha subunit and promotes termination at certain RNA hairpin structures. It is named for the interaction in Escherichia coli of Bacteriophage lambda antitermination protein N with the N-utilisation substance, consisting of NusA, NusB, NusE (ribosomal protein S10), and NusG [
]. This entry represents an acidic 50-residue region found in two copies toward the C terminus of most proteobacterial NusA proteins, spaced about 26 residues apart. Analogous C-terminal extensions in some other bacterial lineages lack apparent homology but appear similarly acidic.
The region is found towards the N terminus of a number of adaptor proteins that interact with Abl-family tyrosine kinases [
]. More specifically, it is termed the homeo-domain homologous region (HHR), as it is similar to the DNA-binding region of homeo-domain proteins []. Other homeo-domain proteins have been implicated in specifying positional information during embryonic development, and in the regulation of the expression of cell-type specific genes []. The Abl-interactor proteins are thought to coordinate the cytoplasmic and nuclear functions of the Abl-family kinases, and seem to be involved in cytoskeletal reorganisation, but their precise role remains unclear [].
Lipoic acid is an organosulphur compound that is an essential coenzyme in several multienzyme complexes, including pyruvate dehydrogenase, 2-oxoglutarate dehydrogenase and the glycine cleavage system. Many Gram-positive bacteria, such as Bacillus subtilis, require three proteins for lipoic acid cofactor biosynthesis: LipA, LipL and LipM [
,
]. LipM is a lipoate:protein ligase that transfers an octanoyl moiety from acyl-carrier protein to the GcvH protein of the glycine cleavage system. LipL, an octanoyltransferase, then transfers this moiety from GcvH to other enzyme complexes. LipA inserts the sulphur group to form the active lipoate cofactor. This entry represents the LipM component of this system.
Adenoviruses are responsible for diseases such as pneumonia, cystitis, conjunctivitis and diarrhoea, all
of which can be fatal to patients who are immunocompromised []. Viral infection commences with recognition of host cell receptors by means of specialised proteins on viral surfaces. Specific attachment
of adenovirus is achieved through interactions between host-cell receptors and the adenovirus fibre protein and is mediated by the globular carboxy-terminal domain of the adenovirus fibre protein, rather than the
'shaft' region represented by this family. The alignment of this family contains two copies of a fifteen-residue repeat found in the 'shaft' region of adenoviral fibre proteins.
The arfaptin homology (AH) domain is a protein domain found in a range of proteins, including arfaptins, protein kinase C-binding protein PICK1 [
] and mammalian 69kDa islet cell autoantigen (ICA69) []. The AH domain of arfaptin has been shown to dimerise and to bind Arf and Rho family GTPases [,
], including ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The AH domain consists of three α-helices arranged as an extended antiparallel α-helical bundle. Two arfaptin AH domains associate to form a highly elongated, crescent-shaped dimer [
,
].
SARAF is an endoplasmic reticulum (ER) membrane resident protein that serves as a negative regulator of store-operated Ca2+ entry (SOCE) and is involved in protecting cells from Ca2+ overfilling. It is a single-pass ER membrane protein whose cytosolic-facing domain is responsible for activity and whose luminal-facing domain carries out a regulatory function in conjunction with STIM, another single-pass ER membrane protein, which detects changes in ER Ca2+ levels through its EF-hand, a conserved Ca2+-binding domain. STIM is the major target for SARAF regulation, and thus SARAF negatively regulates the store-operated entry of calcium into cells, protecting them from overfilling [
].
This entry represents the RNA recognition motif 3 (RRM3) of Mei2-like proteins. Mei2-like proteins represent an ancient family of eukaryotic RNA-binding proteins []. Their corresponding Mei2-like genes appear to have arisen early in eukaryote evolution, been lost from some lineages such as Saccharomyces cerevisiae and metazoans, and diversified in the plant lineage []. The plant Mei2-like genes may function in cell fate specification during development, rather than as stimulators of meiosis [,
]. Members of the Mei2-like family contain three RNA recognition motifs (RRMs). The C-terminal RRM (RRM3) is unique to Mei2-like proteins and it is highly conserved between plants and fungi [].
FliO is an essential component of the flagellum-specific protein export apparatus [
]. It is an integral membrane protein that acts as an assembly chaperone for the flagellar protein FliP. It is crucial, although not strictly required, for efficient FliP assembly and, thus, for flagellar assembly and function []. FliO is a short protein found in flagellar biosynthesis operons, which contains a highly hydrophobic N-terminal sequence generally followed by two basic amino acids. This region is reminiscent of, but distinct from, the twin-arginine translocation signal sequence. Some instances of this gene have been named "FliZ", but phylogenetic tree building supports a single FliO family.
Hematopoietic cell signal transducer (HCST, also known as DAP10) is a transmembrane adaptor that associates with an activation receptor, NKG2D, which is found on NK and subsets of T cells. The ligands for this receptor include MHC class I chain-related (MIC) protein A and protein B and UL16-binding proteins [
]. In activated mouse natural killer (NK) cells, the NKG2D receptor associates with two intracellular adaptors, DAP10 and DAP12, which trigger phosphatidyl inositol 3 kinase (PI3K) and Syk family protein tyrosine kinases, respectively. It has been suggested that the DAP10-PI3K pathway is sufficient to initiate NKG2D-mediated killing of target cells [].
GREB1 (gene regulated in breast cancer 1 protein) is an ESR1 (estrogen receptor 1)-upregulated protein that may regulate estrogen action and is associated with estrogen-stimulated cell proliferation [
]. It acts as a regulator of hormone-dependent cancer growth in breast, ovarian and prostate cancers [,
,
,
]. This protein may be a target to inhibit tumour-promoting pathways both downstream and independent of ESR1 as a possible treatment strategy. This entry represents the C-terminal domain of GREB1 and similar proteins, which may adopt a twisted α/β structure comprising parallel β-sheets connected by α-helices that surround the sheet, as revealed from structure predictions.
GREB1 (gene regulated in breast cancer 1 protein) is an ESR1 (estrogen receptor 1)-upregulated protein that may regulate estrogen action and is associated with estrogen-stimulated cell proliferation [
]. It acts as a regulator of hormone-dependent cancer growth in breast, ovarian and prostate cancers [,
,
,
]. This protein may be a target to inhibit tumour-promoting pathways both downstream and independent of ESR1 as a possible treatment strategy. This entry represents the N-terminal domain of GREB1 and similar proteins, whose function is not yet known. Structure predictions suggest that it may have an α/β fold.
This entry represents the DNA-packaging protein Gp16 found in Enterobacteria phage T4 (Bacteriophage T4). Double-stranded DNA packaging in bacteriophages is driven by a molecular motor. The phage T4 motor is composed of the small terminase protein, Gp16 (18 kDa), the large terminase protein, Gp17 (70 kDa), and the dodecameric portal protein Gp20 (61 kDa). Gp16 is involved in the recognition of the viral DNA substrate, the very first step in the DNA packaging pathway, and stimulates the ATPase and packaging activities associated with Gp17 [
]. Gp16 modulates the activity of Gp17 [] and is required to translocate phage T4 DNA into the head [].
This superfamily represents a domain found in DCN1-like proteins. Proteins of the DCN family may contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes, which are multi-protein complexes required for polyubiquitination and subsequent degradation of target proteins by the 26S proteasome [
,
]. The structure of this domain is composed entirely of alpha helices [
,
]. It has been referred to as potentiating neddylation domain (PONY) and can be found in association with an N-terminal UBA domain. The PONY domain contains a cullin-binding surface within its C-terminal region and is sufficient to promote neddylation [,
].
This entry represents the C-terminal domain of the O-linked beta-N-acetylglucosamine transferase (OGT, also known as UDP-N-acetylglucosamine--peptide N-acetylglucosaminyltransferase), which catalyses the transfer of a single GlcNAc to the Ser or Thr of nucleocytoplasmic proteins [
]. OGTs have two known domains: the N-terminal tetratricopeptide repeat domain and the C-terminal glycosyltransferase domain []. Deletions of the C-terminal domain result in a complete loss of the enzyme activity []. In animals, OGT is an essential protein that modifies transcription factors, nuclear pore proteins, kinases, and many other proteins. Abnormalities in OGT activities have been associated with type 2 diabetes [
].
This entry represents CbsA, a novel, highly glycosylated, mono-haem cytochrome b558/566 found in Sulfolobus acidocaldarius and several other members of the Sulfolobales, a branch of the Crenarchaeota [
,
]. Encoded at the same locus as CbsA are: CbsB, a hydrophobic protein, SoxL, a Rieske iron-sulphur protein, SoxN, a predicted membrane-bound b-type cytochrome b, and OdsN, a protein of unknown function. Transcriptional studies suggest that these proteins may form a complex analogous to the cytochrome bc1 complex. The redox-active subunits of this complex would consist of CbsA, SoxL and SoxN, while CbsB and OdsN would be additional non-redox-active subunits.
Sorting nexins (SNXs) are Phox homology (PX) domain-containing proteins that are involved in regulating membrane traffic and protein sorting in the endosomal system. SNXs differ from each other in their lipid-binding specificity, subcellular localization and specific function in the endocytic pathway. A subset of SNXs also contain BAR domains [
]. The PX-BAR structural unit determines the specific membrane targeting of SNXs []. Vps5 is the yeast counterpart of human SNX1 and is part of the retromer complex, which functions in the endosome-to-Golgi retrieval of vacuolar protein sorting receptor Vps10, as well as other late-Golgi proteins [
,
,
].
Contact-dependent growth inhibition (CDI) systems encode polymorphic toxin/immunity proteins that mediate competition between neighbouring bacterial cells. CDI is mediated by the CdiB/CdiA family of two-partner secretion proteins. This domain represents the C-terminal of CdiA proteins (CdiA-CT), which contains the CDI toxin activity. The C-terminal nuclease domain forms a stable complex with its cognate immunity protein. It is also sufficient to inhibit growth when expressed in E. coli cells, consolidating the idea that they constitute the functional CDI toxin. The CdiA-CT C-terminal domains are structurally similar to type IIS restriction endonucleases suggesting that the toxins have metal-dependent DNase activity [
].
This entry represents a family of short 14 kDa proteins from Pseudomonas. The structure of
a secreted protein has been solved and deposited as PDB:3npd. It comprises one structural domain with five β-strands and five α-helices. Various comparative structural prediction methods plus its genomic location point to the protein forming a functional dimer with its adjacent genomic partner,
. Together these might be regulated by the other product from the PotABCD operon, namely the putrescine-binding periplasmic protein
, which has been implicated in quorum sensing. QSregVF is up-regulated during quorum sensing and is predicted to be a virulence factor [
].
This domain superfamily is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins and other tubulin-like proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings
in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.
Many bacterial species swim actively by means of flagella. The flagellar organelle is made of three parts: the basal body, the hook and the filament. The basal body consists of four rings (L, P, S, and M) mounted on a central rod [
]. In Salmonella typhimurium and related organisms the rod has been shown to consist of four different, yet evolutionarily related proteins: in the distal portion of the rod there are about 26 subunits of protein flgG and in the proximal portion there are about six subunits each of proteins flgB, flgC, and flgF. These four proteins contain a highly conserved asparagine-rich domain at their N terminus.
This entry includes the poxvirus families F1 and C10. C10 proteins are apoptosis regulators, which function to modulate the apoptotic cascades and thereby favour productive viral replication. One of these, M11L inhibits mitochondrial-dependent apoptosis by mimicking and competing with host proteins for the binding and blocking of Bak and Bax, two executioner proteins [
]. The poxvirus F1 family members are related to Vaccinia virus protein F1L, which interacts with and inhibits NLR-mediated interleukin-1 beta/IL1B production in infected cells. F1L suppresses mitochondrial-dependent apoptosis by binding to the BH3 domain of host BAK and prevents BAK from binding active BAX [
,
].
Proteins that transport heavy metals in micro-organisms and eukaryotes share similarities in their sequences and structures. These proteins provide an important focus for research, some being involved in bacterial resistance to toxic metals, such as lead and cadmium, while others are involved in inherited human syndromes, such as Wilson's and Menkes diseases [
]. A conserved 30-residue domain has been found in a number of these heavy metal transport or detoxification proteins [
]. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding. This sub-domain is found in copper-binding proteins.
The CDC48 N-terminal domain is a protein domain found in AAA ATPases including cell division protein 48 (CDC48), VCP-like ATPase (VAT) and N-ethylmaleimide sensitive fusion protein. It is a substrate recognition domain which binds polypeptides, prevents protein aggregation, and catalyses refolding of permissive substrates. It is composed of two equally sized subdomains. The amino-terminal subdomain forms a double-psi β-barrel whose pseudo-twofold symmetry is mirrored by an internal sequence repeat of 42 residues. The carboxy-terminal subdomain forms a novel six-stranded β-clamp fold [
]. Together these subdomains form a kidney-shaped structure. This entry represents the carboxy-terminal subdomain.
This domain is likely to have a protease inhibitory function. The name is derived from peptidase (M4) and YpeB of Bacillus subtilis [
]. This domain is found in the propeptide of members of the MEROPS peptidase family M4 (clan MA(E)), which contains the thermostable thermolysins (
), and related thermolabile neutral proteases (bacillolysins) (
) from various species of Bacillus. It is also in many non-peptidase proteins, including Bacillus subtilis YpeB protein - a regulator of SleB spore cortex lytic enzyme - and a large number of eubacterial and archaeal cell wall-associated and secreted proteins which are mostly annotated as 'hypothetical protein'.