This family consists of Arfaptin-1, Arfaptin-2, and protein kinase C-binding protein 1 (PICK1). They all contain a Bin/amphiphysin/Rvs (BAR) domain. BAR-domain-containing proteins are key players in membrane dynamics, as their crescent shape allows them to sense or generate curvature on lipid bilayers. Arfaptin-1 and Arfaptin-2 have been shown to interact with Arf GTPases and Arf-like 1 (Arl1). They are recruited to trans-Golgi membranes through interaction with Arl1. In neurons, PICK1 forms homodimers and serves as one of the scaffolding proteins that interact with AMPA receptors and regulate their trafficking in synaptic plasticity. In pancreatic beta cells, it forms heteromeric BAR-domain complexes with another cytosolic lipid-binding protein, ICA69. Together they are key regulators of the formation and maturation of insulin granules.
This entry represents a group of fungal proteins that contain an HTH APSES-type DNA-binding domain. Proteins in this entry include: EFG1 (enhanced filamentous growth protein 1) from Candida albicans, a transcriptional regulator of the switch between two heritable states, the white and opaque states; the transcriptional activator Phd1 and its paralogue Sok2 from Saccharomyces cerevisiae, where Phd1 enhances pseudohyphal growth and Sok2 plays a general regulatory role in the cyclic AMP-dependent protein kinase (PKA) signal transduction pathway by regulating the expression of genes important in growth and development; and the cell pattern formation-associated protein StuA from Emericella nidulans, which is required for the orderly differentiation and spatial organisation of cell types of the conidiophore.
This entry represents the C-terminal domain of the RNA-directed RNA polymerase (RdRp) found in many positive-strand RNA eukaryotic viruses. It is part of the genome polyprotein, which contains other polypeptides such as coat proteins VP1 to VP4, core proteins P2A to P2C and P3A, genome-linked protein VPG and picornain 3C.
Structural studies indicate that these proteins form the "right hand" structure found in all oligonucleotide polymerases, containing thumb, finger and palm domains, as well as the additional bridging finger and thumb domains unique to RNA-directed RNA polymerases. Remdesivir, a treatment approved for Covid-19, interacts directly with this region of the RdRp (NSP12) from SARS-CoV-2, which accounts for its mechanism of action via delayed chain termination.
This entry represents capsid coat proteins from a variety of RNA bacteriophages, including Bacteriophage MS2, Bacteriophage GA, Bacteriophage fr, Bacteriophage Q-beta and Pseudomonas phage PP7. These capsid coat proteins share a similar structure, consisting of a 6-stranded β-sheet followed by two helices. Capsid proteins form the bacteriophage coat that encapsidates the viral RNA. In Bacteriophage MS2, 180 copies of this protein form the virion shell and control two distinct processes: sequence-specific RNA encapsidation, and repression of replicase translation by binding to a 19-nucleotide RNA stem-loop structure containing the initiation codon of the replicase gene. The binding of a coat protein dimer to this hairpin shuts off synthesis of the viral replicase, switching the viral replication cycle to virion assembly rather than continued replication.
Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The Gut family consists only of glucitol-specific permeases, but these occur in both Gram-negative and Gram-positive bacteria. The Escherichia coli glucitol permease consists of a IIA protein, a IIC protein and a IIBC protein. This entry represents the N-terminal conserved region of the IIBC component.
The HECT (Homologous to the E6-AP Carboxyl Terminus) domain is a motif of about 350 amino acids that has been identified in proteins that all belong to a particular E3 ubiquitin-protein ligase family. HECT domain-containing proteins accept ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, then transfer it to lysine side chains of target proteins and transfer additional ubiquitin molecules to the ends of growing ubiquitin chains. The site of ubiquitin thioester formation is a conserved cysteine residue located in the last 32-36 amino acids of the HECT domain. The amino-terminal part of the HECT domain has been implicated in E2 binding. Once linked to ubiquitin, the target proteins are degraded by the 26S proteasome.
There is a unique sequence domain at the C terminus of all known 4.1 proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24 kDa domain produced by chymotryptic digestion of erythrocyte 4.1 (4.1R). The CTD is thought to represent an independent folding structure that has gained function since the divergence of vertebrates from invertebrates.
Chromosomal replication control, initiator DnaA, conserved site
Type:
Conserved_site
Description:
The bacterial DnaA protein plays an important role in initiating and regulating chromosomal replication. DnaA is an ATP- and DNA-binding protein. It binds specifically to 9 bp nucleotide repeats known as DnaA boxes, which are found in the chromosomal origin of replication (oriC). DnaA is a protein of about 50 kDa that contains two conserved regions: the first is located in the N-terminal half and corresponds to the ATP-binding domain; the second is located in the C-terminal half and could be involved in DNA binding. The protein may also bind the RNA polymerase beta subunit, the DnaB and DnaZ proteins, and the groE gene products (chaperonins). The signature pattern in this entry is located in the most conserved part of the putative DNA-binding domain.
Stonin-2 is a stonin family protein involved in synaptic vesicle recycling. It is an adaptor-like protein that serves as a linker between the endocytic proteins AP-2 and Eps15 and the calcium-sensing synaptic vesicle (SV) protein synaptotagmin 1. Stonin family members are conserved clathrin adaptor complex AP-2mu-related factors that may act as cargo-specific sorting adaptors in endocytosis. They are conserved from C. elegans to humans, but are not found in prokaryotes or yeasts. Their protein structure consists of an N-terminal proline- and serine-rich domain, a central stonin homology domain (SHD), and a C-terminal Mu homology domain that may adopt a conformation similar to the C-terminal sorting signal-binding domain of AP-2mu (C-mu2). This entry represents the Mu homology domain (MHD) of Stonin-2.
Intermediate filaments (IF) are proteins that are primordial components of the cytoskeleton and the nuclear envelope. All IF proteins are structurally similar in that they consist of: a central rod domain of some 300 to 350 residues arranged in coiled-coil α-helices, with at least two short characteristic interruptions; an N-terminal non-helical domain (head) of variable length; and a C-terminal domain (tail) that is also non-helical and shows extreme length variation between different IF proteins. While IF proteins are evolutionarily and structurally related, they have limited sequence homology except in several regions of the rod domain. This entry represents a family of cytoplasmic intermediate filament proteins of invertebrates, and includes both A and B types.
Phosphotransferase, sorbitol phosphotransferase enzyme II
Type:
Family
Description:
Bacterial PTS transporters transport and concomitantly phosphorylate their sugar substrates, and typically consist of multiple subunits or protein domains. The Man family is unique in several respects among PTS permease families. It is the only PTS family in which members possess a IID protein. It is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteyl residue. Its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. The Gut family consists only of glucitol-specific permeases, but these occur in both Gram-negative and Gram-positive bacteria. The Escherichia coli glucitol permease consists of a IIA protein, a IIC protein and a IIBC protein. This family is specific for the IIBC component.
This entry represents the SET domain of SET and MYND domain-containing protein 5 (SMYD5, also termed protein NN8-4AG or retinoic acid-induced protein 15). SMYD5 functions as a histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions. It plays an important role in chromosome integrity by regulating heterochromatin and repressing endogenous repetitive DNA elements during differentiation. In zebrafish embryogenesis, it plays pivotal roles in both primitive and definitive hematopoiesis. The SMYD family consists of five members, SMYD1 to SMYD5. They contain two highly conserved structural and functional domains, the SET and MYND domains. The SET domain is involved in lysine methylation, while the MYND domain is involved in protein-protein interaction. SMYD proteins are essential in several mammalian developmental pathways.
This entry represents a coiled-coil region close to the C terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is the PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C terminus of coiled-coil proteins from Drosophila and Schizosaccharomyces pombe (fission yeast), and in the Drosophila protein it is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown, but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain, indicating that this protein at least is likely to contribute to centrosome assembly.
This entry represents Jiv (J-domain protein interacting with viral protein), a domain found in DnaJ proteins from eukaryotes. It can also be found in the N-terminal region of the pestivirus viral polypeptide. The viral protein interacts stably with non-structural (NS) protein NS2, causing a conformational change in NS2-NS3 and stimulating NS2-NS3 cleavage in trans. Cleavage of NS2-NS3 increases cytopathogenicity and consequently aids viral replication. Jiv therefore acts as a regulating cofactor for the NS2 auto-protease. The efficient release of NS3 from the viral polypeptide by Jiv is considered crucial to pestivirus cytopathogenicity. In eukaryotes, it usually lies 40 residues downstream of the DnaJ domain. However, the function of this domain in eukaryotes is still unknown.
Signal recognition particle 43kDa protein, chromodomain 3
Type:
Domain
Description:
This group of proteins includes chloroplast signal-recognition particle 43 kDa protein (CpSRP43) and similar plant proteins. In Arabidopsis, CpSRP43 (AT2G47450, CAO) is a component of the chloroplast signal recognition particle pathway that is involved in LHCP (light-harvesting chlorophyll a/b-binding protein) targeting. LHCPs are synthesised in the cytoplasm and, after import into the chloroplast, are targeted to and inserted into the thylakoid membrane. In addition, CpSRP43 functions as a chaperone, preventing aggregation of LHCP following import into the chloroplast. This entry represents chromodomain 3 of SRP43. Chromodomains appear to play a role in the functional organisation of the eukaryotic nucleus. This domain is involved in the binding of these proteins to methylated histone tails and possibly RNA.
Eukaryotic eIF-5A was initially thought to function as a translation initiation factor, based on its ability to stimulate methionyl-puromycin synthesis. However, subsequent work revealed a role for eIF-5A in translation elongation. Depletion or inactivation of eIF-5A in the yeast Saccharomyces cerevisiae (baker's yeast) resulted in the accumulation of polysomes and an increase in ribosomal transit times. Addition of recombinant eIF-5A from yeast, but not a derivative lacking hypusine, enhanced the rate of tripeptide synthesis in vitro. Moreover, inactivation of eIF-5A mimicked the effects of the eEF2 inhibitor sordarin, indicating that eIF-5A might function together with eEF2 to promote ribosomal translocation. Finally, it was shown that eIF-5A is specifically required to promote peptide-bond formation between consecutive proline residues. It has been proposed to stimulate the peptidyl-transferase activity of the ribosome and facilitate the reactivity of poor substrates such as proline. eIF-5A is also a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A is the sole protein in eukaryotes and archaea to contain the unusual amino acid hypusine (Nε-(4-amino-2-hydroxybutyl)lysine), which is an absolute functional requirement. The first step in the post-translational modification of lysine to hypusine is catalysed by the enzyme deoxyhypusine synthase, the structure of which has been reported. The archaeal IF-5A proteins have not been studied as comprehensively as their eukaryotic homologues, though the crystal structure of the Pyrobaculum aerophilum protein has been determined. Unmodified P. aerophilum IF-5A is a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule, in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding. This family also includes the Woronin body major protein Hex1, whose sequence and structure are similar to those of eukaryotic initiation factor 5A (eIF-5A), suggesting that they share a common ancestor. Woronin bodies are important for stress resistance and virulence.
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection. This entry represents the intrinsic antenna proteins CP43 (PsbC) and CP47 (PsbB) found in the reaction centre of PSII. These polypeptides bind to chlorophyll a and beta-carotene and pass the excitation energy on to the reaction centre. This entry also includes the iron-stress induced chlorophyll-binding protein CP43' (IsiA), which evolved in cyanobacteria from a PSII protein to cope with light limitations and stress conditions. Under iron-deficient growth conditions, CP43' associates with PSI to form a complex that consists of a ring of 18 or more CP43' molecules around a PSI trimer, which significantly increases the light-harvesting system of PSI. IsiA can also provide photoprotection for PSII.
Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. Most haem peroxidases follow the reaction scheme:
Fe3+ + H2O2 --> [Fe4+=O]R' (Compound I) + H2O
[Fe4+=O]R' + substrate --> [Fe4+=O]R (Compound II) + oxidised substrate
[Fe4+=O]R + substrate --> Fe3+ + H2O + oxidised substrate
In this mechanism, the enzyme reacts with one equivalent of H2O2 to give [Fe4+=O]R' (Compound I). This is a two-electron oxidation/reduction reaction in which H2O2 is reduced to water and the enzyme is oxidised. One oxidising equivalent resides on iron, giving the oxyferryl intermediate, while in many peroxidases the porphyrin (R) is oxidised to the porphyrin pi-cation radical (R'). Compound I then oxidises an organic substrate to give a substrate radical.
Haem peroxidases include two superfamilies: one found in bacteria, fungi and plants, and the second found in animals. The first can be viewed as consisting of three major classes. Class I, the intracellular peroxidases, includes: yeast cytochrome c peroxidase (CCP), a soluble protein found in the mitochondrial electron transport chain, where it probably protects against toxic peroxides; ascorbate peroxidase (AP), the main enzyme responsible for hydrogen peroxide removal in chloroplasts and cytosol of higher plants; and bacterial catalase-peroxidases, exhibiting both peroxidase and catalase activities. It is thought that catalase-peroxidase provides protection to cells under oxidative stress.
Class II consists of secretory fungal peroxidases: ligninases, or lignin peroxidases (LiPs), and manganese-dependent peroxidases (MnPs). These are monomeric glycoproteins involved in the degradation of lignin. In MnP, Mn2+ serves as the reducing substrate. Class II proteins contain four conserved disulphide bridges and two conserved calcium-binding sites.
Class III consists of the secretory plant peroxidases, which have multiple tissue-specific functions: e.g., removal of hydrogen peroxide from chloroplasts and cytosol; oxidation of toxic compounds; biosynthesis of the cell wall; defence responses towards wounding; indole-3-acetic acid (IAA) catabolism; ethylene biosynthesis; and so on. Class III proteins are also monomeric glycoproteins, containing four conserved disulphide bridges and two calcium ions, although the placement of the disulphides differs from class II enzymes. The crystal structures of a number of these proteins show that they share the same architecture: two all-alpha domains between which the haem group is embedded. This entry represents an active site found in a number of peroxidases.
This entry represents the CRIB domain. Many putative downstream effectors of the small GTPases Cdc42 and Rac contain a GTPase-binding domain (GBD), also called the p21-binding domain (PBD), which has been shown to bind specifically to the GTP-bound form of Cdc42 or Rac, with a preference for Cdc42. The most conserved region of GBD/PBD domains is the N-terminal Cdc42/Rac interactive binding motif (CRIB), which consists of about 16 amino acids with the consensus sequence I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G. Although the CRIB motif is necessary for binding to Cdc42 and Rac, it is not sufficient to give high-affinity binding. A less well conserved inhibitory switch (IS) domain, responsible for maintaining the proteins in a basal (autoinhibited) state, is located C-terminal to the CRIB motif. GBD domains can adopt related but distinct folds depending on context. Although GBD domains are largely unstructured in the free state, in the autoinhibited state the IS domain forms an N-terminal β-hairpin that immediately follows the conserved CRIB motif, and a central bundle of three α-helices. The interaction between GBD domains and their respective G proteins leads to the formation of a high-affinity complex in which unstructured regions of both the effector and the G protein become rigid. CRIB motifs from various GBD domains interact with Cdc42 in a similar manner, forming an intermolecular β-sheet with strand β-2 of Cdc42. Outside the CRIB motif, the C-terminal regions of the various GBD domains are very divergent and show variation in their mode of binding to Cdc42, perhaps determining the specificity of the interaction. Binding of Cdc42 or Rac to the GBD domain causes a dramatic conformational change, refolding part of the IS domain and unfolding the rest. Some proteins known to contain a CRIB domain are: mammalian activated Cdc42-associated kinases (ACKs), nonreceptor tyrosine kinases implicated in integrin-coupled pathways; mammalian p21-activated kinases (PAK1 to PAK4), serine/threonine kinases that modulate cytoskeletal assembly and activate MAP-kinase pathways; mammalian actin nucleation-promoting factor WAS (also known as Wiskott-Aldrich syndrome proteins, WASPs), non-kinase proteins involved in the organisation of the actin cytoskeleton; and yeast STE20 and CLA4, the homologues of mammalian PAKs, of which STE20 is involved in the mating/pheromone MAP kinase cascade.
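The CRIB consensus quoted above is written in a PROSITE-like notation, where x stands for any residue and x(2,4) for two to four of them. As a minimal sketch, assuming only the elements that appear in this consensus need to be supported, it can be translated into a regular expression and used to scan a sequence. The function names and the example peptide are hypothetical and chosen for illustration; because the motif is necessary but not sufficient for high-affinity binding, a match only flags a candidate.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a Python regex string.

    Handles only single residues, the wildcard 'x', and repeat counts
    such as x(2) or x(2,4); other PROSITE syntax is not supported here.
    """
    pieces = []
    for element in pattern.split("-"):
        match = re.fullmatch(r"([A-Zx])(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        piece = "." if residue == "x" else residue
        if low is not None:
            # x(2) -> .{2}, x(2,4) -> .{2,4}
            piece += "{" + low + ("," + high if high else "") + "}"
        pieces.append(piece)
    return "".join(pieces)

# Consensus sequence as given in the entry text.
CRIB_CONSENSUS = "I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G"
CRIB_REGEX = re.compile(prosite_to_regex(CRIB_CONSENSUS))

def find_crib_motifs(sequence: str):
    """Return (0-based start, matched text) for each consensus match."""
    return [(m.start(), m.group()) for m in CRIB_REGEX.finditer(sequence.upper())]

# Made-up peptide for illustration only; not a real protein sequence.
example = "MKTAEISAPSDFEHTIHVGFDAK"
print(prosite_to_regex(CRIB_CONSENSUS))
print(find_crib_motifs(example))
```

A strict consensus scan of this kind will miss divergent CRIB motifs, so it is better suited to a quick first pass than to annotation.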
This superfamily represents a structural domain with a closed β-barrel fold with Greek-key topology. This structure is found in: riboflavin synthase, which contains two homologous domains of this structure; the FAD-binding (N-terminal) domain of ferredoxin reductase (flavodoxin reductase), where the FAD-binding domain is coupled with an NADP-binding domain of the alpha/beta class; and the FAD-binding domain of NADPH-cytochrome P450 reductase, although this domain has an additional α-helical domain inserted into it. Riboflavin synthase catalyses the final step in the biosynthesis of vitamin B2, namely the dismutation of two molecules of 6,7-dimethyl-8-ribityllumazine to yield riboflavin and 4-(1-D-ribitylamino)-5-amino-2,6-dihydroxypyrimidine (which is recycled). Flavins can act as primary and secondary emitters in bacterial luminescence. Lumazine proteins are involved in the bioluminescence of certain marine bacteria. These proteins are catalytically inactive, but they resemble riboflavin synthase. Lumazine protein is non-covalently bound to the fluorophore 6,7-dimethyl-8-ribityllumazine, which is the substrate of riboflavin synthase. Ferredoxin reductase is an FAD-containing oxidoreductase that transports electrons between flavodoxin or ferredoxin and NADPH. In Escherichia coli, ferredoxin reductase together with flavodoxin is involved in the reductive activation of three enzymes: cobalamin-dependent methionine synthase, pyruvate formate lyase and anaerobic ribonucleotide reductase. An additional function of the oxidoreductase appears to be to protect the bacteria against oxygen radicals. The β-barrel domain found in ferredoxin reductase is similar to that found in: NAD(P)H:flavin oxidoreductase, the core domain of nitrate reductase, cytochrome b5 reductase, phthalate dioxygenase reductase (which contains an additional 2Fe-2S ferredoxin domain), benzoate dioxygenase reductase, the PyrK subunit of dihydroorotate dehydrogenase B, the central domain of flavohaemoglobin (which contains an additional globin domain), and methane monooxygenase component C (MmoC). Microsomal NADPH-cytochrome P450 reductase (CPR) (NADPH-haemoprotein reductase) is a membrane-bound protein that contains both FAD and FMN. CPR catalyses electron transfer from NADPH to all known microsomal cytochromes P450. The β-barrel domain found in NADPH-cytochrome P450 reductase is similar to that found in sulphite reductase flavoprotein and in the FAD/NADP+ domain of neuronal nitric-oxide synthase.
Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. Most haem peroxidases follow the reaction scheme:
Fe3+ + H2O2 --> [Fe4+=O]R' (Compound I) + H2O
[Fe4+=O]R' + substrate --> [Fe4+=O]R (Compound II) + oxidised substrate
[Fe4+=O]R + substrate --> Fe3+ + H2O + oxidised substrate
In this mechanism, the enzyme reacts with one equivalent of H2O2 to give [Fe4+=O]R' (Compound I). This is a two-electron oxidation/reduction reaction in which H2O2 is reduced to water and the enzyme is oxidised. One oxidising equivalent resides on iron, giving the oxyferryl intermediate, while in many peroxidases the porphyrin (R) is oxidised to the porphyrin pi-cation radical (R'). Compound I then oxidises an organic substrate to give a substrate radical.
Haem peroxidases include two superfamilies: one found in bacteria, fungi and plants, and the second found in animals. The first can be viewed as consisting of three major classes. Class I, the intracellular peroxidases, includes: yeast cytochrome c peroxidase (CCP), a soluble protein found in the mitochondrial electron transport chain, where it probably protects against toxic peroxides; ascorbate peroxidase (AP), the main enzyme responsible for hydrogen peroxide removal in chloroplasts and cytosol of higher plants; and bacterial catalase-peroxidases, exhibiting both peroxidase and catalase activities. It is thought that catalase-peroxidase provides protection to cells under oxidative stress.
Class II consists of secretory fungal peroxidases: ligninases, or lignin peroxidases (LiPs), and manganese-dependent peroxidases (MnPs). These are monomeric glycoproteins involved in the degradation of lignin. In MnP, Mn2+ serves as the reducing substrate. Class II proteins contain four conserved disulphide bridges and two conserved calcium-binding sites.
Class III consists of the secretory plant peroxidases, which have multiple tissue-specific functions: e.g., removal of hydrogen peroxide from chloroplasts and cytosol; oxidation of toxic compounds; biosynthesis of the cell wall; defence responses towards wounding; indole-3-acetic acid (IAA) catabolism; ethylene biosynthesis; and so on. Class III proteins are also monomeric glycoproteins, containing four conserved disulphide bridges and two calcium ions, although the placement of the disulphides differs from class II enzymes. The crystal structures of a number of these proteins show that they share the same architecture: two all-alpha domains between which the haem group is embedded. This domain is found in Class III secretory plant peroxidases.
NADPH-dependent 7-cyano-7-deazaguanine reductase, QueF type 1
Type:
Family
Description:
Members of this group are involved in the biosynthesis of queuosine, a 7-deazaguanine-modified nucleoside found in tRNA(GUN) of Bacteria and Eukarya. QueF (YkvM) from Bacillus subtilis has been shown to catalyse the NADPH-dependent reduction of 7-cyano-7-deazaguanine to 7-aminomethyl-7-deazaguanine, a late step in the biosynthesis of queuosine. Queuosine is located in the anticodon wobble position 34 of tRNAs specific for Tyr, His, Asp, and Asn. With few exceptions (such as yeast and mycoplasma), it is widely distributed in most prokaryotes and eukaryotes. Queuosine is based on a very unusual 7-deazaguanosine core, which is further modified by addition of a cyclopentendiol ring. This group of proteins belongs to the T-fold structural superfamily and is related to GTP cyclohydrolase I (GTP-CH-I, FolE). Two major features differentiate the QueF and FolE groups. First, the strictly conserved QueF motif E-78(S/L)K(S/A)hK(L/Y)(Y/F/W)-85 (residue numbers are those of B. subtilis YkvM; h is a hydrophobic amino acid) is characteristic of the QueF family, but is not found in the FolE family. Second, four catalytically important residues in FolE, His-112, His-113, His-179 and Cys-181 (Escherichia coli FolE numbering), are absent in the QueF group. QueF-like proteins form two groups: type I proteins, exemplified by Bacillus subtilis YkvM, and type II proteins, exemplified by Escherichia coli YqcD. The type I proteins are comparable in size with bacterial and mammalian FolE, whereas the type II proteins are larger and are predicted to comprise two domains, similar to plant FolE. The discovery of oxidoreductase activity within the FolE scaffold is an intriguing example of structural and functional evolution, particularly in light of the need to bind a second organic substrate, the cofactor NADPH. The specificity of the QueF motif to the QueF family suggests that these residues might be involved in NADPH binding. Additionally, the binding of a modified base to QueF, instead of the nucleotide to FolE, in principle leaves vacant in QueF the binding site occupied by the ribosyl portion of GTP. This putative "empty" ribosyl pocket might also contribute to NADPH binding.
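The QueF motif described above, E(S/L)K(S/A)hK(L/Y)(Y/F/W), can be written as a regular expression to check whether a candidate sequence carries it. This is a minimal sketch, not a validated classifier: the entry does not say which residues count as hydrophobic (h), so the set used below is an assumption, and the function name and test peptide are hypothetical.

```python
import re

# ASSUMPTION: the source does not define the hydrophobic class "h";
# this residue set is chosen for illustration only.
HYDROPHOBIC = "AVLIMFWYC"

# E-(S/L)-K-(S/A)-h-K-(L/Y)-(Y/F/W): eight residues, matching the span
# numbered 78 to 85 in B. subtilis YkvM.
QUEF_MOTIF = re.compile(f"E[SL]K[SA][{HYDROPHOBIC}]K[LY][YFW]")

def find_quef_motif(sequence: str):
    """Return (1-based start, matched text) for each motif occurrence."""
    return [(m.start() + 1, m.group()) for m in QUEF_MOTIF.finditer(sequence.upper())]

# Made-up peptide for illustration only; not a real QueF sequence.
print(find_quef_motif("MTTELKSLKLYAA"))
```

Positions are reported 1-based to follow residue-numbering convention; a hit in a real sequence will only coincide with positions 78 to 85 if that sequence aligns with YkvM without gaps.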
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. G2A is expressed mainly in lymphocytes and its expression is up-regulated by stress and prolonged mitogenic signals. Mice lacking the receptor have been found to develop a late-onset autoimmune disease [
]. It has therefore been suggested that G2A may function as a sensor of lysophosphatidylcholine (LPC) levels at sites of inflammation and act as a negative regulator of lymphocyte growth to limit expansion of tissue-infiltrating cells and overt autoimmune disease. Activation of G2A by LPC results in an increase in intracellular calcium levels (through coupling to Gi proteins) and activation of MAP kinases. The receptor has also been shown to couple to G13 proteins, causing RhoA activation and formation of actin stress fibres.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Computational methods, including percent identity plots, hydropathy profiles and BLAST, have been used to analyse a gene-rich cluster at human chromosome 12p13 and to compare it with its syntenic region in mouse chromosome 6 [
,
,
]. Of 6 genes identified, a number were novel receptors, including GPR153 (also known as PGR1) and GPR162 (also known as GRCA) []. GPR153 is a cerebellar target of the Gli1 transcription factor, which is involved in the maintenance and proliferation of granule neuron precursor cells in the cerebellum, and like GPR162 has a noted role in food uptake and decision-making processes []. This entry represents G-protein coupled receptor 153 and G-protein coupled receptor 162.
This entry represents the CRIB domain superfamily. Many putative downstream effectors of the small GTPases Cdc42 and Rac contain a GTPase binding domain (GBD), also called p21 binding domain (PBD), which has been shown to specifically bind the GTP bound form of Cdc42 or Rac, with a preference for Cdc42 [,
]. The most conserved region of GBD/PBD domains is the N-terminal Cdc42/Rac interactive binding motif (CRIB), which consists of about 16 amino acids with the consensus sequence I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G []. Although the CRIB motif is necessary for binding to Cdc42 and Rac, it is not sufficient to give high-affinity binding [
,
]. A less well conserved inhibitory switch (IS) domain, responsible for maintaining the proteins in a basal (autoinhibited) state, is located C-terminal to the CRIB motif [,
,
]. GBD domains can adopt related but distinct folds depending on context. Although GBD domains are largely unstructured in the free state, the IS domain forms an N-terminal β-hairpin that immediately follows the conserved CRIB motif and a central bundle of three α-helices in the autoinhibited state. The interaction between GBD domains and their respective G proteins leads to the formation of a high-affinity complex in which unstructured regions of both the effector and the G protein become rigid. CRIB motifs from various GBD domains interact with Cdc42 in a similar manner, forming an intermolecular β-sheet with strand β2 of Cdc42. Outside the CRIB motif, the C-termini of the various GBD domains are very divergent and show variation in their mode of binding to Cdc42, perhaps determining the specificity of the interaction. Binding of Cdc42 or Rac to the GBD domain causes a dramatic conformational change, refolding part of the IS domain and unfolding the rest [
,
,
,
,
]. Some proteins known to contain a CRIB domain are listed below:
Mammalian activated Cdc42-associated kinases (ACKs), nonreceptor tyrosine kinases implicated in integrin-coupled pathways.
Mammalian p21-activated kinases (PAK1 to PAK4), serine/threonine kinases that modulate cytoskeletal assembly and activate MAP-kinase pathways.
Mammalian actin nucleation-promoting factor WAS proteins (WASPs), non-kinase proteins involved in the organisation of the actin cytoskeleton.
Yeast STE20 and CLA4, the homologues of mammalian PAKs. STE20 is involved in the mating/pheromone MAP kinase cascade.
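The CRIB consensus quoted in this entry, I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G, is written in PROSITE-style notation, where x stands for any residue and x(2,4) for two to four of them. As a rough illustration only, the sketch below converts that notation into a Python regular expression and scans a sequence with it. The helper names (prosite_to_regex, find_crib) and the test string are assumptions introduced here, and the string is a synthetic sequence built to satisfy the consensus, not a curated database record. Real CRIB motifs can deviate from the strict consensus, so a literal match like this is likely to miss genuine family members.

```python
import re

# Consensus exactly as quoted in the entry text.
CRIB_CONSENSUS = "I-S-x-P-x(2,4)-F-x-H-x(2)-H-V-G"


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    tokens = []
    for element in pattern.split("-"):
        # Separate the residue part from an optional repeat such as (2) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, low, high = match.groups()
        if residue == "x":
            token = "."                         # any residue
        elif residue.startswith("[") and residue.endswith("]"):
            token = residue                     # allowed set, e.g. [ST]
        elif residue.startswith("{") and residue.endswith("}"):
            token = "[^" + residue[1:-1] + "]"  # forbidden set, e.g. {P}
        elif len(residue) == 1 and residue.isalpha():
            token = residue                     # one fixed residue
        else:
            raise ValueError(f"unsupported element: {element!r}")
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        tokens.append(token)
    return "".join(tokens)


CRIB_REGEX = re.compile(prosite_to_regex(CRIB_CONSENSUS))


def find_crib(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each strict consensus match."""
    return [(m.start(), m.group()) for m in CRIB_REGEX.finditer(sequence.upper())]


# Synthetic example sequence (hypothetical, for illustration only).
SYNTHETIC_SEQUENCE = "MKKISLPSDFEHTIHVGAA"
print(prosite_to_regex(CRIB_CONSENSUS))
print(find_crib(SYNTHETIC_SEQUENCE))
```

Because the text notes that the CRIB motif alone is not sufficient for high-affinity binding, a hit from a scan like this would at most flag a candidate region and says nothing about the flanking inhibitory switch domain.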
ABL-family proteins are highly conserved tyrosine kinases. Each ABL protein contains an SH3-SH2-TK (Src homology 3-Src homology 2-tyrosine kinase) domain cassette, which confers autoregulated kinase activity and is common among nonreceptor tyrosine kinases. Several types of posttranslational modifications control ABL catalytic activity, subcellular localization, and stability, with consequences for both cytoplasmic and nuclear ABL functions. Binding partners provide additional regulation of ABL catalytic activity, substrate specificity, and downstream signaling. By combining this cassette with actin-binding and -bundling domains, ABL proteins are capable of connecting phosphoregulation with actin-filament reorganization [
]. Vertebrate paralogs, ABL1 and ABL2, have evolved to perform specialized functions. ABL1 includes nuclear localization signals and a DNA binding domain which is used to mediate DNA damage-repair functions, while ABL2 has additional binding capacity for actin and for microtubules to enhance its cytoskeletal remodeling functions. SH2 is involved in several autoinhibitory mechanisms that constrain the enzymatic activity of the ABL-family kinases. In one mechanism, SH2 and SH3 cradle the kinase domain while a cap sequence stabilizes the inactive conformation, resulting in a locked inactive state. Another involves phosphatidylinositol 4,5-bisphosphate (PIP2), which binds the SH2 domain through residues normally required for phosphotyrosine binding in the linker segment between the SH2 and kinase domains. The SH2 domain contributes to ABL catalytic activity and target site specificity. It is thought that the ABL catalytic site and SH2 pocket have co-evolved to recognize the same sequences. Recent work now supports a hierarchical processivity model in which the substrate target site most compatible with ABL kinase domain preferences is phosphorylated with greatest efficiency. If this site is compatible with the ABL SH2 domain specificity, it will then reposition and dock in the SH2 pocket. This mechanism also explains how ABL kinases phosphorylate poor targets on the same substrate if they are properly positioned, and how relatively poor substrate proteins might be recruited to ABL through a complex with strong substrates that can also dock with the SH2 pocket []. This entry includes the SH2 domain of ABL-family proteins. In general, SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites [
,
,
,
].
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. In PSII, the oxygen-evolving complex (OEC) is responsible for catalysing the splitting of water to O(2) and 4H+. The OEC is composed of a cluster of manganese, calcium and chloride ions bound to extrinsic proteins. In cyanobacteria there are five extrinsic proteins in OEC (PsbO, PsbP-like, PsbQ-like, PsbU and PsbV), while in plants there are only three (PsbO, PsbP and PsbQ), PsbU and PsbV having been lost during the evolution of green plants [
]. This family represents the PSII extrinsic protein PsbU, which forms part of the OEC in cyanobacteria and red algae. PsbU acts to stabilise the oxygen-evolving machinery of PSII against heat-induced inactivation, which is crucial for cellular thermo-tolerance [
].
Frizzleds are seven transmembrane-spanning proteins that constitute an unconventional class of G protein-coupled receptors [
]. They have important regulatory roles during embryonic development [,
]. Frizzleds expose their large N terminus on the extracellular side. The N-terminal, extracellular cysteine-rich domain (CRD) has been implicated as the Wnt binding domain and its structure has been solved [
]. The cysteine-rich domain of Frizzled (Fz) is shared with several receptor tyrosine kinases that have roles in development, including the muscle-specific receptor tyrosine kinase (MuSK), the neuronal specific kinase (NSK2), and ROR1 and ROR2. The cytoplasmic side of many Fz proteins has been shown to interact with the PDZ domains of PSD-95 family members and is thought to have a role in the assembly of signalling complexes. The conserved cytoplasmic motif of Fz, Lys-Thr-X-X-X-Trp, is required for activation of the beta-catenin pathway, and for membrane localisation and phosphorylation of Dsh. In Drosophila melanogaster, the frizzled locus is involved in planar cell polarity, which is the coordination of the cytoskeleton of epidermal cells to produce a parallel array of cuticular hairs and bristles [
,
]. In the wild-type wing, all hairs point towards the distal tip [], whereas in Fz mutants, the orientation of individual hairs with respect both to their neighbours and to the organism as a whole is altered. In the developing wing, Fz function is required for cells to respond to the extracellular polarity signal as well as for the proximal-distal transmission of an intracellular polarity signal. In Caenorhabditis elegans, protein mom-5 is the equivalent of frizzled [
]. Three main signalling pathways are activated by agonist-activated Frizzled proteins: the Fz/beta-catenin pathway, the Fz/Ca2+ pathway and the Fz/PCP (planar cell polarity) pathway [
]. The Wnt/beta-catenin pathway is the best studied signalling pathway involving Fz receptors. In the Wnt/beta-catenin pathway, the first downstream cytoplasmic components activated by Fz signalling include Dishevelled (Dsh) and/or its regulatory kinases. This entry represents the cysteine-rich Wnt-binding domain (CRD) of Frizzled-5 (Fz5). The CRD is an essential extracellular portion of the Fz5 receptor, and is required for binding Wnt proteins [
].
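The conserved cytoplasmic motif mentioned in this entry, Lys-Thr-X-X-X-Trp, can be written in one-letter code as K-T-x(3)-W. A minimal sketch of how one might scan a sequence for it is given below. The function name find_fz_motif and the input string are assumptions made for illustration, and the string is synthetic rather than a real Frizzled sequence. A match only shows that the six-residue pattern is present; it does not establish that the residues sit in a cytoplasmic region or that they are functional.

```python
import re

# Lys-Thr-X-X-X-Trp in one-letter code: K, T, any three residues, then W.
FZ_MOTIF = re.compile(r"KT.{3}W")


def find_fz_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each K-T-x(3)-W occurrence."""
    return [(m.start(), m.group()) for m in FZ_MOTIF.finditer(sequence.upper())]


# Synthetic example sequence (hypothetical, for illustration only).
SYNTHETIC_TAIL = "AAKTLQSWRR"
print(find_fz_motif(SYNTHETIC_TAIL))
```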
Amphiphysins belong to the expanding BAR (Bin-Amphiphysin-Rvsp) family proteins, all members of which share a highly conserved N-terminal BAR domain, which has predicted coiled-coil structures required for amphiphysin dimerisation and plasma membrane interaction [
]. Almost all members also share a conserved C-terminal Src homology 3 (SH3) domain, which mediates their interactions with the GTPase dynamin and the inositol-5'-phosphatase synaptojanin 1 in vertebrates and with actin in yeast. The central region of all these proteins is most variable. In mammals, the central region of amphiphysin I and amphiphysin IIa contains a proline-arginine-rich region for endophilin binding and a CLAP domain, for binding to clathrin and AP-2. The interactions mediated by both the central and C-terminal domains are believed to be modulated by protein phosphorylation [
,
]. Amphiphysins are proteins that are thought to be involved in clathrin-mediated endocytosis, actin function, and signalling pathways. In vertebrates, amphiphysins may regulate, but are not essential for, clathrin-mediated endocytosis of synaptic vesicles (SVs). However, in Drosophila, amphiphysin is not involved at all in SV endocytosis but is required for T-tubule structure and excitation-contraction coupling in muscles, and plays a role in membrane morphogenesis in developing photoreceptors and a variety of other cells [
]. This entry includes amphiphysin 2 (also known as Myc box-dependent-interacting protein 1), which was the second amphiphysin family member found in mammals. The gene encoding it has been found to be alternatively spliced. The various products have been named: BIN-1, Sh3P9, BRAMP-2 and ALP-1. They have different distribution patterns, with the largest form (~95 kDa) being expressed solely in the brain, where it shares a very similar (if not identical) distribution pattern to amphiphysin 1 [
]. This protein is a key player in the control of plasma membrane curvature, membrane shaping and membrane remodeling. It is required in muscle cells for the formation of T-tubules, tubular invaginations of the plasma membrane that function in depolarization-contraction coupling []. It is also involved in the regulation of intracellular vesicle sorting, modulation of BACE1 trafficking and the control of amyloid-beta production []. It has been shown in vitro to have actin bundling activity and to stabilize actin filaments against depolymerization [].
Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis []. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics have been cloned from mouse and human cell lines: these have been termed type I and type II receptors []. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors. Both IL-1 receptors appear to be well conserved in evolution, and map to the same chromosomal location []. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA). The crystal structures of IL1A and IL1B [
] have been solved, showing them to share the same 12-stranded β-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors []. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding. The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain []. These properties are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor. A novel method for virus immune evasion has been proposed in which the product of one or both of these genes may bind interleukin-1 and/or interleukin-6, preventing these cytokines reaching their natural receptors [
]. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta []. This entry represents the IL-1 family alpha (IL1A).
Major Histocompatibility Complex (MHC) glycoproteins are heterodimeric cell surface receptors that function to present antigen peptide fragments to T cells responsible for cell-mediated immune responses. MHC molecules can be subdivided into two groups on the basis of structure and function: class I molecules present intracellular antigen peptide fragments (~10 amino acids) on the surface of the host cells to cytotoxic T cells; class II molecules present exogenously derived antigenic peptides (~15 amino acids) to helper T cells. MHC class I and II molecules are assembled and loaded with their peptide ligands via different mechanisms. However, both present peptide fragments rather than entire proteins to T cells, and are required to mount an immune response. Class I MHC glycoproteins are expressed on the surface of all somatic nucleated cells, with the exception of neurons. MHC class I receptors present peptide antigens that are synthesised in the cytoplasm, which include self-peptides (presented for self-tolerance) as well as foreign peptides (such as viral proteins). These antigens are generated from degraded protein fragments that are transported to the endoplasmic reticulum by TAP proteins (transporter of antigenic peptides), where they can bind MHC I molecules, before being transported to the cell surface via the Golgi apparatus [
,
]. MHC class I receptors display antigens for recognition by cytotoxic T cells, which have the ability to destroy virus-infected or malignant (surfeit of self-peptides) cells. MHC class I molecules are composed of two chains: an MHC alpha chain (heavy chain), and a beta2-microglobulin chain (light chain), where only the alpha chain spans the membrane. The alpha chain has three extracellular domains (alpha 1-3, with alpha1 being at the N terminus), a transmembrane region and a C-terminal cytoplasmic tail. The soluble extracellular beta2-microglobulin chain associates primarily with the alpha3 domain and is necessary for MHC stability. The alpha1 and alpha2 domains of the alpha chain are referred to as the recognition region, because the peptide antigen binds in a deep groove between these two domains. This entry represents the alpha chain domains alpha1 and alpha2 that make up this recognition region (the alpha3 domain is represented by (
).
Anthrax toxin, edema factor, central domain superfamily
Type:
Homologous_superfamily
Description:
Anthrax toxin is a plasmid-encoded toxin complex produced by the Gram-positive, spore-forming bacterium Bacillus anthracis. The toxin consists of three non-toxic proteins: the protective antigen (PA), the lethal factor (LF) and the edema factor (EF) [
]. These component proteins self-assemble at the surface of host cell receptors, yielding a series of toxic complexes that can produce shock-like symptoms and death. Anthrax toxin is one of a large group of Bacillus and Clostridium exotoxins referred to as binary toxins, forming independent enzymatic (A moiety) and binding (B moiety) components. The LF and EF proteins are the enzymes (A moiety) that act on cytosolic substrates, while PA is a multi-functional protein (B moiety) that binds to cell surface receptors, mediates the assembly and internalisation of the complexes, and delivers them to the host cell endosome []. Once PA is attached to the host receptor [], it must then be cleaved by a host cell surface (furin family) protease before it is able to bind EF and LF. The cleavage of the N terminus of PA enables the C-terminal fragment to self-associate into a ring-shaped heptameric complex (prepore) that can bind LF or EF competitively. The PA-LF/EF complex is then internalised by endocytosis, and delivered to the endosome, where PA forms a pore in the endosomal membrane in order to translocate LF and EF to the cytosol. LF is a Zn-dependent metalloprotease that cleaves and inactivates mitogen-activated protein (MAP) kinases, kills macrophages, and causes death of the host by inhibiting cell proliferation [,
]. EF is a calcium- and calmodulin-dependent adenylyl cyclase that can cause edema (fluid-filled swelling) when associated with PA. EF is not toxic by itself, and is required for the survival of germinated Bacillus spores within macrophages at the early stages of infection. EF dramatically elevates the level of host intracellular cAMP, a ubiquitous messenger that integrates many processes of the cell; increases in cAMP can interfere with host intracellular signalling []. This entry represents a central domain superfamily in the edema factor adenylyl cyclase protein of anthrax toxin, as well as in adenylyl cyclases from other bacterial toxins.
Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis []. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics have been cloned from mouse and human cell lines: these have been termed type I and type II receptors []. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors. Both IL-1 receptors appear to be well conserved in evolution, and map to the same chromosomal location []. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA). The crystal structures of IL1A and IL1B [
] have been solved, showing them to share the same 12-stranded β-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors []. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding. The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain []. These properties are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor. A novel method for virus immune evasion has been proposed in which the product of one or both of these genes may bind interleukin-1 and/or interleukin-6, preventing these cytokines reaching their natural receptors [
]. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta []. The N-terminal region of interleukin-1 is approximately 115 amino acids long; it forms a propeptide that is cleaved off to release the active interleukin-1. This signature is for the propeptide.
Zonula occludens (ZO), or tight junctions (TJ), are specialised membrane domains found at the most apical region of polarised epithelial and endothelial cells. They create a primary barrier, preventing paracellular transport of solutes, and restricting the lateral diffusion of membrane lipids and proteins, thus maintaining cellular polarity [
]. They also act as diffusion barriers within plasma membranes, creating and maintaining apical and basolateral membrane domains. Under freeze-fracture electron microscopy, TJs appear as a network of continuous anastomosing intramembranous strands. These strands consist mainly of claudins and occludin (), which are transmembrane proteins that polymerise within plasma membranes to form fibrils []. Recently, the molecular architecture of tight junctions has begun to be elucidated. One group of proteins thought to be major components of TJs is the claudin family [
]. Immunofluorescence studies have shown that claudins are targeted to and incorporated into tight junctions []. Furthermore, when claudins are introduced into cells that lack tight junctions, networks of strands and grooves form at cell-cell contact sites that closely resemble native TJs []. The claudin protein family is encoded by at least 17 human genes, with many homologues cloned from other species. Tissue distribution patterns for the claudin family members are distinct. Claudin-1 and -2, for example, are expressed at high levels in the liver and kidney, whereas claudin-3 mRNA is detected mainly in the lung and liver [
,
]. This suggests that multiple claudin family members may be involved in tight junction strand formation in a tissue-dependent manner. Hydropathy analysis suggests that all claudins share a common transmembrane (TM) topology. Each family member is predicted to possess four TM domains with intracellular N and C termini. Although their C-terminal cytoplasmic domain sequences vary, most claudin family members share a common motif of -Y-V in this region. This has been postulated as a possible binding motif for PDZ domains of other tight junction-associated membrane proteins, such as ZO-1 (
).
Orthologues of the claudin 12 subtype have been identified in humans and zebrafish. Whilst these proteins clearly belong to the tetraspanin superfamily, they represent the most divergent claudin subtype in terms of primary structure [
]. Claudin 12 mRNA has been detected in the brain, prostate, colon and uterus [].
Vav proteins act as guanine nucleotide exchange factors (GEFs) for Rho/Rac proteins. They control processes including T cell activation, phagocytosis, and migration of cells. The Vav subgroup of Dbl GEFs consists of three family members (Vav1, Vav2, and Vav3) in mammals [
]. Vav1 is preferentially expressed in the hematopoietic system, while Vav2 and Vav3 show broader expression patterns []. Mammalian Vav proteins consist of a calponin homology (CH) domain, an acidic region, a catalytic Dbl homology (DH) domain, a PH domain, a zinc finger cysteine rich domain (C1/CRD), and an SH2 domain, flanked by two SH3 domains. In invertebrates such as Drosophila and C. elegans, Vav is missing the N-terminal SH3 domain. The DH domain is involved in RhoGTPase recognition and selectivity and stimulates the reorganization of the switch regions for GDP/GTP exchange []. The PH domain is implicated in directing membrane localization, allosteric regulation of guanine nucleotide exchange activity, and as a phospholipid-dependent regulator of GEF activity []. Vavs bind RhoGTPases including Rac1, RhoA, and RhoG, while other members of the GEF family are specific for a single RhoGTPase. This promiscuity is thought to be a result of the Vav CRD []. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner [
]. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity []. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane []. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes [].
G protein-coupled receptor 176, rhodopsin-like, 7TM
Type:
Domain
Description:
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
].This entry represents the seven transmembrane regions of the G-protein coupled receptor 176 (GPCR176), a member of the rhodopsin-like class A GPCR superfamily. Its endogenous ligand has not yet been identified. It is highly expressed in suprachiasmatic nucleus (SCN) neurons in a circadian rhythm manner, which has a role in setting the pace of the circadian behaviour. This receptor has an agonist-independent basal activity to reduce cAMP signalling and it has been shown that it acts through a unique G-protein subclass Gz to repress the second messenger signalling [
,
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
].The G protein-coupled receptors EDG-2, EDG-4 and EDG-7 have now been identified
as high affinity receptors for lysophosphatidic acid (LPA). EDG-7 is expressed at high levels in the testis, prostate, heart, pancreas and frontal cerebral cortex in humans and at lower levels in the intestine, lung and ovary. Binding of LPA to the receptor leads to increased cyclic AMP and calcium levels and activation of MAP kinases. It is believed that these effects are mediated by Gq and possibly Gi class proteins [
]. EDG-7 does not appear to be able to couple to G1.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase;
) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [
,
]. The PTP superfamily can be divided into four subfamilies []:
(1) pTyr-specific phosphatases
(2) dual specificity phosphatases (dTyr and dSer/dThr)
(3) Cdc25 phosphatases (dTyr and/or dThr)
(4) LMW (low molecular weight) phosphatases
Based on their cellular localisation, PTPases are also classified as:
Receptor-like, which are transmembrane receptors that contain PTPase domains [
]
Non-receptor (intracellular) PTPases [
]
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [
]. Functional diversity between PTPases is endowed by regulatory domains and subunits. The structures of receptor PTPases comprise a variable length extracellular
domain, followed by a TM region and a cytoplasmic C-terminal catalytic domain. The extracellular regions of some receptor PTPases house fibronectin
type III repeats, immunoglobulin-like domains, MAM domains or carbonic anhydrase-like domains. The cytoplasmic region generally contains 2 copies
of the PTPase domain: the first of these is enzymatically active; the second is inactive, but appears to affect substrate specificity in the first.
PTPase domains contain ~300 residues, including 2 conserved cysteines, the second of which is required for activity. Other conserved residues in its
immediate vicinity are also catalytically important []. This entry represents protein-tyrosine phosphatases that contain a Kinase Interaction Motif (KIM), including receptor PTPases and non-receptor (types 2 and 7) PTPases. Enzymes PTP-STEP, PTP-SL and LC-PTP each contain a KIM in the N-terminal portion of the molecule. The KIM sequence mediates interaction with
MAP kinases, predominantly ERK1 and ERK2. It has been experimentally shown that over-expression of PTP-SL down-regulates the activation of ERK2 and its
nuclear translocation [].
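The C(X)5R signature motif mentioned above can be expressed as a simple pattern search. The following is a minimal sketch, assuming that X stands for any single residue written as an uppercase one-letter code; the function name and the test sequence are hypothetical and chosen only for illustration. A pattern this short will also match many sequences that are not PTPases, so a hit is a hint rather than a classification.

```python
import re

# C(X)5R read as: cysteine, any five residues, arginine.
# The lookahead lets overlapping candidates be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C[A-Z]{5}R))")


def find_ptp_signature(sequence):
    """Return (0-based start, matched text) for every C(X)5R candidate."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy fragment, not taken from any database entry.
    print(find_ptp_signature("AAVHCSAGVGRTG"))
```

The lookahead form is used because two candidate motifs can share residues; a plain `finditer` on `C[A-Z]{5}R` would skip the second one.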
Anthrax toxin is a plasmid-encoded toxin complex produced by the Gram-positive, spore-forming bacteria, Bacillus anthracis. The toxin consists of three non-toxic proteins: the protective antigen (PA), the lethal factor (LF) and the edema factor (EF) [
]. These component proteins self-assemble at the surface of host cell receptors, yielding a series of toxic complexes that can produce shock-like symptoms and death. Anthrax toxin is one of a large group of Bacillus and Clostridium exotoxins referred to as binary toxins, forming independent enzymatic (A moiety) and binding (B moiety) components. The LF and EF proteins are the enzymes (A moiety) that act on cytosolic substrates, while PA is a multi-functional protein (B moiety) that binds to cell surface receptors, mediates the assembly and internalisation of the complexes, and delivers them to the host cell endosome []. Once PA is attached to the host receptor [], it must then be cleaved by a host cell surface (furin family) protease before it is able to bind EF and LF. The cleavage of the N terminus of PA enables the C-terminal fragment to self-associate into a ring-shaped heptameric complex (prepore) that can bind LF or EF competitively. The PA-LF/EF complex is then internalised by endocytosis, and delivered to the endosome, where PA forms a pore in the endosomal membrane in order to translocate LF and EF to the cytosol. LF is a Zn-dependent metalloprotease that cleaves and inactivates mitogen-activated protein (MAP) kinases, kills macrophages, and causes death of the host by inhibiting cell proliferation [
,
]. EF is a calcium- and calmodulin-dependent adenylyl cyclase that can cause edema (fluid-filled swelling) when associated with PA. EF is not toxic by itself, and is required for the survival of germinated Bacillus spores within macrophages at the early stages of infection. EF dramatically elevates the level of host intracellular cAMP, a ubiquitous messenger that integrates many processes of the cell; increases in cAMP can interfere with host intracellular signalling []. This entry represents a central domain in the edema factor adenylyl cyclase protein of anthrax toxin, as well as in adenylyl cyclases from other bacterial toxins.
Interleukin-1 alpha and interleukin-1 beta (IL-1 alpha and IL-1 beta) are
cytokines that participate in the regulation of immune responses, inflammatory reactions, and hematopoiesis []. Two types of IL-1 receptor, each with three extracellular immunoglobulin (Ig)-like domains, limited sequence similarity (28%) and different pharmacological characteristics have been cloned from mouse and human cell lines: these have been termed type I and type II receptors []. The receptors both exist in transmembrane (TM) and soluble forms: the soluble IL-1 receptor is thought to be post-translationally derived from cleavage of the extracellular portion of the membrane receptors.Both IL-1 receptors appear to be well conserved in evolution, and map to the
same chromosomal location []. The receptors can both bind all three forms of IL-1 (IL-1 alpha, IL-1 beta and IL-1RA).The crystal structures of IL1A and IL1B [
] have been solved, showing them to share the same 12-stranded β-sheet structure as both the heparin binding growth factors and the Kunitz-type soybean trypsin inhibitors []. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Several regions, especially the loop between strands 4 and 5, have been implicated in receptor binding.The Vaccinia virus genes B15R and B18R each encode proteins with N-terminal
hydrophobic sequences, possible sites for attachment of N-linked carbohydrate and a short C-terminal hydrophobic domain []. These properties are consistent with the mature proteins being either virion, cell surface or secretory glycoproteins. Protein sequence comparisons reveal that the gene products are related to each other (20% identity) and to the Ig superfamily. The highest degree of similarity is to the human and murine interleukin-1 receptors, although both proteins are related to a wide range of Ig superfamily members, including the interleukin-6 receptor. A novel method for virus immune evasion has been proposed in which the product of one or both of these proteins may bind interleukin-1 and/or interleukin-6, preventing these cytokines reaching their natural receptors [
]. A similar gene product from Cowpox virus (CPV) has also been shown to specifically bind murine IL-1 beta [].This entry represents the Interleukin-1 conserved region in the C-terminal section.
Amyloidogenic glycoprotein, copper-binding domain conserved site
Type:
Conserved_site
Description:
Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques [
,
,
]. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms.APP is important for the neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products [
,
]. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, Kunitz domain, E2 domain, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility []. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death [,
]. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters.APP can be processed by different sets of enzymes:In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken-down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling). In the amyloidogenic pathway (plaque-forming), APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact.This entry represents a conserved octapeptide located in the CuBD domain.
SKP1 (together with SKP2) was identified as an essential component of the
cyclin A-CDK2 S phase kinase complex []. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex [] and is also involved in the ubiquitin pathway []. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 [].This entry represents a dimerisation domain found at the C-terminal of SKP1 proteins [
], as well as in subunit D of the centromere DNA-binding protein complex Cbf3 []. This domain is multi-helical in structure, and consists of an interlocked heterodimer in F-box proteins.
This entry includes proteins from fungi, plants and bacteria. Proteins in this entry contain a ferredoxin-like fold. They are predicted ferredoxins, which are iron-sulfur (Fe-S) proteins that may play a role in redox sensing and electron transfer [
]. Some members appear to have sucrolytic activity []. The putative active site of the ferredoxin-like domain of the enzyme contains two cysteines and two histidines for possible binding to iron-sulfur clusters, compared to four cysteines present in the active site of ferredoxin [].Two budding yeast proteins, Aim32 and Apd1, are included in this entry. Apd1 is an Fe-S protein and is involved in cellular defense against hydroxyurea [
]. It is required for normal localisation of actin patches in budding yeasts []. The function of Aim32 is not clear.
This entry represents the surfeit locus protein SURF6 from mammals and its homologues from plants and fungi. In mammals, SURF6 is a component of the nucleolar matrix and has a strong binding capacity for nucleic acids [
]. SURF6 is always found in the nucleolus regardless of the phase of the cell cycle suggesting that it is a structural protein constitutively present in nucleolar substructures. A role in rRNA processing has been proposed for this protein. Saccharomyces cerevisiae member of the SURF-6 family, named Rrp14 (ribosomal RNA-processing protein 14), interacts with proteins involved in ribosomal biogenesis and cell polarity [
]. It is required for the synthesis of both 40S and 60S ribosomal subunits and may also play some direct role in correct positioning of the mitotic spindle during mitosis [,
].
The PUL (after PLAP, UFD3 and lub1) domain is a predicted predominantly alpha helical globular domain found in eukaryotes. It is found in association with either WD repeats (see
) and the PFU domain (see
) or PPPDE and thioredoxin (see
) domains. The PUL domain is a protein-protein interaction domain [
,
].Some proteins known to contain a PUL domain are listed below:Saccharomyces cerevisiae DOA1 (UFD3, ZZZ4), involved in ubiquitin conjugation pathway. DOA1 participates in the regulation of the ubiquitin conjugation pathway involving CDC48 by hindering multiubiquitination of substrates at the CDC48 chaperone.Schizosaccharomyces pombe ubiquitin homeostasis protein lub1, acts as a negative regulator of vacuole-dependent ubiquitin degradation.Mammalian phospholipase A-2-activating protein (PLA2P, PLAA), the homologue of DOA1. PLA2P plays an important role in the regulation of specific inflammatory disease processes.
This family consists of apoptosis inhibitory protein 5 (API5) sequences from several organisms. Apoptosis or programmed cell death is a physiological form of cell death that occurs in embryonic development and organ formation. It is characterised by biochemical and morphological changes such as DNA fragmentation and cell volume shrinkage. API5 is an anti-apoptosis gene located on Homo sapiens chromosome 11, whose expression prevents the programmed cell death that occurs upon the deprivation of growth factors [
,
] and is up-regulated in various cancer cells. This protein has an elongated all α-helical structure, in which the N-terminal half is similar to the HEAT repeat and the C-terminal half is similar to the ARM (Armadillo-like) repeat. This suggests that API5 is involved in protein-protein interactions and may act as a scaffold for multiprotein complexes [].
The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation [
,
,
]. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family have at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins [], and can be found fused to the C terminus of DNA topoisomerase in Chlamydia. This domain is also found in the Saccharomyces cerevisiae SNF12 protein, the eukaryotic initiation factor 2 (eIF2) [] and the Arabidopsis thaliana At1g31760 protein [].
The GGDEF domain, which has been named after the conserved central sequence pattern GG[DE][DE]F is widespread in prokaryotes. It is typically present in multidomain proteins containing regulatory domains of signalling pathways or protein-protein or protein-ligand interaction modules, such as the response regulatory domain, the PAS/PAC domain, the HAMP domain, the GAF domain, the FHA domain or the TPR repeat. However a few single-domain proteins are also known. The GGDEF domain is involved in signal transduction and is likely to catalyse synthesis or hydrolysis of cyclic diguanylate (c-diGMP, bis(3',5')-cyclic diguanylic acid), an effector molecule that consists of two cGMP moieties bound head-to-tail [
,
,
].Structural studies of PleD from Caulobacter crescentus show that this domain forms a five-stranded β-sheet surrounded by helices, similar to the catalytic core of adenylate cyclase [
].
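The sequence pattern GG[DE][DE]F that gives the domain its name can be searched for directly. The following is a minimal sketch, assuming the pattern is read as a regular expression over one-letter residue codes; the function name and the example sequence are hypothetical and serve only as an illustration. Matching this five-residue pattern does not by itself establish that a protein carries a functional GGDEF domain.

```python
import re

# Pattern as quoted in the description: G, G, then D or E twice, then F.
GGDEF_MOTIF = re.compile(r"GG[DE][DE]F")


def find_ggdef(sequence):
    """Return (0-based start, matched text) for each motif occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in GGDEF_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence containing two variants of the motif.
    print(find_ggdef("MKTGGDEFAAGGEEFLL"))
```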
Herpesviruses are enveloped by a lipid bilayer that contains at least a dozen glycoproteins. The virion surface glycoproteins mediate recognition of susceptible cells and promote fusion of the viral envelope with the cell membrane, leading to virus entry. No single glycoprotein associated with the virion membrane has been identified as the fusogen [
]. Glycoprotein L (gL) forms a non-covalently linked heterodimer with glycoprotein H (gH). This heterodimer is essential for virus-cell and cell-cell fusion since the association of gH and gL is necessary for correct localisation of gH to the virion or cell surface. gH anchors the heterodimer to the plasma membrane through its transmembrane domain. gL lacks a transmembrane domain and is secreted from cells when expressed in the absence of gH [
].This superfamily represents a subgroup of gL found in rhadinoviruses.
This bacterial family of integral membrane proteins represents a positive regulator of the sigma(E) transcription factor, namely RseC/MucC. The sigma(E) transcription factor is up-regulated by cell envelope protein misfolding, and regulates the expression of genes that are collectively termed ECF (devoted to Extra-Cellular Functions) [
]. In Pseudomonas aeruginosa, derepression of sigma(E) is associated with the alginate-overproducing phenotype characteristic of chronic respiratory tract colonization in cystic fibrosis patients. The mechanism by which RseC/MucC positively regulates the sigma(E) transcription factor is unknown. RseC is also thought to have a role in thiamine biosynthesis in Salmonella typhimurium []. In addition, this family includes an N-terminal part of RnfF, a Rhodobacter capsulatus protein of unknown function that is essential for nitrogen fixation. This protein also contains a domain found in the ApbE protein.
OCRL1 (oculocerebrorenal syndrome of Lowe 1)-like proteins contain a PH-domain at the N-terminal, a central inositol polyphosphate 5-phosphatase domain and a C-terminal Rho GAP domain. OCRL-like proteins are type II inositol polyphosphate 5-phosphatases that can hydrolyse lipid PI(4,5)P2 and PI(3,4,5)P3 and soluble Ins(1,4,5)P3 and Ins(1,3,4,5)P4, but their individual specificities vary. OCRL regulates traffic in the endosomal pathway by regulating the specific pool of phosphatidylinositol 4,5-bisphosphate that is associated with endosomes [
] and is involved in primary cilia assembly [,
]. This protein is associated with the oculocerebrorenal syndrome of Lowe and Dent disease 2 [,
]. This entry represents the RhoGAP (GTPase-activator protein [GAP] for Rho-like small GTPases) domain present in OCRL1-like proteins. This GAP domain lacks the catalytic residue and may therefore be inactive. The functionality of the RhoGAP domain is still unclear [].
This entry describes the head-tail adaptor protein of bacteriophage SPP1
and related proteins in other bacteriophage and prophage regions of bacterial genomes. Homologues are also found in Gene Transfer Agents (GTA) [], including ORFg7 (RCAP_rcc01689) of the GTA of Rhodobacter capsulatus (Rhodopseudomonas capsulata) [].In bacteriophage SPP1, the gp16 protein functions as a stopper, locking the viral DNA into the capsid. When the tail attachment binds to the entry receptor, gp16 opens by a diaphragm-like motion, allowing the genome to exit the capsid through the tail tube to the host cell. During virion assembly, gp16 functions as a docking platform to which the preassembled tail binds [
].The SPP1 head-to-tail connector is composed of cyclical dodecamers of the portal protein gp6 and of the 2 head completion proteins gp15 and gp16 [].
The function of Probable G-protein coupled receptor 82 (GPR82) is not clear. GPR82 is a member of the rhodopsin-like, class A GPCRs, which is a widespread protein family that includes the light-sensitive rhodopsin as well as receptors for biogenic amines, lipids, nucleotides, odorants, peptide hormones, and a variety of other ligands. All GPCRs have a common structural architecture comprising seven transmembrane (TM) α-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signalling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes [
,
].
SKP1 (together with SKP2) was identified as an essential component of the
cyclin A-CDK2 S phase kinase complex []. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex [] and is also involved in the ubiquitin pathway []. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1 [].This entry represents the superfamily of a dimerisation domain found at the C-terminal of SKP1 proteins [
], as well as in subunit D of the centromere DNA-binding protein complex Cbf3 []. This domain is multi-helical in structure, and consists of an interlocked heterodimer in F-box proteins.
This entry represents the Death Domain (DD) found in Uncoordinated-5D (UNC5D), which is part of the UNC-5 homologue family. It is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis [
]. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD [].In general, DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes [
].
STAP1 is a signal-transducing adaptor protein. It is composed of a Pleckstrin Homology (PH) and SH2 domains along with several tyrosine phosphorylation sites. STAP-1 is an orthologue of BRDG1 (also known as BCR downstream signaling 1). STAP1 protein functions as a docking protein acting downstream of Tec tyrosine kinase in B cell antigen receptor signaling. The protein is phosphorylated by Tec and participates in a positive feedback loop, increasing Tec activity [
]. STAP-1 has been shown to interact with STAT5 []. This entry represents the SH2 domain of STAP1.In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites [
,
,
].
Flagellar motor switch FliN/Type III secretion HrcQb
Type:
Family
Description:
The flagellar motor switch in Escherichia coli and Salmonella typhimurium regulates the direction of
flagellar rotation and hence controls swimming behaviour. The switch is a complex apparatus that responds to signals transduced by the chemotaxis sensory signalling
system during chemotactic behaviour []. The switch complex comprises at least three proteins - FliG, FliM and FliN. It has been
shown that FliG interacts with FliM, FliM interacts with itself, and FliM interacts with FliN [
]. The proteins are not particularly hydrophobic and may be peripheral to the membrane, possibly mounted
on the basal body M ring [,
].This entry represents the flagellar motor switch proteins FliN and FliY, and proteins related with type III secretion system such as hrcQb and SsaQ. Members of this group of proteins are mainly found in bacteria.
Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N terminus. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibres, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events [
]. Knockout mice experiments reveal the tumour repressor function of Testin [].LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes [
,
,
].This is the third LIM domain of Testin.
Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N terminus. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibres, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events [
]. Knockout mice experiments reveal the tumour repressor function of Testin [].LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes [
,
,
].This is the first LIM domain of Testin.
Testin contains three C-terminal LIM domains and a PET protein-protein interaction domain at the N terminus. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibres, at cell-cell-contact areas, and at focal adhesion plaques. Testin interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin and it is involved in cell motility and adhesion events [
]. Knockout mice experiments reveal the tumour repressor function of Testin [].LIM domains are 50-60 amino acids in size and share two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes [
,
,
].This is the second LIM domain of Testin.
The membrane-embedded multi-protein complexes of mitochondria mediate the transport of nuclear-encoded proteins across and into the outer or inner mitochondrial membranes [
]. Two translocases of the inner mitochondrial membrane (TIM22 and TIM23 complexes) mediate protein transport at the inner membrane. The TIM22 complex (a twin-pore carrier translocase) catalyses the insertion of multi-spanning proteins that have internal targeting signals into the inner membrane, and it uses the membrane potential as an external driving force [
]. The Tim22 subunit of the mitochondrial import inner membrane translocase is included in this family. Tim22 forms a hydrophilic, high-conductance channel with distinct opening states and pore diameters, that is voltage-activated and specifically responds to an internal targeting signal [].
Probable G-protein coupled receptor 146 (GPR146) is an orphan G-protein coupled receptor that belongs to the class A of seven-transmembrane GPCR superfamily. The endogenous ligand for GPR146 is not known. It has been suggested that GPR146 may be a part of the C-peptide signaling complex [
]. All GPCRs have a common structural architecture comprising seven transmembrane (TM) α-helices interconnected by three extracellular and three intracellular loops. A general feature of GPCR signaling is agonist-induced conformational changes in the receptors, leading to activation of the heterotrimeric G proteins, which consist of the guanine nucleotide-binding G-alpha subunit and the dimeric G-beta-gamma subunits. The activated G proteins then bind to and activate numerous downstream effector proteins, which generate second messengers that mediate a broad range of cellular and physiological processes [
,
].
This family contains the P18 proteins of citrus tristeza virus (CTV). CTV is a member of the closterovirus group and is one of the more complex single-stranded RNA viruses. Assembly of the viral genome into virions is a critical process of the virus life cycle often defining the ability of the virus to move within the plant and to be transmitted horizontally to other plants. Closteroviridae virions are polar helical rods assembled primarily by a major coat protein, but with a related minor coat protein at one end. It is the only virus family that encodes a protein with similarity to cellular chaperones, a 70kDa heat-shock protein homologue (HSP70h). Deletion mutagenesis reveals that p33, p6, p18, p13, p20, and p23 genes are not needed for virion formation. Their function is unknown [
].
This entry describes the head-tail adaptor protein of bacteriophage SPP1 and related proteins in other bacteriophages and in prophage regions of bacterial genomes. Homologues are also found in Gene Transfer Agents (GTA), including ORFg7 (RCAP_rcc01689) of the GTA of Rhodobacter capsulatus (Rhodopseudomonas capsulata). In bacteriophage SPP1, the gp16 protein functions as a stopper, locking the viral DNA into the capsid. When the tail attachment binds to the entry receptor, gp16 opens by a diaphragm-like motion, allowing the genome to exit the capsid through the tail tube to the host cell. During virion assembly, gp16 functions as a docking platform to which the preassembled tail binds. The SPP1 head-to-tail connector is composed of cyclical dodecamers of the portal protein gp6 and of the two head completion proteins gp15 and gp16.
Ribonuclease P (Rnp) is a ubiquitous ribozyme that catalyzes a Mg2+-dependent hydrolysis to remove the 5'-leader sequence of precursor tRNA (pre-tRNA) in all three domains of life. In bacteria, the catalytic RNA (typically ~120 kDa) is aided by a small protein cofactor (~14 kDa). Archaeal and eukaryotic RNase P each contain a single RNA; archaeal RNase P has four or five proteins, while eukaryotic RNase P has 9 or 10. Eukaryotic and archaeal RNase P RNAs function cooperatively with the protein subunits in catalysis. This entry represents ribonuclease P (Rnp) subunit RNP4 (also known as Rpp21), mostly from archaea. In the hyperthermophilic archaeon Pyrococcus horikoshii OT3, RNase P is composed of the RNase P RNA (pRNA) and five proteins (PhoPop5, PhoRpp38, PhoRpp21, PhoRpp29, and PhoRpp30).
This family is defined to identify a pair of paralogous 3'->5' exoribonucleases in Escherichia coli, plus the set of proteins apparently orthologous to one or the other in other eubacteria. VacB was characterised originally as required for the expression of virulence genes, but is now recognised as the exoribonuclease RNase R (Rnr). Its paralog in Escherichia coli and Haemophilus influenzae is designated exoribonuclease II (Rnb) [
]. Both are involved in the degradation of mRNA, and consequently have strong pleiotropic effects that may be difficult to disentangle. Both these proteins share domain-level similarity (RNB, S1) with a considerable number of other proteins, and full-length similarity scoring below the trusted cut-off to proteins associated with various phenotypes but uncertain biochemistry; it may be that these latter proteins are also 3' exoribonucleases.
This group of proteins belong to MEROPS peptidase family C1, subfamily C1B. This family contains prokaryotic and eukaryotic aminopeptidases and includes bleomycin hydrolases. Bleomycins are antitumour glycopeptide antibiotics originally isolated from the actinomycete Streptomyces verticillus, and are inactivated by bleomycin hydrolase, which hydrolyses the carboxyamide bond of the beta-aminoalanine moiety [
]. Other cysteine peptidases from the papain family have no effect on bleomycins [,
]. Bleomycin hydrolase acts mainly as an aminopeptidase on short peptides, but the rat orthologue has been shown to cleave amyloid β-peptides, acting as an endopeptidase or a carboxypeptidase, as well as an aminopeptidase. The C terminus is autolytically processed to remove the C-terminal residue. Bleomycin hydrolase is intracellular and exists naturally as a homohexamer, with all the active sites in the central cavity of the barrel. This molecular organisation prevents native proteins from entering the cavity. The negative C terminus interacts with the positive N terminus of the substrate and anchors it to the active site. The structure of the C terminus resembles that of an inhibitor, such as leupeptin, bound to an active peptidase, such as papain, and the distance of the C terminus from the active site acts as a molecular ruler to confer positional specificity. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases.
Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.
Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.
Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad.
Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
The infection of mammalian host cells by Yersinia sp. causes a rapid induction of the mitogen-activated protein kinase (MAPK; including the ERK, JNK and p38 pathways) and nuclear factor kappaB (NF-kappaB) signalling pathways that would typically result in cytokine production and initiation of the innate immune response. However, these pathways are rapidly inhibited, promoting apoptosis. YopJ has been shown to block phosphorylation of active site residues. It has also been shown that the YopJ acetyltransferase is activated by eukaryotic host cell inositol hexakisphosphate. Serine and threonine acetylation adds a further layer to the control of signalling pathways and may be a widespread mode of biochemical regulation of endogenous processes in eukaryotic cells. It has been shown that YopJ is a serine/threonine acetyltransferase. It acetylates the serine and threonine residues in the phosphorylation sites of MAPK kinases and nuclear factor kappaB, preventing their activation by phosphorylation and thereby inhibiting these signalling pathways [,
]. This entry contains YopJ and related proteins.
This entry represents the calcium-binding domain found in SPARC (Secreted Protein Acidic and Rich in Cysteine) and Testican (also known as SPOCK; or SParc/Osteonectin, Cwcv and Kazal-like domains) proteins. SPARC proteins are down-regulated in various tumours and may have a tumour-suppressor function. Testican-3 appears to be a novel regulator that reduces the activity of matrix metalloproteinase (MMP) in adult T-cell leukemia (ATL). This cysteine-rich domain is responsible for the anti-spreading activity of human urothelial cells. This extracellular calcium-binding domain is rich in α-helices and contains two EF-hands that each coordinate one Ca2+ ion, forming a helix-loop-helix structure that not only drives the conformation of the protein but is also necessary for biological activity. The anti-spreading activity was dependent on the coordination of Ca2+ by a Glu residue at the Z position of EF-hand 2.
This entry includes animal M-phase-specific PLK1-interacting protein (also known as TTD non-photosensitive 1 protein, TTDN1) and plant protein SICKLE. TTDN1 co-localises with Plk1 at the centrosome in mitosis and at the midbody during cytokinesis. TTDN1 is phosphorylated by cyclin-dependent kinase 1 during mitosis and subsequently interacts with polo-like kinase 1 (PLK1). It may play a role in maintenance of cell cycle integrity by regulating mitosis or cytokinesis. Mutations in the C7orf11 (TTDN1) gene have been linked to trichothiodystrophy (TTD), a rare autosomal recessive disorder whose defining feature is brittle hair. Protein SICKLE is required for development and abiotic stress tolerance, and is involved in microRNA biogenesis and mRNA splicing. SICKLE, also known as ROTUNDA 3, may contribute to plant development by phosphatase 2A-mediated regulation of auxin transporter recycling.
Glycation is a nonenzymatic covalent reaction between proteins and endogenous reducing sugars or dicarbonyls (methylglyoxal, glyoxal) that results in protein inactivation. DJ-1 was described in vitro as a protein deglycase that repaired methylglyoxal- and glyoxal-glycated proteins. Since then, there have been reports both against and in support of this role for DJ-1. In further support of its deglycase activity, DJ-1 and its bacterial homologues have been shown to repair methylglyoxal- and glyoxal-glycated nucleotides and nucleic acids. This ability would make DJ-1 a target for diabetes and cancer research. DJ-1, also known as Park7, has been associated with human parkinsonism. This family also includes YajL from Escherichia coli, the bacterial homologue of DJ-1. Proteins in this group are classified as either DJ-1 putative peptidases or non-peptidase homologues in MEROPS peptidase family C56 (clan PC(C)).
In yeasts, vacuolar protein sorting-associated protein 30 (Vps30), also known as autophagy-related protein 6 (Atg6), is a common component of two distinct phosphatidylinositol 3-kinase complexes. In complex I, Atg14 links Vps30 to Vps34 lipid kinase and plays a specific role in autophagy, while in complex II, Vps38 links Vps30 to Vps34 and plays an important role in vacuolar protein sorting. The C-terminal region of Vps30 contains a globular fold comprising three β-sheet-α-helix repeats (also known as the beta-alpha repeated, autophagy-specific (BARA) domain) and is required for autophagy through the targeting of complex I to the pre-autophagosomal structure. The N-terminal region of Vps30 is required for vacuolar protein sorting. Beclin, the mammalian homologue of yeast Atg6/Vps30, is a tumour suppressor that coordinately regulates the autophagy and membrane trafficking involved in several physiological and pathological processes.
This domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. It contains an evolutionarily conserved signature, W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF, the site of interaction with proline-rich peptides. Proteins containing this domain include RME-8 (Required for receptor-mediated endocytosis 8), a DNAJC13 protein. RME-8 was first identified as a protein that is required for endocytosis in Caenorhabditis elegans. It coordinates the activity of the WASH complex with the function of the retromer SNX dimer to control endosomal tubulation. Proteins containing this domain also include Arabidopsis trithorax-related3 (Atxr3) and Tic56. Atxr3 is the major enzyme responsible for H3K4me3, which is critical for regulating gene expression and plant development. Tic56 is an essential subunit of a 1-MDa protein complex at the inner chloroplast envelope membrane. Tic56 also plays important roles in rRNA processing and chloroplast ribosome assembly.
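The conserved signature quoted above, W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF, can be read as a simple sequence pattern. The following is a minimal sketch of how it might be scanned for with a regular expression, assuming that X stands for any single residue and that X6-11 means six to eleven such residues. The function name `find_signature` and the toy sequence are hypothetical and for illustration only; real domain detection in a curated database relies on profile-based models rather than a literal pattern.

```python
import re

# Signature from the text: W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF.
# Assumption: X is any single residue, so "." is used for it
# (note that "." would also match non-amino-acid characters).
SIGNATURE = re.compile(r"W.Y.{6,11}GPF.{4}M.{2}W.{3}GYF")


def find_signature(sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in SIGNATURE.finditer(sequence.upper())]


# Synthetic, made-up sequence; it is not taken from any real protein.
toy = "MK" + "WAY" + "A" * 6 + "GPF" + "A" * 4 + "M" + "AA" + "W" + "AAA" + "GYF" + "KL"
hits = find_signature(toy)
print(hits)
```

A literal pattern like this is strict: a single substitution at any fixed position yields no hit, so it is better suited to a quick first-pass screen than to deciding whether a protein carries the domain.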
The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments. This domain is found close to the N terminus of yeast exportin 1 (Xpo1, Crm1), as well as adjacent to the N-terminal domain of importin-beta. Exportin 1 is a nuclear export receptor that translocates proteins out of the nucleus; it interacts with leucine-rich nuclear export signal (NES) sequences in proteins to be transported, as well as with RanGTP. Importin-beta is a nuclear import receptor that translocates proteins into the nucleus; it interacts with RanGTP and importin-alpha, the latter binding the nuclear localisation signal (NLS) sequences in proteins to be transported.
F-actin capping protein, alpha subunit, conserved site
Type: Conserved_site
Description:
The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction. The F-actin capping protein binds in a calcium-independent manner to the fast-growing ends of actin filaments (barbed ends), thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin, this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins. The alpha subunit is a protein of about 268 to 286 amino acid residues whose sequence is well conserved in eukaryotic species.
This superfamily represents the Spen Paralogue and Orthologue C-terminal (SPOC) domain and its structural homologues. This domain has a closed β-barrel fold of complex topology. Proteins that carry a SPOC-like domain include Spen proteins, such as SHARP (SMRT/HDAC1-associated repressor protein), which carry a SPOC domain at the C terminus and an RNA-binding motif at the N terminus. Spen proteins regulate the expression of key transcriptional effectors in diverse signalling pathways, the SHARP protein being a component of transcriptional repression complexes in both nuclear receptor and Notch/RBP-Jkappa signalling pathways. SPOC-like domains are also found in the middle domains of Ku70 and Ku80 (which include the C-terminal α-helical arm and the DNA-encircling insertion); the Ku heterodimer, which is composed of Ku70 and Ku80 subunits, contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by the non-homologous end-joining pathway.
The cysteine-rich secretory proteins (Crisp) are predominantly found in the mammalian male reproductive tract as well as in the venom of reptiles. This family includes mammalian testis-specific protein (Tpx-1), also known as cysteine-rich secretory protein 2 (CRISP2); venom allergen 5 from vespid wasps and venom allergen 3 from fire ants, which are potent allergens that mediate allergic reactions to the stings of insects of the order Hymenoptera; scoloptoxins from Scolopendra dehaani (Thai centipede); plant pathogenesis proteins of the PR-1 family, which are synthesised during pathogen infection or other stress-related responses; allurin, a sperm chemoattractant; and serotriflin, among others. The precise function of some of these proteins is still unclear. Tpx-1 (CRISP2) may regulate the activity of some ion channels and thereby regulate calcium fluxes during sperm capacitation. This entry also includes allergen Tab y 5.0101 from horsefly salivary glands.
This entry represents a conserved domain found in a group of sulphate transporters, known as the SLC26A/SulP family. These proteins contain an N-terminal membrane domain and a C-terminal cytoplasmic STAS (sulfate transporter and anti-sigma factor antagonist) domain. This central domain is usually found next to the STAS domain. Proteins containing this domain include: Neurospora crassa sulphate permease II (gene cys-14); yeast sulphate permeases (genes SUL1 and SUL2); rat sulphate anion transporter 1 (SAT-1); mammalian DTDST, a probable sulphate transporter which, in human, is involved in the genetic disease diastrophic dysplasia (DTD); sulphate transporters 1, 2 and 3 from the legume Stylosanthes hamata; human pendrin (gene PDS), which is involved in a number of hearing loss genetic diseases; human protein DRA (Down-Regulated in Adenoma); soybean early nodulin 70; Escherichia coli hypothetical protein ychM; and Caenorhabditis elegans hypothetical protein F41D9.5.
Cyclophilins exhibit peptidyl-prolyl cis-trans isomerase (PPIase) activity, accelerating protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. They also have protein chaperone-like functions and are the major high-affinity binding proteins for the immunosuppressive drug cyclosporin A (CSA) in vertebrates. Cyclophilins are found in all prokaryotes and eukaryotes, and have been structurally conserved throughout evolution, implying their importance in cellular function. They share a common 109 amino acid cyclophilin-like domain (CLD) and additional domains unique to each member of the family. The CLD contains the PPIase activity, while the unique domains are important for selection of protein substrates and subcellular compartmentalisation. This entry represents the cyclophilin peptidyl-prolyl cis-trans isomerase family. The family includes RING-type E3 ubiquitin-protein ligase PPIL2, which is thought to be an inactive PPIase.
The E2F family of transcription factors plays a crucial role in the control of the cell cycle and in the action of tumour suppressor proteins. This family consists of eight members and is divided into activators (E2F1-3) and repressors (E2F4-8) depending on cellular context, target gene and cofactors. The E2F proteins contain several evolutionarily conserved domains found in most members of the family. These domains include a DNA-binding domain, a dimerisation domain which determines interaction with the differentiation regulated transcription factor proteins (DP), a transactivation domain enriched in acidic amino acids, and a tumour suppressor protein association domain which is embedded within the transactivation domain. Classical E2Fs (E2F1-6) contain one DNA-binding domain and form heterodimers with DP proteins; atypical family members, E2F7 and E2F8, possess two DNA-binding domains and form homodimers or heterodimers, thus regulating transcription in a DP-independent manner.
A number of bacterial and archaeal proteins involved in transporting formate or nitrite have been shown to be related: FocA and FocB, from Escherichia coli, transporters involved in the bidirectional transport of formate; FdhC, from Methanobacterium formicicum and Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum), a probable formate transporter; NirC, from E. coli and Salmonella typhimurium, a probable nitrite transporter; Bacillus subtilis hypothetical protein YrhG; and B. subtilis hypothetical protein YwcJ (ipa-48R). The 70 kDa yeast hypothetical protein YHL008c is highly similar, in its N-terminal section, to the prokaryotic members of this family. These transporters are proteins of about 280 residues and seem to contain six transmembrane regions. This entry represents two conserved sites: the first is located in what seems to be a cytoplasmic loop between the second and third transmembrane domains; the second is part of the fourth transmembrane region.
The Escherichia coli phnB gene is found next to an operon of fourteen genes (phnC to phnP) related to the cleavage of carbon-phosphorus (C-P) bonds in unactivated alkylphosphonates, supporting bacterial growth on alkylphosphonates as the sole phosphorus source. It was originally considered part of that operon. PhnB appears to play no direct catalytic role in the usage of alkylphosphonate. PA2721, an uncharacterized protein from P. aeruginosa, also belongs to this family. Although many of the proteins in this family have been annotated as 3-demethylubiquinone-9 3-methyltransferase enzymes by automatic annotation programs, experimental evidence for this assignment is lacking. In Escherichia coli, the gene encoding the 3-demethylubiquinone-9 3-methyltransferase enzyme is ubiG, which belongs to the AdoMet-MTase protein family. PhnB-like proteins adopt a structural fold similar to bleomycin resistance proteins, glyoxalase I, and type I extradiol dioxygenases.
Trinucleotide repeat-containing gene 6C protein (TNRC6C) is one of three GW182 paralogs in mammalian genomes. It is enriched in P-bodies and important for efficient miRNA-mediated repression. TNRC6C is composed of an N-terminal glycine/tryptophan (G/W)-rich region containing an Ago hook responsible for Ago protein binding; a ubiquitin-associated (UBA) domain and a glutamine (Q)-rich region in the middle of the protein; and, towards the C terminus, a middle G/W-rich region, an RNA recognition motif (RRM), also called RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal G/W-rich region. A bipartite C-terminal region comprising the middle and C-terminal G/W-rich regions is referred to as the silencing domain; it triggers silencing of bound transcripts by inhibiting protein expression and promoting mRNA decay via deadenylation. The C-terminal half containing the RRM domain functions as a key effector domain mediating protein synthesis repression by TNRC6C.
This entry represents the second immunoglobulin (Ig) domain of nectin-1 (also known as poliovirus receptor related protein 1, PRR1, PVRL1 or CD111). Nectin-1 belongs to the nectin family, which comprises four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intercellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. In addition, nectins heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-1, for example, has been shown to trans-interact with Necl-1. Nectins also interact with various other proteins, including the actin filament (F-actin)-binding protein afadin. Mutation in the human nectin-1 gene is associated with cleft lip/palate ectodermal dysplasia syndrome (CLPED1). Nectin-1 is a major receptor for herpes simplex virus through interaction with the viral envelope glycoprotein D.
This entry represents the first immunoglobulin variable (IgV) domain of nectin-1 (also known as poliovirus receptor related protein 1, PRR1, PVRL1 or CD111). Nectin-1 belongs to the nectin family, which comprises four transmembrane glycoproteins (nectins-1 through -4). Nectins are synaptic cell adhesion molecules (CAMs) which facilitate adhesion and signaling at various intercellular junctions. Nectins form homophilic cis-dimers, followed by homophilic and heterophilic trans-dimers involved in cell-cell adhesion. In addition, nectins heterophilically trans-interact with other CAMs such as nectin-like molecules (Necls); nectin-1, for example, has been shown to trans-interact with Necl-1. Nectins also interact with various other proteins, including the actin filament (F-actin)-binding protein afadin. Mutation in the human nectin-1 gene is associated with cleft lip/palate ectodermal dysplasia syndrome (CLPED1). Nectin-1 is a major receptor for herpes simplex virus through interaction with the viral envelope glycoprotein D.
Smoothened (SMO) is a transmembrane G protein-coupled receptor that acts as the transducer of the hedgehog (HH) signaling pathway. SMO is activated by the hedgehog (HH) family of proteins acting on the 12-transmembrane domain receptor patched (PTCH), which constitutively inhibits SMO. Thus, in the absence of HH proteins, PTCH inhibits SMO signaling. Binding of HH to the PTCH receptor, by contrast, triggers its internalization and degradation, thereby releasing the PTCH inhibition of SMO. This allows SMO to trigger intracellular signaling and the subsequent activation of the Gli family of zinc finger transcription factors and induction of HH target gene expression (PTCH, Gli1, cyclin, Bcl-2, etc.). SMO is closely related to the frizzled (FZD) family of seven transmembrane-spanning proteins, which constitute a novel and separate family of G-protein coupled receptors.
The product of the breast cancer susceptibility gene BRCA1 contains at its C terminus two copies of a conserved domain that was named BRCT for BRCA1 C terminus. This domain of about 95 amino acids is found in a large variety of proteins involved in DNA repair, recombination and cell cycle control [, , ]. The BRCT domain is not limited to the C terminus of protein sequences and can be found in multiple copies or in a single copy, as in RAP1 and TdT. Some data [] indicate that the BRCT domain functions as a protein-protein interaction module. The structure of the first of the two C-terminal BRCT domains of the human DNA repair protein XRCC1 has been determined by X-ray crystallography; it comprises a four-stranded parallel β-sheet surrounded by three α-helices, which form an autonomously folded domain [].
The green fluorescent protein (GFP) is found in the jellyfish (Aequorea victoria), and functions as an energy-transfer acceptor. It fluoresces in vivo upon receiving energy from the Ca2+-activated photoprotein aequorin. The protein absorbs light maximally at 395 nm and exhibits a smaller absorbance peak at 470 nm. The fluorescence emission spectrum peaks at 509 nm with a shoulder at 540 nm. The protein is produced in the photocytes and contains a chromophore, which is composed of modified amino acid residues. The chromophore is formed upon cyclisation of the residues ser-dehydrotyr-gly. There are several other members of the GFP family; some fluoresce in different colours, and several are non-fluorescent []. These proteins are all essentially encoded by single genes, since both the substrate and the catalytic enzyme for pigment biosynthesis are provided within a single polypeptide chain [].
A family of bacterial proteins has been described that groups transcriptional repressors, sugar kinases and as yet uncharacterised open reading frames []. This family, known as ROK (Repressor, ORF, Kinase), includes the xylose operon repressor, xylR, from Bacillus subtilis, Lactobacillus pentosus and Staphylococcus xylosus; N-acetylglucosamine repressor, nagC, from Escherichia coli; glucokinase from Streptomyces coelicolor; fructokinase from Pediococcus pentosaceus, Streptococcus mutans and Zymomonas mobilis; allokinase and mlc from E. coli; and E. coli hypothetical proteins yajF and yhcI and the corresponding Haemophilus influenzae proteins. The repressor proteins (xylR and nagC) from this family possess an N-terminal region that is not present in the sugar kinases and that contains a helix-turn-helix DNA-binding motif.
This entry represents the bacterial N-acetylmannosamine kinase subfamily of the ROK family. Proteins in this entry catalyze the phosphorylation of N-acetylmannosamine (ManNAc) to ManNAc-6-P [, , ].
Tubulin-folding cofactor B (TBCB) is one of the protein cofactors (A through E) required for the folding of tubulins prior to heterodimer assembly and incorporation into microtubules []. These cofactors are involved in the biogenesis and degradation of alpha- and beta-tubulins to maintain the concentrated soluble pools required for cell homeostasis []. TBCB comprises an N-terminal ubiquitin-like (Ubl) domain and a C-terminal cytoskeleton-associated protein with glycine-rich segment (CAP-Gly) domain. The Ubl domain of TBCB is essential for proper folding and assembly of alpha-tubulin. It has a β-grasp Ubl fold, a common structure involved in protein-protein interactions. Ubiquitin (Ub) is a protein modifier in eukaryotes that is involved in various cellular processes, including transcriptional regulation, cell cycle control, and DNA repair. TBC-A through E are necessary for the biogenesis of microtubules and for cell viability [, , ].
This superfamily consists of haemolysin expression modulating protein (Hha) from Escherichia coli and its enterobacterial homologues, such as YmoA from Yersinia enterocolitica, and RmoA encoded on the R100 plasmid. These proteins act as modulators of bacterial gene expression. Members of the Hha/YmoA/RmoA family act in conjunction with members of the H-NS family, participating in the thermoregulation of different virulence factors and in plasmid transfer []. Hha, along with the chromatin-associated protein H-NS, is involved in the regulation of expression of the toxin alpha-haemolysin in response to osmolarity and temperature []. YmoA modulates the expression of various virulence factors, such as Yop proteins and YadA adhesin, in response to temperature. RmoA is a plasmid R100 modulator involved in plasmid transfer []. Members of this family display striking similarity to the oligomerization domain of the H-NS proteins.
This entry represents the SH2 domain of SH2B3. SH2B adapter protein 3 (SH2B3) belongs to the SH2B family of adapter proteins []. It is involved in the homeostasis of hematopoietic stem cells and lymphoid progenitors, and plays a tumour suppressor role in acute lymphoblastic leukemia [, , ]. SH2B3 (Lnk) may also influence inflammatory immune responses in peripheral lymphoid tissues []. A link has been established between polymorphism in this adapter protein and autoimmune diseases, including type 1 diabetes and celiac disease [, , ]. The SH2B family contains three adapter proteins: SH2B1, SH2B2 and SH2B3 []. Typical SH2B proteins contain an SH2 (Src homology 2) domain and a PH (pleckstrin homology) domain. They serve as adapters involved in signalling by the receptors for growth factors, such as insulin-like growth factor 1, platelet-derived growth factor and nerve growth factor [].
This alcohol dehydrogenase domain is located in the C-terminal part of a bifunctional two-domain protein. The N-terminal part of the protein contains an acetaldehyde-CoA dehydrogenase domain. This protein is involved in pyruvate metabolism. Pyruvate is converted to acetyl-CoA and formate by pyruvate formate-lyase (PFL). Under anaerobic conditions, acetyl-CoA is reduced to acetaldehyde and ethanol by this two-domain protein []. Acetyl-CoA is first converted into an enzyme-bound thiohemiacetal by the N-terminal acetaldehyde dehydrogenase domain. The enzyme-bound thiohemiacetal is subsequently reduced by the C-terminal NAD+-dependent alcohol dehydrogenase domain. In E. coli, this protein is called AdhE and was shown to have PFL deactivase activity, which is involved in the inactivation of PFL, a key enzyme in anaerobic metabolism []. In Escherichia coli and Entamoeba histolytica, this enzyme forms homopolymeric peptides composed of more than 20 protomers associated in a helical rod-like structure [].
This entry represents the C-terminal domain of ARMET.
ARMET, also known as mesencephalic astrocyte-derived neurotrophic factor (MANF) or arginine-rich protein, is a small protein of approximately 170 residues that contains four disulphide bridges that are highly conserved from nematodes to humans. It is a soluble protein resident in the endoplasmic reticulum (ER) and is induced by ER stress. It appears to be involved in handling misfolded proteins in the ER, and thus in quality control during ER stress []. ARMET from Rattus norvegicus (Rat) selectively promotes the survival of dopaminergic neurons of the ventral midbrain. It modulates GABAergic transmission to the dopaminergic neurons of the substantia nigra, and enhances spontaneous, as well as evoked, GABAergic inhibitory postsynaptic currents in dopaminergic neurons []. Proteins containing this domain include the related neurotrophic factor CDNF (cerebral dopamine neurotrophic factor) [].
The F-actin capping protein binds in a calcium-independent manner to the fast-growing ends of actin filaments (barbed ends), thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin, this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins []. This entry represents domain 1 found at the N terminus of the alpha subunit (CAPZA), which is a protein of about 268 to 286 amino acid residues whose sequence is well conserved in eukaryotic species []. In Drosophila, mutations in the alpha and beta subunits cause actin accumulation and subsequent retinal degeneration []. In humans, CAPZA is part of the WASH complex that controls the fission of endosomes [].
Flotillin proteins are membrane-bound chaperones that localize to lipid rafts, where they may recruit proteins that must be localized in lipid rafts to be active, and facilitate their interaction and oligomerization. Proteins in this entry include flotillin-1 (also known as reggie-2), flotillin-2 (also known as reggie-1) and their homologues. Flotillin-1 and flotillin-2 associate with membrane microdomains known as rafts []. They play a role in various cellular processes such as insulin signaling, T cell activation, membrane trafficking, phagocytosis, and epidermal growth factor receptor signaling []. This entry also includes bacterial homologues of flotillin-1. These bacterial proteins are found in membrane microdomains that may be equivalent to eukaryotic membrane rafts []. Like eukaryotic flotillin proteins, flotillins in bacteria play an essential role in organizing and maintaining the correct architecture of the functional membrane microdomains [].
This is one of two Tudor-like domains found in the N-terminal region of RapA proteins. RapA is an abundant RNAP-associated protein with a molecular weight of 110 kDa and ATPase activity. It forms a stable complex with the RNAP core enzyme, but not with the holoenzyme. The ATPase activity of RapA increases upon its binding to RNAP []. The N-terminal region of RapA contains two copies of a Tudor-like domain, both folded as a highly bent antiparallel β-sheet. This fold is also found in transcription factor NusG, ribosomal protein L24, human SMN (survival of motor neuron) protein, mammalian DNA repair factor 53BP1, putative fission yeast DNA repair factor Crb2 and the bacterial transcription-repair coupling factor known as Mfd. The functional roles of the N-terminal region homologs in these proteins suggest that the Tudor-like domains of RapA may interact with both nucleic acids and RNAP [].