Hematopoietic lineage cell-specific protein-1 (HS1) binding protein 3 (HS1BP3) associates with HS1 proteins through their SH3 domains, suggesting a role in mediating signaling. It has been reported that HS1BP3 might affect the IL-2 signaling pathway in hematopoietic lineage cells. Mutations in HS1BP3 may also be associated with familial Parkinson disease and essential tremor. HS1BP3 contains a PX domain, a leucine zipper, motifs similar to immunoreceptor tyrosine-based inhibitory motifs, and proline-rich regions. This entry represents the PX domain of HS1BP3. In general, the PX domain interacts with phosphoinositides (PIs) and plays a role in targeting proteins to PI-enriched membranes.
The flagellar motor switch in Escherichia coli and Salmonella typhimurium regulates the direction of flagellar rotation and hence controls swimming behaviour. The switch is a complex apparatus that responds to signals transduced by the chemotaxis sensory signalling system during chemotactic behaviour. CheY, the chemotaxis response regulator, is believed to act directly on the switch to induce tumbles in the swimming pattern, but no physical interactions of CheY and switch proteins have yet been demonstrated. The switch complex comprises at least three proteins: FliG, FliM and FliN. It has been shown that FliG interacts with FliM, FliM interacts with itself, and FliM interacts with FliN. Several residues within the middle third of FliG appear to be strongly involved in the FliG-FliM interaction, with residues near the N or C termini being less important. Such clustering suggests that the FliG-FliM interaction plays a central role in switching. Analysis of the FliG, FliM and FliN sequences shows that none is especially hydrophobic or appears to be an integral membrane protein. This result is consistent with other evidence suggesting that the proteins may be peripheral to the membrane, possibly mounted on the basal body M ring.
BipA (also called TypA) is a highly conserved protein with global regulatory properties in Escherichia coli. Mutants show altered regulation of some pathways. BipA is a 50S ribosomal subunit assembly protein with GTPase activity that is required for 50S subunit assembly at low temperatures; it also functions as a translation factor that is required specifically for the expression of the transcriptional modulator Fis. BipA binds to ribosomes at a site that coincides with that of EF-G and has a GTPase activity that is sensitive to high GDP:GTP ratios and is stimulated by 70S ribosomes programmed with mRNA and aminoacylated tRNAs. The growth rate-dependent induction of BipA allows the efficient expression of Fis, thereby modulating a range of downstream processes, including DNA metabolism and type III secretion. This GTPase impacts interactions between enteropathogenic E. coli (EPEC) and epithelial cells and also has an effect on motility. It appears to be involved in the regulation of several processes important for infection, including rearrangements of the cytoskeleton of the host, bacterial resistance to host defence peptides, flagellum-mediated cell motility, and expression of K5 capsular genes. This entry also includes TypA-like SVR3 from Arabidopsis, a putative chloroplastic elongation factor involved in response to chilling stress. It is required for proper chloroplast rRNA processing and/or translation at low temperature and is also involved in plastid protein homeostasis. This entry represents domain III of BipA/TypA, which adopts an α/β structure.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 15 (KLHL15) is a substrate-specific adaptor for the Cullin3 E3 ubiquitin-protein ligase complex that targets the serine/threonine-protein phosphatase 2A (PP2A) subunit PPP2R5B for ubiquitination and subsequent proteasomal degradation, thus promoting exchange with other regulatory subunits. It also plays a key role in the DNA damage response, favoring DNA double-strand break repair through error-prone non-homologous end joining (NHEJ) over error-free, RBBP8-mediated homologous recombination (HR), by targeting the DNA-end resection factor RBBP8/CtIP for ubiquitination and subsequent proteasomal degradation. KLHL15 contains a BTB domain and kelch repeats, characteristic of a kelch family protein. This entry represents the BACK domain of KLHL15.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 32 (KLHL32), also called BTB and kelch domain-containing protein 5 (BKLHD5), contains a BTB domain and kelch repeats, characteristic of a kelch family protein. Its function remains unclear. Deletion of KLHL32 may be associated with Tourette syndrome and obsessive-compulsive disorder. This entry represents the BACK domain of KLHL32.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 14 (KLHL14) belongs to the KLHL family. It is also known as Printor (protein interactor of torsinA). It selectively binds to the ATP-free form, but not to the ATP-bound form, of torsinA, suggesting a role for Printor as a cofactor rather than a substrate of the AAA+ protein torsinA, and it is implicated in dystonia pathogenesis. This entry represents the BACK domain.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 24 (KLHL24, also known as kainate receptor-interacting protein for GluR6 (KRIP6) or protein DRE1) belongs to the KLHL family. It is necessary to maintain the balance between intermediate filament stability and degradation, a process that is essential for skin integrity. KLHL24 is a component of a BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex that mediates ubiquitination of KRT14 and controls its levels during keratinocyte differentiation. KLHL24 binds to and regulates the GluR6a kainate receptor. It also modulates the interaction of PICK1 with GluR6 kainate receptors. Kainate receptors (KAR) are ionotropic receptors that respond to the neurotransmitter glutamate and have been implicated in epilepsy, stroke, Alzheimer's disease and neuropathic pain. This entry represents the BACK domain.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 21 (KLHL21) is a substrate adaptor protein in the Cul3-KLHL21 E3 ubiquitin ligase complex required for efficient chromosome alignment and cytokinesis. During cytokinesis, it localises to midzone microtubules in anaphase and recruits aurora B and Cul3 to this region. KLHL21 also targets IkappaB kinase-beta to negatively regulate nuclear factor kappa-light-chain-enhancer of activated B cells (NF-kappaB) signalling. This entry represents the BACK domain.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 23 (KLHL23) belongs to the KLHL family. KLHL23 overexpression is associated with increased cell proliferation and invasion in gastric cancer. Downregulation of KLHL23 is associated with invasion, metastasis, and poor prognosis of hepatocellular carcinoma and pancreatic cancer. This entry represents the BACK domain.
IPP (also known as Kelch-like protein 27) belongs to the KLHL family. It binds to actin through its kelch repeat domain. It may play a role in organizing the actin cytoskeleton; however, its exact function is not clear. This entry represents the BACK domain.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. Kelch-like protein 7 (KLHL7) belongs to the KLHL family. It serves as a substrate-specific adapter in the Cul3-dependent ubiquitin ligase complex and is linked to autosomal dominant retinitis pigmentosa (adRP). This entry represents the BACK domain.
Barley yellow dwarf virus (BYDV) can be separated into two groups based on serological relationships, presumably governed by the viral capsid structure. Coding regions of coat proteins have been identified for the MAV-PS1, P-PAV (group 1) and NY-RPV (group 2) isolates of BYDV. Group 1 proteins show 71% sequence similarity to each other, 51% similarity to those of group 2, and a high degree of similarity to those from other luteoviruses (including coat proteins from Beet western yellows virus (BWYV) and Potato leafroll virus (PLrV)). Among luteovirus coat protein sequences in general, several highly conserved domains can be identified, while other domains differentiate group 1 isolates from group 2 and other luteoviruses. Sequence comparisons between the genomes of PLrV, BWYV and BYDV have revealed ~65% protein sequence similarity between the capsid proteins of BWYV and PLrV and ~45% similarity between BYDV and PLrV. The N-terminal regions of these sequences, like those of many plant virus capsid proteins, are highly basic. These regions may be involved in protein-RNA interaction.
Members of this group are bacterial microcompartment shell proteins: PduT of Salmonella enterica and its orthologs in the propanediol and ethanolamine operons of bacteria. Some non-autotrophic organisms form polyhedral organelles, enterosomes, that resemble the carboxysomes found in autotrophs, particularly the cyanobacteria. Carboxysomes are well-studied polyhedral organelles found in cyanobacteria and some chemoautotrophs. They are composed of a proteinaceous shell that houses most of the cell's ribulose bisphosphate carboxylase/oxygenase (RuBisCO). They are required for autotrophic growth at low CO2 concentrations and are thought to function as part of a CO2-concentrating mechanism. Polyhedral organelles, enterosomes, from non-autotrophic organisms are involved in coenzyme B12-dependent 1,2-propanediol utilization (e.g., in S. enterica) and ethanolamine utilization (e.g., in Salmonella typhimurium). Genes needed for enterosome formation are located in the 1,2-propanediol utilization (pdu) or ethanolamine utilization (eut) operons, respectively. Although enterosomes of non-autotrophic organisms are apparently related to carboxysomes structurally, a functional relationship is uncertain. A role in CO2 concentration, similar to that of the carboxysome, is unlikely since there is no known association between CO2 and coenzyme B12-dependent 1,2-propanediol or ethanolamine utilization. In S. enterica the propanediol-degrading enterosome consists of at least 15 proteins, of which at least seven are shell proteins: pduA, pduB, B', J, K, T and U. In addition, the organelle contains four enzymes: B12-dependent diol dehydratase (pduCDE) and its reactivating factor (pduGH), CoA-dependent propionaldehyde dehydrogenase (pduP) and adenosyl transferase (pduO). It has been suggested that enterosomes sequester toxic aldehydes formed during both 1,2-propanediol and ethanolamine degradation and channel them to subsequent pathway enzymes. It has also been suggested that polyhedra might be used to protect diol dehydratase and ethanolamine ammonia-lyase from oxygen, to which both are sensitive. Mutational studies of PduA indicate that the organelles of S. enterica are not involved in concentrating 1,2-propanediol or CN-B12, but are consistent with a role in moderating aldehyde toxicity.
Calcium/calmodulin-dependent protein kinase II, association-domain
Type:
Domain
Description:
Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process.
Protein kinases fall into three broad classes, characterised with respect to substrate specificity: serine/threonine-protein kinases, tyrosine-protein kinases, and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases. This domain is found at the C terminus of the calcium/calmodulin-dependent protein kinases II (CaMKII). These proteins also have a Ser/Thr protein kinase domain at their N terminus. The function of the CaMKII association domain is the assembly of the single proteins into large multimers (8 to 14 subunits). CaMKII is a prominent kinase in the central nervous system that may function in long-term potentiation and neurotransmitter release.
Several members of this family are annotated as being ATP/GTP-binding site motif A (P-loop) proteins, but this has not been confirmed. The one structure solved for this family exhibits an immunoglobulin-like β-sandwich fold. Crystal packing suggests that a tetramer is a significant oligomerisation state, and a disulphide bridge is formed between Cys 125, at the C-terminal end of the monomer, and Cys 69.
This family of proteins is functionally uncharacterised. Proteins in this family are found in bacteria and contain a single completely conserved residue Q that may be functionally important.
This family of bacterial proteins is functionally uncharacterised. Proteins in this family contain two completely conserved residues (P and S) that may be functionally important.
This family of proteins is functionally uncharacterised. Proteins in this family contain two completely conserved residues (G and D) that may be functionally important.
This family of proteins is functionally uncharacterised. Proteins in this family contain a single completely conserved residue E that may be functionally important.
This domain is found in enhanced intracellular survival protein Eis from Mycobacterium tuberculosis. It shares protein sequence similarity with the SCP2 sterol-binding domain. Eis may participate in pathogenesis, possibly by enhancing survival of the bacteria in host macrophages during infection.
This is the highly conserved central region of Gas8 proteins. Growth arrest-specific protein 8 (Gas8) is a microtubule-binding protein localised to regions of dynein regulation in mammalian cells. In mouse, Gas8 is predominantly a testicular protein, whose expression is developmentally regulated during puberty and spermatogenesis. In humans, it is absent in infertile males who lack the ability to generate gametes. The localisation of Gas8 in the motility apparatus of post-meiotic gametocytes and mature spermatozoa, together with the detection of Gas8 in cilia at the apical surfaces of epithelial cells lining the pulmonary bronchi and Fallopian tubes, suggests that the Gas8 protein may have a role in the functioning of motile cellular appendages.
This family of proteins is uncharacterised. Proteins in this family contain a single completely conserved residue E that may be functionally important.
This entry represents the periplasmic mercury(II)-binding protein (MerP) of the bacterial mercury detoxification system, which passes mercuric ion to the MerT transporter for subsequent reduction to Hg(0) by the mercuric reductase MerA. MerP contains a distinctive GMTCXXC motif associated with metal binding. MerP is related to a larger family of metal-binding proteins.
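As a rough illustration of how a motif such as GMTCXXC could be located in a protein sequence, the following Python sketch writes it as a regular expression. It assumes that X stands for any of the 20 standard amino acids. The function name `find_motif` and the toy sequence are hypothetical and are not taken from any MerP record.

```python
import re

# GMTCXXC as a regular expression. X is assumed here to mean any one of the
# 20 standard amino acids (an assumption about the motif notation).
MERP_MOTIF = re.compile(r"GMTC[ACDEFGHIKLMNPQRSTVWY]{2}C")


def find_motif(sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group())
            for m in MERP_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration only, not a real MerP sequence.
toy = "MKKLAATLTVPGMTCAACPITVKKAL"
print(find_motif(toy))
```

A match only indicates that the sequence pattern is present; it does not by itself show that a protein binds metal.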
Bluetongue virus (BTV) is a representative of the Orbivirus genus of the Reoviridae. Orbiviruses infect mammalian hosts through insect vectors, causing economically important diseases of domesticated animals. They possess a segmented, double-stranded RNA genome within a capsid that comprises four major polypeptides, designated VP2, VP3, VP5 and VP7. On entering a target cell, an outer layer, formed from VP2 and VP5, is removed, leaving an intact core within the cell. The core, which is 70 nm across, contains 780 copies of VP7, which together form 260 trimeric 'bristly' capsomeres clothing an inner scaffold constructed from VP3. The 3D structure of VP7 reveals two domains, one a β-sandwich, the other a bundle of α-helices, and a short C-terminal arm, which is thought to unite trimers during capsid formation. A concentration of methionine residues at the core of the molecule could provide plasticity, relieving structural mismatches during assembly. The 3D structure of baculovirus-expressed core protein VP7 of African horse sickness virus 4 (AHSV-4) has been determined to 2.3 Å resolution. During crystallisation, the two-domain protein is cleaved, leaving only the top domain, in a manner reminiscent of BTV VP7; this suggests that connections between top and bottom domains are relatively weak for these two distinct orbiviruses. The top domains of both BTV and AHSV VP7 are trimeric and structurally very similar. Electron density maps indicate an extra density feature along their molecular 3-fold axes, probably the result of an unidentified ion. The characteristics of the molecular surface indicate that, for both of these orbiviruses, attachment to the cell may occur through binding of an Arg-Gly-Asp (RGD) motif in the top domain of VP7 to a cellular integrin.
This entry represents the PH domain of DIP13A/B.
This entry includes DCC-interacting protein 13-alpha/beta from humans (DIP13A/B, also known as APPL1/2) and similar proteins predominantly found in vertebrates. DIP13A/B are multifunctional adapter proteins that bind to various membrane receptors, nuclear factors and signalling proteins to regulate many processes, such as cell proliferation, immune response, endosomal trafficking and cell metabolism. DIP13B may also affect adult neurogenesis in the hippocampus and olfactory system by regulating the sensitivity of the glucocorticoid receptor. These proteins consist of a BAR and a PH domain near the N terminus, and the two domains are thought to function as a unit (BAR-PH domain). At the C terminus, they have a PTB domain. Lipid binding assays show that the BAR, PH, and PTB domains can bind phospholipids.
This entry represents the N-terminal domain of major prion protein. The proteins consist mainly of alpha helices. Bovine prion has been shown to form a stable helix which inserts in a transmembrane location in the bilayer, with the N terminus functioning as a cell-penetrating peptide.
Nicotinic acid mononucleotide biosynthesis protein
Type:
Family
Description:
This group contains uncharacterised proteins that are implicated in nicotinic acid mononucleotide (NMN) biosynthesis based on the genomic context of the corresponding genes (operon structure, gene neighbourhood). The Rhizobium loti (Mesorhizobium loti) member (Msi362, ORF1) is encoded by the symbiosis island that contains operons required for the syntheses of nicotinic acid mononucleotide (NMN) and biotin, and belongs to the nad operon. The Neisseria meningitidis member is flanked by nadA and nadC genes whose products are part of the pathway for NMN synthesis in Escherichia coli. However, mutation in ORF1 failed to produce the vitamin auxotrophic phenotype, which suggests that members of this group may have a non-essential role in this process.
ZZ-type zinc finger-containing protein 3 (ZZZ3) is a component of the ATAC complex. ZZZ3 contains an HTH myb-type domain (PDB:2YUM) and a ZZ-type zinc finger (PDB:2FC7). The human ADA2A-containing complex (ATAC) is a histone acetyltransferase acting on histones H3 and H4 that is similar to, but distinct from, a complex initially identified in Drosophila melanogaster. The ATAC complex is composed of KAT14, KAT2A, TADA2L, TADA3L, ZZZ3, MBIP, WDR5, YEATS2, CCDC101 and DR1.
Syntaxin binding protein 6 (STXBP6, also called Amisyn) contains, in addition to the N-terminal PH-like domain, a C-terminal R-SNARE-like domain that allows it to assemble into SNARE complexes; this in turn makes the complexes inactive and inhibits exocytosis. SNARE complexes mediate membrane fusion, which is important for trafficking of newly synthesized proteins, recycling of pre-existing proteins and organelle formation. SNARE proteins are classified into four groups, Qa-, Qb-, Qc- and R-SNAREs, with STXBP6 being an R-SNARE. This entry represents the PH domain of STXBP6.
This entry represents the basic Helix-Loop-Helix-zipper (bHLHzip) domain found in CBF1 from Saccharomyces cerevisiae and similar fungal proteins. This domain is also found in psilocybin cluster transcription regulator (PsiR) from psychedelic mushrooms, a transcription factor that may regulate the expression of the gene cluster that mediates the biosynthesis of psilocybin, a psychotropic tryptamine-derived natural product. CBP-1, also termed centromere promoter factor 1 (CPF1) or centromere-binding factor 1 (CBF1), is a bHLHzip protein that is required for chromosome stability and methionine prototrophy. It binds as a homodimer to the centromere DNA element I (CDEI, GTCACATG) region of the centromere that is required for optimal centromere function.
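To make the CDEI sequence quoted above concrete, here is a minimal Python sketch that scans a DNA string for GTCACATG on both strands. The helper names (`reverse_complement`, `find_sites`) and the toy sequence are hypothetical, and an exact-match scan is only a simplification: it says nothing about the actual binding specificity of CBF1.

```python
CDEI = "GTCACATG"  # CDEI sequence as quoted in the entry text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_sites(dna, site=CDEI):
    """Return sorted (1-based start, strand) pairs for exact matches.

    '+' marks a match to the site itself, '-' a match to its reverse
    complement (that is, the site lying on the opposite strand).
    """
    dna = dna.upper()
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = dna.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical toy sequence, not a real centromere sequence.
toy_dna = "AAGTCACATGTTTCATGTGACAA"
print(find_sites(toy_dna))
```

Scanning both strands matters here because GTCACATG is not its own reverse complement, so a site on the opposite strand appears as CATGTGAC in the given sequence.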
AUTS2 is a novel gene identified in a monozygotic twin pair with autism; its translation product is a large protein containing 1,295 amino acids. Following DNA sequence analysis of autism subjects and controls, no autism-specific mutation was observed. Association and linkage analyses were also negative. It is hence considered unlikely that AUTS2 is a susceptibility gene for idiopathic autism, although it may be the gene responsible for the disorder in the twins in this study. The AUTS2 gene product shares a high level of similarity with the so-called fibrosin-1-like protein, a functionally uncharacterised polypeptide that contains C-terminal Ala-rich and Pro-rich regions.
RMP1, also called RNA-processing protein RMP1 or RNase MRP 23.6 kDa subunit, functions as part of ribonuclease MRP (RNase MRP), which is involved in rRNA processing in mitochondria. The RNase MRP complex consists of an RNA moiety and at least 10 protein subunits including POP1, POP3, POP4, POP5, POP6, POP7, POP8, RMP1, RPP1 and SNM1, many of which are shared with the RNase P complex. RMP1 is required for proper rRNA processing.
SPACA1 (also known as SAMP32) is localized to the acrosome of spermatozoa. The acrosome is an organelle derived from the Golgi apparatus that forms a cap over the anterior portion of the sperm head, which contains the sperm nucleus. Mammalian acrosomes contain digestive enzymes that degrade the ovum outer membrane (zona pellucida) to allow fusion of the sperm and ovum nuclei via the acrosomal reaction. In mammals, the acrosome releases hyaluronidase and acrosin. Antibodies against recombinant SAMP32 inhibit both the binding and the fusion of human sperm to zona-free hamster eggs. Male mice lacking SPACA1 are infertile and exhibit globozoospermia-like malformed sperm heads. SPACA1 content has been reported to be diminished in round-headed spermatozoa compared with normal spermatozoa.
Bms1 is an essential, evolutionarily conserved, nucleolar protein. Its depletion interferes with processing of the 35S pre-rRNA at sites A0, A1, and A2, and the formation of 40S subunits. Bms1, the putative endonuclease Rcl1, and the essential U3 small nucleolar RNA form a stable subcomplex that is believed to control an early step in the formation of the 40S subunit. The N terminus of Bms1 contains a guanine nucleotide-binding (G) domain that functions intramolecularly. It is believed that Rcl1 activates Bms1 by acting as a guanine-nucleotide exchange factor (GEF) to promote GDP/GTP exchange, and that activated (GTP-bound) Bms1 delivers Rcl1 to the preribosomes. This entry represents the N-terminal domain of Bms1.
This is a family of unknown function mainly found in bacteria and archaea. A number of members of this family have a coiled-coil domain at their C terminus.
Salmonella invasion protein A (SipA) is a virulence factor that is translocated into host cells by a type III secretion system. In the host cell it binds to actin, stimulates actin polymerisation and counteracts F-actin destabilising proteins. This contributes towards cytoskeletal rearrangements that allow the entry of the pathogen into the host cell. The chaperone-binding domain of SipA consists of a globular domain, represented by this entry, and an adjacent nonglobular polypeptide. Both of these elements are necessary for chaperone binding to occur. The globular domain is composed of eight alpha helices arranged so that six amphipathic helices surround a predominantly hydrophobic helix in the middle.
Nuclear cap-binding protein subunit 2 (NCBP2, also known as CBC2 and CBP20) forms the CBC complex with the nuclear cap-binding protein subunit 1. The CBC complex binds co-transcriptionally to the 5' cap of pre-mRNAs and is involved in maturation, export and degradation of nuclear mRNAs. In humans, the CBC complex is also involved in mediating the export of U snRNAs and intronless mRNAs from the nucleus and plays a central role in nonsense-mediated mRNA decay (NMD). During cell proliferation, the CBC complex is involved in microRNA (miRNA) biogenesis via its interaction with SRRT/ARS2, thereby being required for miRNA-mediated RNA interference.
This presumed domain is functionally uncharacterised. It is found in eukaryotes, is about 120 amino acids in length, and has a conserved CDCGGWD sequence motif.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. L15 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L15 is known to bind the 23S rRNA. Ribosomal protein L15 from bacteria and plant chloroplasts (nuclear-encoded) belongs to this family. Vertebrate L27a, Tetrahymena thermophila L29 and fungal L27a (L29, CRP-1, CYH2) are also members of this group. Ribosomal L18E protein from a number of archaebacteria shows homology to both the eukaryotic L18 and eubacterial ribosomal protein L15, an observation which has been seen to substantiate the belief that archaea represent an evolutionary stage between bacteria and eukaryotes [
]. This signature covers a conserved region in the C-terminal section of these proteins.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence [
] has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins [,
] that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
This entry represents a conserved site in the C-terminal section of the Ribosomal S15 and S19 proteins.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA and plays a significant role during initiation, elongation, and termination of protein synthesis. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [
], groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. L11 consists of a 23S rRNA binding C-terminal domain and an N-terminal domain that directly contacts protein synthesis factors. These two domains are joined by a flexible linker that allows inter-domain movement during protein synthesis. While the C-terminal domain of L11 binds RNA tightly, the N-terminal domain makes only limited contacts with RNA and is proposed to function as a switch that reversibly associates with an adjacent region of RNA [,
,
,
]. In E. coli, the C-terminal half of L11 has been shown [] to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure. This entry identifies a conserved region located in the C-terminal section of these proteins.
This entry represents a group of proteins from eukaryotes and bacteria that may have chaperone activity and be involved in F1 ATPase complex assembly. The eukaryotic proteins include yeast ATP12 [
] and its mammalian homologue ATPAF2 (ATP synthase mitochondrial F1 complex assembly factor 2) [], which are required for assembly of the mitochondrial F1-ATPase. Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of the protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesised as precursors with N-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment [
].
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. Ribosomal protein L13 is one of the proteins from the large ribosomal subunit [
]. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit. The signature pattern of this entry is a conserved region located in the C-terminal section of these proteins.
XRCC4 is essential for non-homologous DNA end joining (NHEJ) in eukaryotes, which is required for double-strand break repair and for V(D)J recombination in immunoglobulin and T-cell receptor genes. XRCC4 forms a complex with DNA ligase IV, and acts as a regulatory element required for the stability and activity of the ligase. XRCC4 forms an elongated dumb-bell-like tetramer consisting of a C-terminal stalk that interacts with DNA ligase IV and an N-terminal globular head. The C-terminal oligomerisation domain consists of oligomers of short identical helices that form parallel coiled-coils [
,
]. This superfamily also matches the C terminus of the coiled-coil myosin heavy chain tail region. Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains [], 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail [].
Chloroplast function requires the import of nuclear-encoded proteins from the cytoplasm across the chloroplast double membrane. This is accomplished by two protein complexes, the Toc complex located at the outer membrane and the Tic complex located at the inner membrane [
]. The Toc complex recognises specific proteins by a cleavable N-terminal sequence and is primarily responsible for translocation through the outer membrane, while the Tic complex translocates the protein through the inner membrane. This entry represents Toc75, a core component of the Toc complex. This protein is deeply embedded in the outer membrane and forms the voltage-dependent translocation channel [
]. Toc75 itself appears to be capable of at least some discrimination between substrate and non-substrate proteins, with recognition based on both conformational and electrostatic interactions.
Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication [
]. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. MutS is a modular protein with a complex structure [
], and is composed of: an N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease; a connector domain, which is similar in structure to Holliday junction resolvase RuvC; a core domain, which is composed of two separate subdomains that join together to form a helical bundle, and from within which two helices act as levers that extend towards (but do not touch) the DNA; a clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices and has a β-sheet structure; an ATPase domain (connected to the core domain), which has a classical Walker A motif; and an HTH (helix-turn-helix) domain, which is involved in dimer contacts. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [
]. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts []. This entry represents a family of MutS proteins.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [,
]. Ribosomal protein L27 is found in fungi, plants, algae and vertebrates
[,
]. The family has a specific signature at the C terminus.
This family of proteins is found in eukaryotes. Proteins in this family are typically between 253 and 329 amino acids in length. There are two conserved sequence motifs: LLGYP and SFS.
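The description above can be turned into a crude screen for candidate sequences. The following is a minimal sketch, assuming that membership can be approximated by the stated length range plus the presence of both conserved motifs; the function names and the toy sequence are hypothetical and are not part of the entry, and a real assignment would rely on the family's profile model rather than exact string matches.

```python
import re

# Conserved motifs and typical length range quoted in the family description.
MOTIFS = ("LLGYP", "SFS")
MIN_LEN, MAX_LEN = 253, 329


def motif_positions(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each conserved motif (overlaps allowed)."""
    seq = sequence.upper()
    return {
        motif: [m.start() + 1 for m in re.finditer(f"(?={motif})", seq)]
        for motif in MOTIFS
    }


def looks_like_member(sequence: str) -> bool:
    """Crude screen (an assumption, not the entry's method):
    typical length and at least one copy of every motif."""
    hits = motif_positions(sequence)
    return MIN_LEN <= len(sequence) <= MAX_LEN and all(hits.values())


# Hypothetical toy sequence built for illustration; not a real family member.
toy = "M" + "A" * 100 + "LLGYP" + "G" * 100 + "SFS" + "K" * 60

print(motif_positions(toy))
print(looks_like_member(toy))
```

Exact matching is deliberately strict here: a single substitution in either motif makes the screen fail, which is why it should be read only as an illustration of the two stated criteria.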
This family includes Saccharomyces cerevisiae type 1 protein phosphatase inhibitor Ypi1 [
] and human protein phosphatase 1 regulatory subunit 11 (Ppp1r11/hcgv), an atypical E3 ubiquitin-protein ligase also known to be a PP1 inhibitor [,
].
PDDEXK_6 is a family of plant proteins that are distant homologues of the PD-(D/E)XK nuclease superfamily. The core structure is retained, as α-β-β-β-α-β. It retains the characteristic PDDEXK motifs II and III in modified forms: an xDxxx motif located in the second core β-strand, where x is any hydrophobic residue, and a (D/E)X(D/N/S/C/G) pattern. The missing positively charged residue in motif III is possibly replaced by a conserved arginine in motif IV located in the proceeding α-helix []. The family is not in general fused with any other domains, so its function cannot be predicted []. This family of uncharacterised plant proteins is defined by a region found toward the C terminus. This region is strongly conserved (greater than 30% sequence identity between most pairs of members) but flanked by highly divergent regions including stretches of low-complexity sequence.
This presumed domain is functionally uncharacterised. This domain is found in eukaryotes. This domain is about 50 amino acids in length. This domain has two completely conserved residues (Y and K) that may be functionally important.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups: eubacterial S16; algal and plant chloroplast S16; cyanelle S16; and Neurospora crassa mitochondrial S24 (cyt-21).
S16 proteins have about 100 amino-acid residues. There are two paralogues in Arabidopsis thaliana, RPS16-1 (chloroplastic) and RPS16-2 (targeted to the chloroplast and the mitochondrion) []. This superfamily represents the structural domain of ribosomal S16 proteins, consisting of an α-β 2-layer sandwich.
This domain family is found in eukaryotes, and is approximately 40 amino acids in length. There is a single completely conserved residue N that may be functionally important.