DAZ-associated protein 2 has a highly conserved sequence throughout evolution, including a conserved polyproline region and several SH2/SH3 binding sites. It occurs as a single-copy gene with a four-exon organisation and is located on chromosome 12. It encodes a ubiquitously expressed protein and binds to DAZ and DAZL1 through DAZ repeats.
This entry represents the non-structural protein NS6 (also known as ORF6, accessory protein 6, or X3 protein), which is highly conserved among SARS-related coronaviruses (this is distinct from NSP6, which is encoded on the replicase polyprotein). NS6 is located in the endoplasmic reticulum. It has been reported that NS6 can increase cellular gene synthesis and can also induce apoptosis through Jun N-terminal kinase and Caspase-3 mediated stress. This protein can modulate host antiviral responses by inhibiting synthesis and signalling of interferon-beta (IFN-beta) via two complementary pathways. One involves NS6 interaction with the host N-Myc (and STAT) interactor (Nmi) protein, inducing its degradation via the ubiquitin proteasome pathway and thereby suppressing Nmi-enhanced IFN signalling. The other pathway suppresses the translocation of signal transducer and activator of transcription 1 (STAT1) and downstream IFN signalling. This protein interacts with Rae1 and Nup98 to prevent both nuclear import and export, which renders host cells incapable of responding to SARS-CoV-2 infection.
All herpesviruses contain a ubiquitin (Ub)-specific cysteine protease (USP) domain embedded within their large tegument protein. The herpesvirus tegument Ub-specific protease (htUSP) domain of ~200 amino acids adopts an α-β-α sandwich fold that features a central catalytic cleft, ideally suited to accommodate the C-terminal stretch of Ub. The catalytic triad Cys-His-Asp is strictly conserved, along with a putative oxyanion hole-forming Gln residue. The htUSP domain is a member of peptidase family C76 of clan CA. BPLF1, the Epstein-Barr-virus-encoded member of this protease family, is a deneddylase that regulates virus production by modulating the activity of cullin-RING ligases. Homologues encoded by other herpesviruses share the deneddylase activity.
This entry includes cytoplasmic envelopment protein 3 (or pp28) from herpesviruses. The pp28 protein from human cytomegalovirus is encoded by the UL99 gene. Protein pp28 is myristylated and found in the tegument of the virus particle. Mutation studies have shown that pp28 is essential for the virus to acquire its final membrane. The pp28 protein can be detected in a vesicle budded off from the Golgi, where the virus acquires its outer membrane. In order to be correctly localized, pp28 forms a complex with UL94.
This entry includes the ORF8 gene products (also known as NS8, accessory protein 8) from human SARS coronavirus (SARS-CoV), SARS-CoV-2, Bat coronavirus HKU3 and pangolin coronaviruses. ORF8 is an accessory protein that is not shared by all members of subgenus Sarbecovirus. The presence and location of ORF8 in the SARS-CoV-2 genome has led to its classification with SARS-CoV. ORF8 is a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate transmission between hosts. ORF8 has been suggested to be one of the relevant genes in the study of human adaptation of the virus. The ORF8 protein is a fast-evolving protein in SARS-related CoVs, with a tendency to recombine and undergo deletions. During the early phases of the SARS (SARS-CoV) epidemic in 2002, human isolates were found to possess a unique continuous ORF8 with 366 nucleotides and a predicted protein with 122 amino acids. During the middle and late phases of the SARS epidemic, two functional ORFs (ORF8a and ORF8b) emerged; they are predicted to encode two small proteins, 8a with 39 amino acids and 8b with 84 amino acids. Interestingly, SARS-CoV-2 ORF8 has not undergone any significantly measurable deletion events, so its function as a full-length protein might be more important to its pathogenicity. ORF8 plays a role in modulating the host immune response, and may act by down-regulating major histocompatibility complex class I (MHC-I). It may inhibit expression of some members of the IFN-stimulated gene (ISG) family, including host IGF2BP1/ZBP1, MX1 and MX2, and DHX58. ORF8 also binds to the IL17RA receptor, leading to IL17 pathway activation and an increased secretion of pro-inflammatory factors, contributing to cytokine storm during COVID-19 infection.
Insect cuticles are composite structures whose mechanical properties are optimised for biological function. The major components are the chitin filament system and the cuticular proteins, and the cuticle's properties are determined largely by the interactions between these two sets of molecules. The proteins can be ordered by species.
This entry represents the N-terminal domain of Torsin-1A-interacting proteins 1 and 2 (TOIP1/2), also known as LAP1 proteins (Lamina-associated polypeptide 1), which are type 2 integral membrane proteins with a single membrane-spanning region of the inner nuclear membrane. These proteins interact with and activate Torsin A, an AAA ATPase localized to the endoplasmic reticulum (ER), through a perinuclear domain and form a heterohexameric (LAP1-Torsin)3 ring that targets Torsin to the nuclear envelope. TOIP1/2 has an atypical AAA fold and provides an arginine finger to the Torsin A active site to promote its ATPase activity. A single mutation in Torsin A causes early onset primary dystonia, a painful and severely disabling neuromuscular disease. This domain contains transmembrane helices.
This entry identifies a family of proteins, around 100 amino acids in length, that include a predicted signal sequence and a perfectly conserved motif, RAQPRD, towards the C terminus. They are found in the Pseudomonas putida TOL plasmid pWW0 and in cryptic plasmid regions of Salmonella enterica subsp. enterica serovar Typhi and Pseudomonas syringae pv. tomato str. DC3000. The function of these proteins is unknown.
Pex26 is a type II peroxisomal membrane protein that recruits Pex6-Pex1 complexes to peroxisomes. Mutations in the Pex26 gene cause peroxisome biogenesis disorder complementation group 8 (PBD-CG8) and peroxisome biogenesis disorder 7A/B (PBD7A/B).
The entry includes the mitochondrial 28S ribosomal protein S30, a component of the mitochondrial ribosome small subunit (28S), and the mitochondrial 39S ribosomal protein S30, a component of the mitochondrial large ribosomal subunit (mt-LSU). The mature mammalian 55S mitochondrial ribosome consists of a small (28S) and a large (39S) subunit.
Members of this protein family are found in a modest number of non-pathogenic Gram-positive bacteria, including three species of Lactococcus. Three paralogues exist in Clostridium acetobutylicum. This protein appears related to the conserved core region of a family of proposed transpeptidases, exosortase (previously EpsH), thought to act on PEP-CTERM proteins. The protein has been assigned the gene symbol XrtG (eXosoRTase family protein of Gram-positives).
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Members of this family are components of the mitochondrial large ribosomal subunit. Mature mitochondrial ribosomes consist of a small (37S) and a large (54S) subunit. The 37S subunit contains at least 33 different proteins and 1 molecule of RNA (15S). The 54S subunit contains at least 45 different proteins and 1 molecule of RNA (21S).
This entry represents the N-terminal domain of Motility protein A (MotA). MotA is a membrane protein that, together with MotB, forms the stator element of the flagellar motor complex with 5:2 stoichiometry. The stator couples ion flow across the cytoplasmic membrane to torque generation and is therefore required for rotation of the flagellar motor. This domain contains conserved torque-generating charged residues at its C terminus and is adjacent to the entry that represents MotA TMH3 and TMH4.
This family consists of bacterial and archaeal proteins. MA1715 from the archaeon Methanosarcina acetivorans is required for optimal growth with sulfide as the sole sulfur source and has been shown to support both cysteine and homocysteine biosynthesis.
This entry represents the C-terminal domain of Tab2 from plants and cyanobacteria. Tab2 was first identified in Chlamydomonas reinhardtii (Swiss-Prot Q7X8Y6) as an RNA-binding protein required for translation of the chloroplast PsaB photosystem I subunit. Later, the Tab2 homologue from Arabidopsis (ATAB2) was found to be involved in the signalling pathway of light-controlled synthesis of photosystem proteins during early plant development, presumably functioning as an activator of translation with targets at PSI and PSII. Directed mutagenesis experiments carried out in Tab2 from C. reinhardtii indicated the importance of a highly conserved C-terminal tripeptide WLL for normal psaB translation.
This entry represents the N-terminal domain of proteins from photosynthetic organisms including plants and cyanobacteria, such as Tab2 proteins. Tab2 was first identified in Chlamydomonas reinhardtii as an RNA-binding protein required for translation of the chloroplast PsaB photosystem I subunit. Later, the Tab2 homologue (ATAB2) from Arabidopsis was found to be involved in the signalling pathway of light-controlled synthesis of photosystem proteins during early plant development, presumably functioning as an activator of translation with targets at PSI and PSII.
Pentatricopeptide repeat-containing protein PPR5-like
Type: Family
Description:
This entry represents a group of plant pentatricopeptide repeat-containing proteins, including PPR5 from maize and rice and At2g30780/At4g39620/At2g48000/At3g06430 from Arabidopsis. PPR5 is a chloroplastic protein that binds to single-stranded RNA and influences both RNA stability and splicing. It has been shown to stabilise the chloroplast trnG pre-RNA by directly binding to a group II intron, where it protects an endonuclease-sensitive site and stimulates splicing. At4g39620 is involved in embryo morphogenesis. At3g06430, also known as PPR2 and EMBRYO DEFECTIVE 2750 (EMB2750), binds to plastid 23S rRNA and plays an important role in the first mitotic division during gametogenesis and in cell proliferation during embryogenesis.
This entry represents a group of plant proteins, including protein LAX PANICLE 2 (LAX2) from rice. LAX2 is a nuclear protein that acts together with LAX1 in rice to regulate the process of axillary meristem formation.
SusE and SusF are two outer membrane proteins composed of tandem starch-specific carbohydrate-binding modules (CBMs) with no enzymatic activity. They are likely to play an important role in starch metabolism in Bacteroides. It has been speculated that they could compete for starch in the human intestinal tract by sequestering starch at the bacterial surface and away from competitors. SusE has a higher affinity for starch than SusF.
This entry includes a group of plant proteins, including AtPHOS32 and AtPHOS34 from Arabidopsis. They show similarity to bacterial universal stress protein A. AtPHOS32 is a substrate of MAP kinases 3 and 6.
This entry represents the N terminus of intron-binding protein aquarius, a splicing factor which links excision of introns from pre-mRNA with snoRNP assembly.
The yqgB and yqfZ genes are associated with the genomes of bacteria with distinct pathogenic properties and consequently fall into the category of being virulence genes. However, yqgB and yqfZ genes are not true virulence factors but instead are probably lifestyle determinant genes whose products act in concert, enabling the bacterium to cope with its suboptimal physical environment and thus facilitating host colonization.
This entry includes IQM1-6 from Arabidopsis. They contain the IQ motif, IQxxxRGxxxR, which is one of a few recognition motifs for CaM-binding proteins. Besides one copy of the IQ motif, they share domains related to pea heavy-metal induced protein 6A and the ribosome-inactivating protein trichosanthin, which is able to bind to and remove adenine residues from rRNA. This entry also includes some uncharacterized bacterial and fungal proteins.
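The IQ motif pattern quoted above, IQxxxRGxxxR, can be turned into a simple sequence scan. The following is a minimal sketch, assuming each "x" stands for any single residue; the function name and the toy sequence are hypothetical and are not taken from any IQM protein.

```python
import re

# IQxxxRGxxxR as given in the entry text; "x" is assumed to mean any one residue.
# The lookahead lets overlapping occurrences be reported as well.
IQ_MOTIF = re.compile(r"(?=(IQ.{3}RG.{3}R))")


def find_iq_motifs(sequence):
    """Return (0-based start, matched text) for every IQ motif occurrence."""
    return [(m.start(), m.group(1)) for m in IQ_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, for illustration only.
toy_sequence = "MKTAIQAAARGYLVRSSD"
hits = find_iq_motifs(toy_sequence)
print(hits)
```

A pattern match of this kind only flags candidate sites; it says nothing about whether calmodulin actually binds there.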
This entry represents proteins that are confined to Rickettsia. The family member RP853 is thought to be a transmembrane protein, but is currently uncharacterised.
This entry represents proteins that are mainly confined to Rickettsia and Orientia. The proteins, which include RP364 and RC0048, are currently uncharacterised.
Spore morphogenesis and germination protein YwcE is required for proper spore morphogenesis and germination. It is repressed by abrB during growth and activated at the onset of sporulation in a spo0A-dependent manner. This protein has similar features to holin.
This entry represents Meiosis-expressed gene 1 protein and its homologues, encoded by the gene MEIG1. It may be involved in germ cell differentiation during meiotic prophase. Sequence analysis predicts this protein to be 10.8 kDa in size and to be highly charged and lysine rich.
THYLAKOID ASSEMBLY 8 (THA8) is a pentatricopeptide repeat (PPR)-containing protein that mediates the splicing of organellar group II introns. It is required for embryo development in Arabidopsis. In maize, mutation of the THA8 gene causes defects in the biogenesis of chloroplast thylakoid membranes.
This entry represents thioredoxin-like protein CITRX and related proteins from plants. In Arabidopsis, TRX z (At3g06730) is a thiol-disulfide oxidoreductase that plays a role in chloroplast development. It interacts with fructokinase-like protein 1 (FLN1), and the two proteins represent subunits of the plastid-encoded RNA polymerase (PEP), suggesting a role in chloroplast gene expression.
This family consists of several bacterial YebG proteins of around 75 residues in length. The exact function of this protein is unknown but it is thought to be involved in the SOS response. The induction of the yebG gene occurs as cells enter the stationary growth phase and is dependent on cyclic AMP and H-NS.
ZapC (Z-associated protein C) contributes to the efficiency of the cell division process by maintaining FtsZ ring stability during cell division. It consists of an N-terminal α/β domain which contains a pocket, termed the N-domain pocket, lined with residues important for ZapC function as an FtsZ bundler, and a C-terminal domain which contains an additional pocket (C-domain pocket), with a hydrophobic centre surrounded by conserved basic residues, critical for FtsZ binding.
This family consists of several bacterial and archaeal AroM proteins. In Escherichia coli the aroM gene is cotranscribed with aroL. The function of this family is unknown.
This family consists of several bacterial proteins of around 210 residues in length, including YqiJ from E. coli. The function of this family is unknown.
The Gam protein, originally characterised in Bacteriophage Mu, protects linear double-stranded DNA from exonuclease degradation in vitro and in vivo. This protein is also found in many bacterial species as part of a suspected prophage. Further studies have shown that Gam is a functional counterpart of the eukaryotic Ku protein, which has key roles in DNA repair and in certain transposition events. Gam displays DNA binding characteristics remarkably similar to those of human Ku. In addition, Gam can interfere with Ty1 retrotransposition in Saccharomyces cerevisiae (Baker's yeast). These data reveal structural and functional parallels between bacteriophage Gam and eukaryotic Ku and suggest that their functions have been evolutionarily conserved.
Cytoplasmic polyadenylation element-binding protein
Type: Family
Description:
CPEB family members are RNA-binding proteins that are essential regulators of post-transcriptional gene expression. CPEBs target mRNAs defined by a cis-acting sequence in their 3' untranslated region (UTR), known as the cytoplasmic polyadenylation element (CPE; with a consensus sequence of UUUUUAU). They are indirectly responsible for both translational repression and translational activation by polyadenylation. They are involved in many biological processes such as cell proliferation, senescence, cell polarity, and synaptic plasticity. They regulate the translation of maternal mRNAs controlling meiotic cell cycle progression. Mammals have four CPEBs, CPEB1-4, among which CPEB1/2/4 are essential to successful mitotic cell division. All four paralogues have two RNA recognition motifs (RRMs) and a zinc-binding domain (ZZ domain). The Drosophila CPEB homologue is known as Orb2, which may have a role in long-term memory formation.
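The CPE consensus quoted above (UUUUUAU) lends itself to a simple scan of a 3' UTR. The following is a minimal sketch, assuming exact matches to the consensus only and accepting DNA-style input by converting T to U; the function name and the toy UTR are hypothetical.

```python
# Consensus CPE sequence as given in the entry text.
CPE_CONSENSUS = "UUUUUAU"


def find_cpe_sites(utr):
    """Return 0-based start positions of exact CPE consensus matches in a 3' UTR."""
    rna = utr.upper().replace("T", "U")  # tolerate cDNA-style sequences
    sites = []
    start = rna.find(CPE_CONSENSUS)
    while start != -1:
        sites.append(start)
        # Step by one so that overlapping matches are not skipped.
        start = rna.find(CPE_CONSENSUS, start + 1)
    return sites


# Hypothetical toy UTR mixing RNA and DNA letters, for illustration only.
toy_utr = "GCAUUUUUAUGGCCTTTTTATAA"
print(find_cpe_sites(toy_utr))
```

Exact matching is the simplest possible choice here; a match indicates a candidate element only, not demonstrated CPEB binding.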
This entry represents the multidrug resistance protein MdtB. MdtB is one subunit of the MdtABC tripartite complex, which is an RND-type drug exporter. The complex consists of MdtB/C, which form a transmembrane heteromultimer, and MdtA, which is a membrane fusion protein.
This entry represents Chaperone modulatory protein CbpM from Escherichia coli and similar proteins from Gammaproteobacteria. CbpM (named after "CbpA modulator") interacts with the N-terminal J-domain of CbpA, forms a stable complex and specifically inhibits its activity. Together with CbpA, it modulates the activity of the DnaK chaperone system. It does not inhibit the co-chaperone activity of DnaJ. The crystal structure of CbpM shows similarity to members of the MerR family of transcriptional regulators.
This entry represents lactate utilization protein A, which is involved in L-lactate degradation and allows cells to grow with lactate as the sole carbon source.
This entry represents the multidrug resistance protein MdtA. MdtA is one subunit of the MdtABC tripartite complex, which is an RND-type drug exporter. The complex consists of MdtB/C, which form a transmembrane heteromultimer, and MdtA, which is a membrane fusion protein. The MdtABC complex confers resistance against novobiocin and deoxycholate.
This entry represents lactate utilization protein C, which is involved in L-lactate degradation and allows cells to grow with lactate as the sole carbon source.
This family consists of several SseB proteins which appear to be found exclusively in Enterobacteria. SseB is known to enhance serine-sensitivity in Escherichia coli and is part of the Salmonella pathogenicity island 2 (SPI-2) translocon. This entry contains the presumed N-terminal domain of SseB.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. This entry represents archaeal ribosomal S4 proteins.
This family represents Ycf35, which is encoded in algal chloroplasts and in cyanobacteria. The function of these proteins is unknown. As the family is found exclusively in phototrophic organisms, it may play a role in photosynthesis.
Lamtor4 is a component of the Ragulator complex, a complex that functions as a guanine nucleotide exchange factor for the Rag GTPases that signal amino acid levels to mTORC1. The mTOR Complex 1 (mTORC1) pathway regulates cell growth in response to numerous cues. Lamtor4 is required for mTORC1 activation by amino acids. The Rag-Ragulator complex also has functions independent of mTOR signaling; RagA and Lamtor4 are essential regulators of lysosomes in microglia.
This entry consists of several phage scaffolding proteins involved in icosahedral procapsid assembly. They co-assemble with the capsid proteins to form the procapsid, in which the scaffolding protein is found within the external shell of icosahedrally arranged capsid protein subunits. In a subsequent step the scaffolding protein molecules are released from the procapsid.
There is currently no experimental data for members of this family or their homologues. Agrobacterium tumefaciens GguC is encoded by the same operon as ChvE (periplasmic sugar-binding protein that transmits the signal to the VirA/VirG signalling system), GguA and GguB (putative ABC-type sugar transporter components). However, the function of GguC is not known yet.
This set of conserved hypothetical proteins has a phylogenetic range that closely matches that of
, and has a putative C-terminal protein targeting signal.
This domain is found at the extreme C terminus of FimV from Pseudomonas aeruginosa, and of TspA of Neisseria meningitidis. Disruption of the former blocks twitching motility from type IV pili, which suggests a role in peptidoglycan layer remodelling required by type IV fimbrial systems.
This region is found at, or about 200 amino acids from, the N terminus of FimV from Pseudomonas aeruginosa, TspA of Neisseria meningitidis, and related proteins. Disruption of FimV blocks twitching motility from type IV pili, which suggests a role for this family in peptidoglycan layer remodelling required by type IV fimbrial systems. Most but not all members of this protein family have a C-terminal region recognised by . In between is a highly variable, often repeat-filled region rich in the negatively charged amino acids Asp and Glu.
This entry represents the flagellar biosynthetic protein FlhF. The assembly of flagella is a multi-step process and relies on a complex type III export machinery located in the cytoplasmic membrane. The FlhF protein is essential for the placement and assembly of polar flagella and has been classified as a signal-recognition particle (SRP)-type GTPase. It is similar to the 54 kDa subunit (SRP54) of the SRP that mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognises N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). FlhF activities and the net effect of FlhF on flagellation phenotypes appear to be different among polar flagellates.
This entry consists of uncharacterised proteins. All members so far represent bacterial genes found in apparent phage or otherwise laterally transferred regions of the chromosome. Tentatively identified neighbouring proteins tend to be phage tail region proteins. In some species, including Photorhabdus luminescens subsp. laumondii TTO1, several members of this family may be encoded near each other.
MEX1 is a plastid envelope maltose transporter. Together with pGlcT, it contributes significantly to the export of starch degradation products from chloroplasts. This plant family also includes Maltose excess protein 1-like, of unknown function.
This entry includes APC membrane recruitment protein 1/2/3 (Amer1/2/3). Amer1, also known as WTX, binds to the tumour suppressor adenomatous polyposis coli and acts as an inhibitor of Wnt signaling by inducing beta-catenin degradation. Amer2 is a negative regulator of Wnt/beta-catenin signaling involved in neuroectodermal patterning. WTX is a novel gene mutated in a proportion of Wilms' tumors and in patients suffering from sclerosing bone dysplasia.
The members of this family are similar to gene product 10 (gp10) of bacteriophage T4. gp10 is a peripheral baseplate protein that is part of the tail fibre network. It functions as a molecular lever that rotates and extends the hinged short tail fibres to facilitate cell attachment.
The members of this family are similar to gene products 9 (gp9) and 10 (gp10) of bacteriophage T4. Both proteins are components of the viral baseplate. Gp9 connects the long tail fibres of the virus to the baseplate and triggers tail contraction after viral attachment to a host cell. The protein is active as a trimer, with each monomer being composed of three domains. The N-terminal domain consists of an extended polypeptide chain and two alpha helices. The alpha1 helix from each of the three monomers in the trimer interacts with its counterparts to form a coiled-coil structure. The middle domain is a seven-stranded β-sandwich that is thought to be a novel protein fold. The C-terminal domain is thought to be essential for gp9 trimerisation and is organised into an eight-stranded antiparallel β-barrel, which was found to resemble the 'jelly roll' fold found in many viral capsid proteins. The long flexible region between the N-terminal and middle domains may be required for the function of gp9 to transmit signals from the long tail fibres. Together with gp11, gp10 initiates the assembly of wedges that then go on to associate with a hub to form the viral baseplate.
This family includes baseplate wedge protein gp7 (gp7) from bacteriophage T4 and its relatives. Contractile tail bacteriophages use a multiprotein tubular apparatus resembling a coiled spring wound round a rigid tube to attach to and penetrate host cell membranes. The structure known as the baseplate relays the contraction signal to the sheath. gp7 is an intermediate (or inner) baseplate protein involved in tail assembly. It also forms a complex with gp25 and two molecules of gp6, which is involved in sheath contraction.
This family includes baseplate wedge protein gp6 (gp6) from bacteriophage T4 and its relatives. Contractile tail bacteriophages use a multiprotein tubular apparatus resembling a coiled spring wound round a rigid tube to attach to and penetrate host cell membranes. The structure known as the baseplate relays the contraction signal to the sheath. gp6 is located next to the tail tube and is an intermediate (or inner) baseplate protein involved in tail assembly. Two molecules of gp6 form a complex with gp25 and gp7, which is involved in sheath contraction.
The elongator complex is a major component of the RNA polymerase II (RNAPII) holoenzyme responsible for transcriptional elongation. It binds to both naked and nucleosomal DNA, can acetylate both core and nucleosomal histones, and is involved in chromatin remodelling. It acetylates histones H3, preferentially at 'Lys-14', and H4, preferentially at 'Lys-8'. ELP3 is required for the integrity of the complex and for its association with the nascent RNA transcript. ELP3 is thought to act as a highly conserved histone acetyltransferase (HAT) capable of acetylating core histones in vitro; however, it is clearly a multi-domain protein. The HAT activity is thought to be present only in the C-terminal GNAT domain (histone acyltransferase domain). Recent work suggests that both the histone acetyltransferase and radical S-adenosylmethionine (SAM) domains are essential for function, although the exact role of the radical SAM domain is still unclear. The radical SAM domain is important for the structural integrity of the protein complex, as previously demonstrated in yeast. However, an alternative may be that ELP3 binds and cleaves SAM, as seen in the archaeon M. jannaschii. It has also been shown in previous studies that the mouse ELP3 does not require the histone acyltransferase domain for zygotic paternal genome demethylation. The archaeal homologue of the third subunit of the eukaryotic elongator complex (Elp3) catalyses the tRNA wobble uridine modification at C5, the same reaction as the eukaryotic elongator complex. The proposed mechanism of action by Elp3 represents an unprecedented chemistry performed on acetyl-CoA in which the methyl group of the acetyl-CoA is activated by the 5'-deoxyadenosyl radical. This then adds to the uridine of tRNA. Some bacterial ELP3 homologues are known as tRNA uridine(34) acetyltransferase.
This protein family is highly conserved, but its function is unknown. It can be isolated from HeLa cell nucleoli and is homologous to Leydig cell tumour protein, whose function is also unknown.
The proteins in this entry are represented by the Escherichia coli (strain K12) protein FtsB. FtsB is one of the three membrane proteins required for cell division in E. coli (FtsQ, FtsL and FtsB), all of which localize to the cell septum. FtsL and FtsB contain a leucine zipper-like sequence and depend on each other for their localization to the septum, and each of them depends on FtsQ. FtsQ is found at the cell division site in the absence of FtsL and FtsB and requires FtsK for its localization.
DivIC was identified as a gene required for vegetative and sporulation septum formation in Bacillus subtilis [
]. DivIC forms a trimeric complex with two other proteins that are essential for cell division, DivIB and FtsL. The trimeric complex localizes at the septum and is regulated during the cell cycle through controlled formation of the DivIC/FtsL heterodimer []. DivIC seems to stabilize FtsL against RasP cleavage [].
The entry represents the N-terminal domain found in gene II protein (G2P) from Enterobacteria phage M13 and similar proteins from bacteriophages and bacterial prophages. A much shorter protein of unknown function, translated from a conserved in-frame alternative initiator, is designated gene X protein (G10P). Proteins containing this domain are involved in viral DNA replication.
The entry represents the C-terminal domain found in gene II protein (G2P) from bacteriophage. A much shorter protein of unknown function, translated from a conserved in-frame alternative initiator, is designated gene X protein (G10P). Proteins containing this domain are involved in viral DNA replication. This domain roughly covers the G10P region.
This entry represents a region of natively unstructured but highly conserved sequence found in multiple-PDZ-domain-containing proteins in higher eukaryotes. It lies between two PDZ domains. Its function is not known.
Proteins in this entry are Actinobacterial proteins of about 150 amino acids in length, with three predicted transmembrane helices and an unusual motif with consensus sequence PGPGW.
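As an illustration of how the consensus motif above might be located in a sequence, the following is a minimal sketch that searches for exact matches to PGPGW. The function name and the toy sequence are hypothetical and are not taken from any family member; real members may deviate from the consensus, which an exact-match search would miss.

```python
import re

# Consensus motif quoted in the entry description (exact match only).
MOTIF = re.compile(r"PGPGW")


def find_motif(sequence: str) -> list[int]:
    """Return the 1-based start position of every exact PGPGW match."""
    return [m.start() + 1 for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence for demonstration, not a real protein.
toy_sequence = "MKTLLAVPGPGWLLAAG"
positions = find_motif(toy_sequence)
print(positions)
```

Positions are reported 1-based, following the usual convention for protein residue numbering.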
Members of this family are encoded between the genes for TrbJ and TrbL of P-type plasmid conjugal transfer systems, and therefore are TrbK, a member of a guild of unrelated TrbK protein families. The similarly located TrbK of plasmid RP4 functions in entry exclusion, and the current family may as well, despite lacking any detectable homology. Members of this family include TrbK of the Ti plasmid from Agrobacterium, shown not to be required for transfer, which would be consistent with a role in entry exclusion rather than transfer itself. An entry exclusion function for TrbK of the Ti plasmid has been suggested [
]. This small protein shares close C-terminal sequence homology with the much longer protein encoded by the neighbouring gene TrbJ.
This family consists of several Enterobacterial PsiA proteins. The function of PsiA is unknown although it is thought that it may affect the generation of an SOS signal in Escherichia coli [
].
FERONIA/SIRENE is a plant protein kinase that mediates the female control of male gamete delivery during fertilization, including growth cessation of compatible pollen tubes, which ensures reproductive isolation barriers, by regulating MLO7 subcellular polarization upon pollen tube perception in the female gametophyte synergids [
,
,
]. It is required for cell elongation during vegetative growth, mostly in a brassinosteroid (BR)-independent manner. FERONIA is a positive regulator of auxin-promoted growth that represses abscisic acid (ABA) signaling via the activation of the ABI2 phosphatase []. It is required for RALF1-mediated extracellular alkalinization in a signaling pathway that prevents cell expansion [].
Uncharacterised protein C20orf85 is part of a family found in eukaryotes. The protein is 137 amino acids long, with a mass of approximately 15.7 kDa. It is present in the normal lung epithelium, but absent or downregulated in most primary non-small cell lung cancers. The gene is known as Low in Lung Cancer 1 (LLC1). This protein is thought to have a role in the maintenance of normal lung function, and its absence may lead to lung tumourigenesis [
].
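The relationship between sequence length and mass quoted above can be checked roughly with a rule-of-thumb calculation. The sketch below assumes an average residue mass of about 110 Da, which is a general approximation and not a value from this entry; the function name is hypothetical. The true mass depends on the actual amino acid composition, so the estimate is only expected to land in the same range as the quoted figure.

```python
# Assumed average mass of one amino acid residue in a protein, in daltons.
# This is a common rule of thumb, not a value taken from the entry.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(length: int) -> float:
    """Estimate protein mass in kDa from the number of residues."""
    return length * AVERAGE_RESIDUE_MASS_DA / 1000.0


# 137 residues is the length given in the description.
estimate = estimate_mass_kda(137)
print(f"{estimate:.2f} kDa")
```

For 137 residues this gives an estimate of about 15.1 kDa, the same order as the approximately 15.7 kDa stated in the description; a composition-aware calculation from the actual sequence would be needed for a precise value.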
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to the aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. L35 is a basic protein of 60 to 70 amino-acid residues from the large subunit [
]. Like many basic polypeptides, L35 completely inhibits ornithine decarboxylase when present unbound in the cell, but the inhibitory function is abolished upon its incorporation into ribosomes []. It belongs to a family of ribosomal proteins that includes L35 from bacteria, plant chloroplasts, red algal chloroplasts and cyanelles. In plants it is a nuclear-encoded gene product, which suggests a chloroplast-to-nucleus relocation during the evolution of higher plants []. This entry represents L35 from the metazoa, which is conserved from nematodes to humans [
].
This entry includes protein arginine N-methyltransferase SFM1 from Saccharomyces cerevisiae and similar proteins from fungi and archaea. Sfm1 is an S-adenosyl-L-methionine-dependent protein-arginine N-methyltransferase that monomethylates ribosomal protein S3 (RPS3) at 'Arg-146' [
].
This family represents a group of plant proteins containing a RING-H2 finger, including ATL20, ATL21A-C and ATL22. ATL20 has been predicted to be a ubiquitin ligase [
].