Each member of this group contains five functional domains catalysing five sequential steps of the shikimate pathway. Each domain corresponds to a monofunctional prokaryotic enzyme of the same pathway: 3-dehydroquinate synthase, 3-dehydroquinate dehydratase, shikimate 5-dehydrogenase, shikimate kinase, and 3-phosphoshikimate 1-carboxyvinyltransferase. The multifunctional gene is regulated in response to amino acid limitation by coordinate regulation, which provides an economical means of simultaneously synthesizing five shikimate pathway enzymes [
].
TTMP (C3orf52) was identified as a gene upregulated after treatment with the potent tumour promoter TPA (12-O-Tetradecanoylphorbol-13-acetate). It encodes a single-pass transmembrane protein localized to the endoplasmic reticulum [
].
AKIP1 regulates the effect of the cAMP-dependent protein kinase (PKAc) signalling pathway on the NF-kB activation cascade [
,
]. AKIP1 binds PKA [] and co-localises with the NF-kB subunit p65 []. In humans, there are three splice variants, whereas only the full-length protein is present in rodents []. AKIP1 seems to act as a modulator of PKA in NF-kB signalling; different splice variants have been shown to repress [] or enhance NF-kB transcription [].
MLC1 is a membrane protein, mainly expressed in brain astrocytes, that plays a role in astrocyte osmo-homeostasis [
]. Mutations in the MLC1 gene are linked to a rare leukodystrophy known as megalencephalic leukoencephalopathy with subcortical cysts (MLC). Several studies suggest that MLC1 plays a role in the regulation of ion and water fluxes and cell volume, but its exact function is not known []. It binds the beta-1 subunit of the Na,K-ATPase enzyme [], and interacts with the calcium permeable channel TRPV4 [] and the vacuolar ATPase (V-ATPase) [].
This entry represents a group of transposase-like proteins, such as protein DAYSLEEPER (At3g42170) from Arabidopsis [
] and RICESLEEPER 1-4 from rice []. Protein DAYSLEEPER is essential for plant development and can also regulate global gene expression []. Proteins in this family are derived from a hAT-superfamily transposon and contain many of the features found in the coding sequence of these elements [].
This entry represents a group of co-chaperone proteins, including p23 (also known as Sba1) from budding yeasts, Wos2 from fission yeasts, prostaglandin E synthase 3 from humans and Atp23-1/2 from Arabidopsis [
]. p23 contains a folded domain and a long unstructured C-terminal tail. p23 and its homologues are co-chaperones of Hsp90 [,
] and may be part of the core chaperone cycle required for efficient client activation []. It may also have Hsp90-independent functions such as the regulation of gene expression by modulating chromatin remodeling and ribosomal biogenesis [,
].
Gemin4 is one of the core components of the SMN complex, which contains survival motor neuron (SMN), the seven Gemin proteins (Gemin2-8), and Unrip. The complex is closely associated with the splicing small nuclear ribonucleoproteins (U snRNPs), which are the subunits of the spliceosome and consist of a U snRNA, the Sm core proteins and several proteins unique to the various RNPs [
]. SMN and Gemin4 both interact specifically with the U snRNA and the Sm core proteins, suggesting they facilitate the assembly by forming a bridge between the two components [,
,
]. Gemin4 has also been described as an important component of the RNA-induced silencing complex (RISC) that participates in the maturation of miRNAs [
].
DA26 (E26) is a multifunctional protein. One form of E26 associates with viral DNA or DNA binding proteins, while a second form associates with intracellular membranes [
]. It functions to sort occlusion-derived virus envelope proteins to the inner nuclear membrane [].
This entry includes fasciclin-like arabinogalactan proteins (FLAs) from plants. FLAs are a subclass of arabinogalactan proteins involved in plant growth, development and response to abiotic stress [
,
]. They contain not only one or two AGP-like glycosylated regions but also one or two fasciclin (FAS) domains, ancient cell adhesion domains conserved from bacteria to animals []. The FLAs can be classified into four groups (A-D) based on phylogenetic analysis [
]. This entry represents a group that includes FLA1-8, FLA10 and FLA14.
This entry represents a group of plant F-box proteins, including AT1G61340 (FBS1), AT4G21510 (FBS2/F-box protein SKIP27), AT4G05010 (FBS3) and AT4G35930 (FBS4) from Arabidopsis. AtFBS1 has been shown to interact with ASK1, the component of the SCF complex that binds the F-box. AtFBS1 and AtFBS3 are induced in response to osmotic stress, hormone treatment or A. brassicicola infection [
].
This entry represents a group of leucine-rich repeat (LRR) receptor-like proteins (RLPs) from plants. In Arabidopsis, 57 members have been described; some are known to be involved in basal developmental processes, whereas others are involved in defence responses [
]. This family includes RLP12, RLP44, RLP30 and RLP35 from Arabidopsis. They are involved in perception of extracellular signals. RLP12 is involved in the perception of CLV3 and CLV3-like peptides, which act as extracellular signals regulating meristem maintenance []. RLP30 is a receptor for microbe-associated molecular patterns (MAMPs) that induces a BAK1-dependent basal immune response to necrotrophic fungi (e.g. S. sclerotiorum) in the presence of MAMPs (e.g. flg22 and SCLEROTINIA CULTURE FILTRATE ELICITOR1 (SCFE1) from the necrotrophic fungal pathogen S. sclerotiorum). It seems that RLP30 functionality depends on the presence of the receptor kinase SOBIR1 as an adapter protein [,
]. It is required for full non-host resistance to bacterial pathogens []. RLP44 mediates the response to pectin modification by activating brassinosteroid signaling and is important for the regulation of xylem fate [,
]. This family also includes some uncharacterised sequences from bacteria.
Glutaredoxins [
,
,
], also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system []. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond [
]. It exists in either a reduced form or an oxidized form in which the two cysteine residues are linked by an intramolecular disulphide bond. It contains a redox-active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for intramolecular disulphide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulphide substrates, for which it uses a monothiol mechanism in which only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates [
,
,
,
]. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed [
] that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. NrdH-redoxin is a representative of a class of small redox proteins that contain a conserved CXXC motif and are characterised by a glutaredoxin-like amino acid sequence and thioredoxin-like activity profile. Unlike the other glutaredoxins to which it is most closely related, NrdH apparently does not interact with glutathione/glutathione reductase, but rather with thioredoxin reductase to catalyze the reduction of ribonucleotide reductase [
].
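The CXXC active-site motif described for glutaredoxins and NrdH-redoxin (two cysteines separated by any two residues) can be located in a protein sequence with a simple pattern scan. The following is a minimal sketch, not a curated signature: the function name `find_cxxc` and the toy sequence are hypothetical, and a bare CXXC match says nothing about fold or activity.

```python
import re

# Lookahead so that overlapping motifs (e.g. "CAACAAC") are all reported.
# X is taken here as any uppercase letter, which is an assumption of this sketch.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc(sequence):
    """Return (0-based start, matched tetrapeptide) for every CXXC motif."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CXXC.finditer(seq)]


# Hypothetical toy sequence for illustration only, not a database entry.
toy = "MKTVIYGRSGCPYCVRAKEL"
print(find_cxxc(toy))
```

A match only flags a candidate redox-active site; distinguishing glutaredoxins from thioredoxins or NrdH-redoxins requires the wider sequence and structural context.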
This entry represents nucleoid occlusion proteins belonging to the parB family. Nucleoid occlusion protein effects nucleoid occlusion by binding relatively nonspecifically to DNA and preventing the assembly of the division machinery in the vicinity of the nucleoid, especially under conditions that disturb the cell cycle. It helps to coordinate cell division and chromosome segregation by preventing the formation of the Z ring through the nucleoid, which would cause chromosome breakage [
].
The Saccharomyces cerevisiae EOS1 (Endoplasmic Reticulum-localised and Oxidants Sensitive) gene is not essential for cell growth, but deletion of EOS1 (YNL080c) confers, to some degree, a slow-growth phenotype [
]. The gene product, localised to the ER membrane, is involved in oxidative stress tolerance [], and may also be involved in N-glycosylation of cellular proteins. Its deduced amino acid sequence contains four potential transmembrane (TM) domains. Although its precise molecular function has not yet been established, it may be involved in cellular homeostasis [].
Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions [
]. CinA is the first gene in the competence-inducible (cin) operon, and is thought to be specifically required at some stage in the process of transformation [
]. This family consists of putative competence-damaged proteins from the cin operon, and nicotinamide-nucleotide (NMN) amidohydrolase proteins. In T. thermophilus, CinA was shown to have both NMN deamidase and ADP-ribose pyrophosphatase activities [
].
YgfZ is a folate-binding protein [
] involved in regulating the level of ATP-dnaA and in the modification of some tRNAs. It is probably a key factor in regulatory networks that act via tRNA modification, such as initiation of chromosomal replication [].
Members of this protein family contain a region of homology to the RimK family of alpha-L-glutamate ligases, various members of which modify the Glu-Glu C terminus of ribosomal protein S6, or tetrahydromethanopterin, or a coenzyme F420 derivative. Members of this family are found so far in various Vibrio and Pseudomonas species and some other gamma and beta Proteobacteria. The function is unknown.
Toxoplasma gondii is an obligate intracellular apicomplexan protozoan parasite, with a complex lifestyle involving varied hosts [
]. It has two phases of growth: an intestinal phase in feline hosts, and an extra-intestinal phase in other mammals. Oocysts from infected cats develop into tachyzoites, and eventually, bradyzoites and zoitocysts in the extraintestinal host []. Transmission of the parasite occurs through contact with infected cats or raw/undercooked meat; in immunocompromised individuals, it can cause severe and often lethal toxoplasmosis. Acute infection in healthy humans can sometimes also cause tissue damage []. The protozoan utilises a variety of secretory and antigenic proteins to invade a host and gain access to the intracellular environment []. These originate from distinct organelles in the T. gondii cell, termed micronemes, rhoptries, and dense granules. They are released at specific times during invasion to ensure the proteins are allocated to their correct target destinations []. MIC1, a protein secreted from the microneme, is a 456-residue moiety involved in host cell recognition by the parasite [
]. The protein is released from the apical pole of T. gondii during infection, and attaches to host-specific receptors [
]. Recent studies have demonstrated that MIC1 is a lactose-binding lectin, and utilises this to enhance its binding to host endothelial cells []. A homologue of MIC1 found in Neospora caninum interacts with sulphated host cell-surface glycosaminoglycans.
This entry represents a bifunctional enzyme which possesses 2-acylglycerophosphoethanolamine acyltransferase and acyl synthetase activity. It is responsible for regeneration of phosphatidylethanolamine from 2-acyl-glycero-3-phosphoethanolamine (2-acyl-GPE) formed by transacylation reactions or degradation by phospholipase A1 [
,
].
The Ndc80 complex is a conserved outer kinetochore protein complex consisting of Ndc80 (Hec1), Nuf2, Spc24 and Spc25. The Ndc80 complex is required for chromosome segregation and spindle checkpoint activity [
,
,
].
This entry includes a group of RNA recognition motif domain containing proteins, including RBM27 and RBM26. The function of RBM27 is not clear.
RBM26, also known as cutaneous T-cell lymphoma (CTCL) tumor antigen se70-2, was identified as a cutaneous lymphoma (CL)-associated antigen [
]. It contains two RNA recognition motifs (RRMs). The RRMs may play some functional roles in RNA-binding or protein-protein interactions.
Peter Pan (PPAN) was initially identified in Drosophila melanogaster. PPAN is highly conserved and essential for maintaining growth and survival [
]. Human PPAN localizes to nucleoli and to mitochondria. It can shuttle between the nucleus and the cytoplasm in response to nucleolar stress and apoptosis induction []. PPAN knockdown has been linked to mitochondrial damage and stimulates autophagy []. The yeast homologues of PPAN, Ssf1 and Ssf2, are nucleolar ribosome biogenesis factors required for maturation of the large ribosomal subunit [].
MFAP1 is a component of the elastin-associated microfibrils [
]. Drosophila MFAP1 binds to Prp38, a tri-small nuclear ribonucleoprotein component, and is required for pre-mRNA processing and G2/M progression [].
Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Atg proteins coordinate the formation of autophagosomes. The pre-autophagosomal structure contains at least five Atg proteins: Atg1p, Atg2p, Atg5p, Aut7p/Atg8p and Atg16p, found in the vacuole [
,
]. The C-terminal glycine of Atg12p is conjugated to a lysine residue of Atg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Atg16) has been shown to bind to Atg5 and is required for the function of the Atg12p-Atg5p conjugate []. Autophagy protein 5 (Atg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway []. This entry includes ATG16-1 and ATG16-2 [,
].
This family includes vacuolar protein 8 (Vac8), an armadillo repeat-containing protein that functions in both vacuole inheritance and protein targeting from the cytoplasm to vacuole [
,
]. Its structure has an N-terminal flexible H1 helix followed by 12 armadillo repeats that form a right-handed superhelical structure []. The cationic triad (Arg276, Arg317, and Arg359) motifs in this protein are critical for the interaction with the nuclear membrane protein Nvj1 to form a nucleus-vacuole junction (NVJ) and for the recognition of Atg13, a key component of the cytoplasm-to-vacuole targeting (CVT) pathway [].
This entry contains penicillin-binding proteins, including the member from Escherichia coli designated penicillin-binding protein 1C. Members have both transglycosylase and transpeptidase domains and are involved in forming cross-links in the late stages of peptidoglycan biosynthesis. All members of this entry are presumed to have the same basic function.
Bacteria that synthesize a cell wall of peptidoglycan (murein) generally have several transglycosylases and transpeptidases for the task. This family consists of a particular bifunctional transglycosylase/transpeptidase in Escherichia coli and other Proteobacteria, designated penicillin-binding protein 1B. Its structure has been resolved [
].
Adhesive properties in fungi are conveyed by a group of cell-surface proteins called adhesins (sometimes also referred to as agglutinins or flocculins). Several fungal adhesins have been described to date, including the Candida albicans Als (agglutinin-like sequence) proteins [
]. Als proteins are cell-surface glycoproteins and have a three-domain structure. Each Als protein has a relatively conserved N-terminal domain, a central domain consisting of a tandemly repeated motif of variable number, and a serine-threonine-rich C-terminal domain that is relatively variable across the family. The Als family exhibits several types of variability that indicate the importance of considering strain and allelic differences when studying als genes and their encoded proteins []. This entry represents the tandem repeat found in the central domain of the Als proteins.
A15 is a part of a large complex required for early virion morphogenesis. This complex participates in the formation of virosomes and the incorporation of virosomal contents into nascent immature virions. A15 is required for the stability and kinase activity of F10 [
].
This family consists of several eukaryotic corticotropin-releasing factor binding proteins (CRF-BP or CRH-BP). Corticotropin-releasing hormone (CRH) plays multiple roles in vertebrate species. In mammals, it is the major hypothalamic releasing factor for pituitary adrenocorticotropin secretion, and is a neurotransmitter or neuromodulator at other sites in the central nervous system. In non-mammalian vertebrates, CRH acts not only as a neurotransmitter and hypophysiotropin but also as a potent thyrotropin-releasing factor, allowing CRH to regulate both the adrenal and thyroid axes, especially in development. CRH-BP is thought to play an inhibitory role in which it binds CRH and other CRH-like ligands and prevents the activation of CRH receptors. There is, however, evidence that CRH-BP may also exhibit diverse extra- and intracellular roles in a cell-specific fashion and at specific times in development [
].
JAMP is a Jun N-terminal kinase 1 (JNK1)-associated membrane protein. It associates with JNK1 through its C-terminal domain and regulates the duration of JNK1 activity in response to diverse stress stimuli [
]. It is an important component for coordinated clearance of misfolded proteins from the ER [,
].
The function of protein PBDC1 (polysaccharide biosynthesis domain-containing protein 1) is not clear. Proteins in this entry are from fungi and animals. The budding yeast PBDC1 homologue is known as YPL225W.
This entry represents the EMSY-like proteins from plants. They contain an EMSY N-terminal (ENT) domain, a central Agenet domain, and a putative C-terminal coiled-coil structure. Arabidopsis EMSY-like proteins contribute to RPP7-mediated and basal immunity, especially against Hyaloperonospora arabidopsidis isolate Hiks1 [
].
Gp17 from Siphoviridae bacteriophage SPP1 is a tail completion protein located at the interface between the connector protein gp16 and the tail of bacteriophage SPP1 [
]. Gp17 plays a fundamental role in the head-to-tail joining reaction, the ultimate step of virus particle assembly [].
NKAIN (Na,K-Atpase INteracting) proteins are a family of evolutionarily conserved transmembrane proteins that localise to neurons, are critical for neuronal function, and interact with the beta subunits of Na,K-ATPase (beta1 in vertebrates and beta in Drosophila). NKAINs have highly conserved transmembrane domains but no other characterised domains. NKAINs may function as subunits of pore or channel structures in neurons or they may affect the function of other membrane proteins. They are likely to function within the membrane bilayer [
].
MTH1880 is a hypothetical protein from Methanobacterium thermoautotrophicum. The structure has an alpha/beta fold. The molecular surface of the protein reveals a small, highly acidic pocket comprising loop B, the end of beta2, and loop D, indicating that the protein would have a possible cation binding site. It is proposed that the MTH1880 protein may function as a calcium buffering protein [
].
This family represents a group of proteins found in several proteobacteria, which are part of the CBASS (cyclic oligonucleotide-based antiphage signaling system), providing immunity against bacteriophage. This system is ancestrally related to the cGAS-STING innate immune pathway in animals. CBASS systems are composed of an oligonucleotide cyclase and an effector that is activated by the cyclic oligonucleotides, promoting cell death. The CD-NTase protein synthesizes cyclic nucleotides in response to infection, which serve as specific second messenger signals that activate a diverse range of effectors, leading to bacterial cell death and thus abortive phage infection [
,
]. CD-NTase-associated protein 3 (CAP3) is part of type II CBASS and is thought to be an isopeptidase of the JAB/JAMM family, a family usually found in eukaryotic deubiquitinase enzymes that remove ubiquitin from target proteins. This protein is necessary for protection against some phage infections, suggesting an ancillary function in the CBASS system. It is not yet known which protein is the target of CAP3, as ubiquitin is not known to be present in E. coli or V. cholerae [
].
Moraxella catarrhalis is a Gram-negative diplococcus, morphologically and biochemically related to the Neisseria. It is usually found in the commensal flora of both children and adults, and most commonly colonises the mucous membranes of the nasopharyngeal tract. However, M. catarrhalis can also cause severe systemic infections, such as pneumonia, meningitis, otitis media and endocarditis. The microbe is fast becoming an important opportunistic pathogen, and produces a number of virulent proteins on host invasion. Two ubiquitously expressed proteins on the cell surface of M. catarrhalis have been shown to confer protection in a mouse model against the microbe; both resemble adhesins from other pathogenic bacteria [
]. Designated UspA1 and UspA2, they share significant amino acid sequence similarity and are prime candidates for vaccine development owing to their immunogenic nature. Clinical samples taken from patients have revealed little diversity in the genes; UspA1 appears to be the most promising target [].
This family includes the septation protein Etd1 from Schizosaccharomyces pombe, which activates the GTPase Spg1 to trigger signaling through the septum initiation network (SIN) pathway and the onset of cytokinesis. The proper regulation of Etd1 is crucial for both activation of Spg1 in anaphase and inactivation of Spg1 when cytokinesis is complete [
]. It was suggested that Etd1 is the functional homologue of budding yeast Lte1 as they have the same function but a very low sequence similarity [].
There are two TMEM88 isoforms: TMEM88A/CRA-a, which inhibits the canonical Wnt/beta-catenin pathway through interactions with Dishevelled (Dvl) proteins and regulates the development of myocardial cells, and TMEM88A/CRA-b, which lacks the VWV motif and therefore likely does not interact with Dvl proteins [].
POC5 is a centrin-binding protein required for assembly of full-length centrioles. POC5 is not required for the initiation of procentriole assembly, but is essential for building the distal half of centrioles [
].
DI19 is a Cys2/His2 zinc-finger protein implicated in ABA-independent dehydration, high-salinity stress and light signaling pathways [
,
]. It is phosphorylated by AtCPK11 in a Ca(2+)-dependent manner [].
This entry includes the proteins LONGIFOLIA 1 and 2 (LNG1/2) from Arabidopsis. They regulate leaf morphology by promoting longitudinal polar cell elongation independently of ROT3 [
].
TMEM127 modulates mTOR function in the endolysosome. Its interaction with the early endosomal GTPase Rab5 to inhibit mTOR signalling seems to be related to its tumour-suppressive properties [
].
This entry includes a group of plant microtubule-associated proteins, including TORTIFOLIA1/SPIRAL2 and SPIRAL2-like (SP2L) from Arabidopsis. They regulate the orientation of cortical microtubules and the direction of organ growth [
]. TORTIFOLIA1-like protein 1 (also known as SPIRAL2) is a microtubule-associated protein (MAP) that regulates the orientation of cortical microtubules and the direction of organ growth [
] and may regulate katanin-based microtubule (MT) severing []. During virus infection, it may stabilize MT crossovers to support the formation and intercellular spread of the viral RNA complexes []. This family also includes SINE1/2 from Arabidopsis, which play a role in nucleus positioning in guard cells [
].
This entry includes microtubule-associated protein Jupiter from fruit flies and Jupiter microtubule associated homologues 1 and 2 from mammals, also known as hematological and neurological expressed 1 (HN1) and hematological and neurological expressed 1-like (HN1L), respectively. Jupiter binds to all microtubule populations [
]. HN1/HN1L may be involved in embryo development []. HN1 is a regulator of GSK3beta-dependent interactions of beta-catenin and E-cadherin [].
Small kinetochore associated protein (SKAP) cooperates with kinetochore and mitotic spindle proteins to regulate the metaphase-to-anaphase transition [
,
]. Its interaction with astrin in the kinetochore is important for normal spindle architecture and chromosome alignment [,
]. It interacts with MIS13 to orchestrate accurate interaction between kinetochore and dynamic spindle microtubules []. It also interacts with mitotic motor CENP-E to orchestrate accurate chromosome segregation in mitosis []. SKAP may also promote UV-induced cell apoptosis by negatively regulating the anti-apoptotic protein Prp19 [].
This family of proteins represents HapK, a protein of unknown function, with two homologues, PigK and RedY. The monomer structure of the protein contains a four-stranded antiparallel β-sheet, three α-helices and a short C-terminal tail which it uses for dimer formation [
]. The surface of HapK has a deep cavity that consists of a kinked helix and a β-four strand. HapK could be involved in prodigiosin biosynthesis, specifically the binding of a bipyrrole intermediate such as HBM or MBM [].
This entry represents Torsin-1A-interacting proteins 1 and 2 (TOIP 1/2), also known as LAP1 proteins (lamina-associated polypeptide 1), which are type 2 integral membrane proteins of the inner nuclear membrane with a single membrane-spanning region [
,
,
]. These proteins interact with and activate Torsin A, an AAA+ ATPase localized to the endoplasmic reticulum (ER), through a perinuclear domain and form a heterohexameric (LAP1-Torsin)3 ring that targets Torsin to the nuclear envelope. LAP1 has an atypical AAA+ fold and provides an arginine finger to the Torsin A active site to promote its ATPase activity [
,
]. A single mutation in Torsin A causes early onset primary dystonia, a painful and severely disabling neuromuscular disease [,
].
This entry represents the chromosomal protein MC1, which protects DNA against thermal denaturation and shapes DNA by binding to it [
,
]. Its global fold consists of a pseudo barrel with an extension of the β-sheet (beta4-beta5) forming an arm (LP5) []. Some uncharacterised virus proteins are also included in this entry.
Members of this family are related to cytosylglucuronate (CGA) synthase, and are found in the same clusters as CGA synthase and CGA decarboxylase. These clusters produce peptidyl nucleoside antibiotics with a pyranoside core moiety, found in a number of Streptomyces species. Removal of the S. griseochromogenes member of this family, BlsF, from a heterologous expression system caused an increase in, not a blockage of, blasticidin S [
].
Members of this family occur regularly as a partner to a member of family (
). Conserved motifs suggest enzymatic activity. Note that its frequent partner protein (
) has a three-cysteine motif that resembles the Cx3CxxC motif of radical SAM proteins, and that in one branch (
) actually becomes Cx3CxxC.
Regulatory protein E2 exists as a homodimer and interacts with a homodimer of E1 to improve the specificity of the DNA binding of E1. Once the complex is bound to DNA, E2 is released [
]. E2 also binds to the E2RE response element (5'-ACCNNNNNNGGT-3'), which is present in multiple copies in the viral genome. Binding of E2 to this element either activates or represses transcription [].
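The E2RE consensus given above, 5'-ACCNNNNNNGGT-3', can be searched for in a nucleotide sequence with a regular expression. The sketch below is illustrative only: the function name `find_e2re_sites` and the example sequence are hypothetical, and N is assumed to mean any of A, C, G or T.

```python
import re

# ACC, six arbitrary bases, GGT. The lookahead lets overlapping sites be found.
E2RE = re.compile(r"(?=(ACC[ACGT]{6}GGT))")


def find_e2re_sites(dna):
    """Return (0-based start, matched 12-mer) for each E2RE-like site."""
    seq = dna.upper()
    return [(m.start(), m.group(1)) for m in E2RE.finditer(seq)]


# Hypothetical sequence for illustration, not taken from a viral genome.
example = "TTACCGAAAACGGTAA"
print(find_e2re_sites(example))
```

Because the reverse complement of ACCNNNNNNGGT is again ACCNNNNNNGGT, scanning one strand is enough to locate sites on both strands. A consensus match indicates a candidate site only; it does not show that E2 binds there.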
Chromodomain-helicase-DNA-binding protein 1-like (also known as amplified in liver cancer 1, ALC1) is a SNF2 family ATPase and a chromatin-remodeling enzyme that interacts with poly(ADP-ribose) and catalyses PARP1-stimulated nucleosome sliding [
,
]. ALC1 is involved in DNA repair and in controlling the expression of several genes implicated in tumorigenesis and metastasis in mammals []. This entry also includes the probable helicase CHR10 from Arabidopsis.
Members of this bacterial protein family average 80 residues in length, and average nearly 6 Trp residues (two of which are invariant) in the first 45, which are strongly hydrophobic. Past this region, the protein is highly charged, with large numbers of Lys, Arg, Asp, and Glu residues. Members usually are divergently transcribed from a gene encoding a c-type cytochrome.
This family represents novel S-layer proteins that are widely distributed in Gammaproteobacteria, including species of Pseudoalteromonas and Vibrio, and are found exclusively in marine metagenomes. These proteins assemble into paracrystalline sheets with a unique square lattice symmetry [
].
This superfamily represents the surface-adhesion protein E. Adhesin E plays a role in pathogenesis [
]. It binds to host proteins including plasminogen, vitronectin and laminin [].
This family consists of several NPP1-like necrosis inducing proteins from oomycetes, fungi and bacteria. Infiltration of NPP1 into leaves of Arabidopsis thaliana plants results in transcript accumulation of pathogenesis-related (PR) genes, production of ROS and ethylene, callose apposition, and HR-like cell death [
]. Members of this entry are secreted effectors that act as pathogen-associated molecular patterns (PAMPs) recognised by the plant immune system [].
The nodulation genes of Rhizobia are regulated by the nodD gene product in response to host-produced flavonoids and appear to encode enzymes involved in the production of a lipo-chitose signal molecule required for infection and nodule formation. NodZ is required for the addition of a 2-O-methylfucose residue to the terminal reducing N-acetylglucosamine of the nodulation signal. This substitution is essential for the biological activity of this molecule. Mutations in nodZ result in defective nodulation. nodZ represents a unique nodulation gene that is not under the control of NodD and yet is essential for the synthesis of an active nodulation signal [].
This family consists of Baculovirus proteins and includes Autographa californica multiple nucleopolyhedrovirus (AcMNPV) protein AC81, which is required for nucleocapsid envelopment [
].
This family consists mainly of baculovirus proteins. The member from Autographa californica multiple nucleopolyhedrovirus (AcMNPV), protein Ac76, has been characterised, and is involved in intranuclear microvesicle formation [
].
The vanadium-binding protein Vanabin2 contains four α-helices connected by nine disulphide bonds. Vanadium accumulates in ascidians; however, the biological reason remains unclear [
].
MxiM, a Shigella pilot protein (also known as type 3 secretion system pilotin), is essential for the assembly and membrane association of the Shigella secretin MxiD. MxiM contains an orthologous secretin component and has a specific binding domain for the acyl chains of bacterial lipids [
]. The C-terminal domain of MxiD hinders lipid binding to MxiM [].
Bcl-2-like protein 15, also known as Bfk, is a member of the bcl2 gene family. It is highly expressed in the epididymis in mouse [
] and in tissues of the gastrointestinal tract, where it may help to protect against the development of gastrointestinal malignancy [].
Transmembrane protein 225 (TMEM225) is specifically expressed in testis [
]. It may be involved in the differentiation and function of spermatozoa through the regulation of PP1gamma2 activity [,
].
Dpl is a homologue of the prion protein (PrP). Dpl is toxic to neurons and is expressed in the brains of mice that do not express PrP. In DHPC and SDS micelles, Dpl shows about 40% α-helical structure; however, in aqueous solution it consists of a random coil. The α-helical segment can also adopt a transmembrane localisation in a membrane. The unprocessed Dpl protein is thought to possess a possible channel formation mechanism, which may be related to toxicity through direct interaction with, and damage to, cell membranes.
Human T-cell leukemia/lymphoma virus type 1 (HTLV-1)-encoded P30II is a nuclear-resident protein that inhibits virus expression by reducing Tax and Rex protein expression [
].
Spermatogenesis-associated protein 22 (Spata22) is required early in meiotic prophase in both male and female germ cells. It is necessary to achieve normal synapsis and for repair of meiotic DNA double-strand breaks [
].
This entry represents a family of phage capsid proteins. Proteins in this family are found in bacteria and phages and are approximately 280 amino acids in length.
N4BP3 was identified as a protein expressed during development that interacts with Nedd4 ubiquitin ligase. N4BP3 is not a ubiquitylation substrate of Nedd4, but it can alter Nedd4 subcellular location, indicating a functional interaction [
]. N4BP3 has been shown to be essential in neurite branching and circuit formation [].
This entry represents the uncharacterised TMEM128 protein which contains four transmembrane helices. This protein is resident in the endoplasmic reticulum.
This family consists of kinesin-associated protein 3 (KAP3, also known as SMAP). In human and mouse, KAP3 is involved in tethering the chromosomes to the spindle pole and in chromosome movement. It binds to the tail domain of the KIF3A/KIF3B heterodimer to form a heterotrimeric KIF3 complex and may regulate the membrane binding of this complex [
,
].
This entry represents the bacterial flagellar FliT family of dual-function proteins. Together with FlgN, FliT has been proposed to act as a substrate-specific export chaperone, facilitating the incorporation of the enterobacterial hook-associated axial proteins (HAPs) FlgK/FlgL and FliD into the growing flagellum [
]. FliT has also been shown to act as a transcriptional regulator in Salmonella typhimurium [].
This entry represents the C-terminal domain of protein furry (Fry). Fry plays a crucial role in the structural integrity of mitotic centrosomes and in the maintenance of spindle bipolarity. This domain binds to polo-like kinase 1 (Plk1) through the polo-box domain (PBD) of Plk1 in a manner dependent on the cyclin-dependent kinase 1-mediated Fry phosphorylation, promoting Plk1 activity during early mitosis. Fry also binds to Aurora A and may function as a scaffold promoting the interaction between AURKA and PLK1, thereby enhancing AURKA-mediated PLK1 phosphorylation [
].
This family consists of ataxin-3 (Machado-Joseph disease protein 1) and ataxin-3 homologues. They are deubiquitinating enzymes from peptidase family C86 implicated in protein quality control pathways and transcriptional regulation [
,
,
,
]. Ataxin-3 contains an N-terminal Josephin domain followed by tandem ubiquitin (Ub)-interacting motifs (UIMs) and a polyglutamine stretch. The NMR structure of the Josephin domain of human ataxin-3 (MEROPS identifier C86.001) shows a papain-like fold similar to that found in other deubiquitinases in families C12 and C19 []. Human genes containing triplet repeats can markedly expand in length, leading to neuropsychiatric disease. Expansion of triplet repeats explains the phenomenon of anticipation, i.e. the increasing severity or earlier age of onset in successive generations in a pedigree [
]. The gene for ataxin-3 contains CAG repeats and has been identified and mapped to chromosome 14q32.1, the genetic locus for Machado-Joseph disease (MJD). Normally, the gene contains 13-36 CAG repeats, but most clinically diagnosed patients and all affected members of a family with the clinical and pathological diagnosis of MJD show expansion of the repeat number to 68-79 []. Similar abnormalities in related genes may give rise to diseases similar to MJD. MJD is a neurodegenerative disorder characterised by cerebellar ataxia, pyramidal and extra-pyramidal signs, peripheral nerve palsy, external ophthalmoplegia, facial and lingual fasciculation and bulging. The disease is autosomal dominant, with late onset of symptoms, generally after the fourth decade. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [
]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [
]. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [
]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [
]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
This entry represents the presumed VPg protein from human astrovirus. Viral genome-linked proteins (VPgs) are small virus-encoded proteins that are covalently linked to the 5' terminus of many RNA viral genomes through a phosphodiester bond. VPgs have been identified in several single-stranded positive-sense RNA virus families. The protein resulting from this putative VPg coding region is a highly disordered protein [
]. A common feature of VPgs is that they are rich in basic amino acids [mostly Lys (K), Gly (G), Thr (T), and Arg (R)], which favors the interaction with the negatively charged RNA. Tyr-693 at the conserved TEEEY-like motif has been postulated to be the residue responsible for the covalent linkage to viral RNA. Mutagenesis of Tyr-693 in the VPg protein is lethal for HAstV replication [
].
The lipid-binding protein (LBP) family can be divided into four subfamilies. Subfamily I comprises proteins specific for vitamin A derivatives, and can be subdivided into the cellular retinoic acid-binding proteins (CRABP-I and II) and the cellular retinol-binding proteins (CRBP-I, II, III, and IV) [
]. Cellular retinol-binding protein 1 (CRBP-I) is thought to be essential for vitamin A metabolism and synthesis of retinoic acid [
,
].
Myelin protein P2 is a constituent of peripheral nervous system (PNS) myelin [
], also present in small amounts in central nervous system (CNS) myelin []. As a structural protein, P2 is thought to stabilise the myelin membranes []. Structurally, P2 belongs to the family of cytoplasmic fatty acid binding proteins (FABPs) [,
].
Nucleoporin Nup121, or pore membrane protein POM121, is an essential component of the nuclear pore complex (NPC). It has a repeat-containing domain that may be involved in anchoring components of the pore complex to the pore membrane. When overexpressed in cells, POM121 (Nup121) induces the formation of cytoplasmic annulate lamellae (AL) [
]. This entry includes POM121-like protein 1 and related proteins.
This entry represents a group of CCCH-type zinc finger proteins, including ZFP36/ZFP36L1/ZFP36L2 from animals, Zfs1 from fission yeasts and CTH1/2 from budding yeasts. ZFP36/ZFP36L1/ZFP36L2, Zfs1 and CTH1/2 are mRNA decay activator proteins that bind to specific AU-rich elements (ARE) in the 3'-untranslated region of target mRNAs and promote their degradation [
,
,
]. Interestingly, ZC3H11 from Trypanosoma brucei brucei, also included in this entry, binds AU-rich regions in the 3'-UTR of mRNAs and promotes their stabilisation by recruiting a MKT1-containing complex [,
]. This entry also includes some uncharacterised plant CCCH-type zinc finger proteins, among which Arabidopsis HUA1 may be involved in RNA processing [
].
PARP4, also known as vPARP, is the largest member of the poly-ADP-ribose polymerase (PARP) family. PARP4 exhibits PARP activity and can PARylate major vault protein (MVP) and, to a lesser extent, itself. Vaults are the largest ribonucleoprotein particles found in eukaryotic cells [
]. Poly(ADP-ribosyl)ation (PARylation) is a post-translational modification of proteins [
]. PARP family members are PAR polymerases that initiate the reaction by converting the substrate nicotinamide adenine dinucleotide (NAD+) to ADP-ribose, and then catalyse ADP-ribose polymerisation on nuclear acceptor proteins [].
This entry includes protein LRATD1 (also known as FAM84A) and LRATD2. LRATD1 has been linked to cancer progression
[,
]. LRATD2 is expressed in esophageal squamous cell carcinomas [].
This family includes tubulin polymerization-promoting proteins (TPPP/p25), formerly known as 25 kDa proteins, that are phosphorylated by a Ser/Thr-Pro kinase [
]. Proteins in this family, including TPPP and TPPP3 from human, are regulators of microtubule dynamics [,
]. TPPP plays a key role in myelination by promoting elongation of the myelin sheath [] and acts as a microtubule nucleation factor in oligodendrocytes; it specifically localizes to the postsynaptic Golgi apparatus region, and promotes microtubule nucleation [,
]. TPPP3 is required for embryo implantation [].
This is a family of 15 kDa salivary proteins (Salp15) from Acari arachnids that are induced on feeding and assist the parasite in remaining attached to its arthropod host. By repressing calcium fluxes triggered by TCR engagement, Salp15 inhibits CD4+ T cell activation and interleukin (IL)-2 production [
,
]. It downregulates the host immune system by binding to both dendritic cells and CD4+ T cells [,
]. Salp15 shows weak similarity to Inhibin A, a member of the TGF-beta superfamily that inhibits the production of cytokines and the proliferation of T cells.
Representatives of this viral protein domain are found in the vp7 capsid protein of Bluetongue virus [
], and African horsesickness virus [], the vp6 capsid protein of Bovine rotavirus [], and in the haemagglutinin protein of various influenza viruses [,
]. The vp7 and vp6 capsid proteins each consist of two domains, one a beta sandwich, and the other an alpha helical bundle. The beta sandwich domain described here is structurally very similar in vp6 and vp7, and may be involved in the attachment of the virus to the cell. In influenza A and B viruses, the haemagglutinin membrane glycoprotein serves to recognise the cell surface receptor sialic acid, and this domain forms the head region. In influenza C virus, a single multifunctional glycoprotein, the haemagglutinin-esterase-fusion protein, possesses a haemagglutinin domain, which recognises the cell surface receptor 9-O-acetylsialic acid, and bears strong structural resemblance to the haemagglutinin protein of influenza A and B viruses, as well as to the vp6 and vp7 capsid proteins. In each case, the domain consists of a β-sandwich, in which the strands making up the sheets exhibit a jellyroll fold. The resultant proteins form trimers.
Ebola virus sp. are non-segmented, negative-strand RNA viruses that cause severe haemorrhagic fever in humans with high rates of mortality. The virus matrix protein VP40 is a major structural protein that plays a central role in virus assembly and budding at the plasma membrane of infected cells. VP40 proteins associate with cellular membranes, interact with the cytoplasmic tails of glycoproteins, and bind to the ribonucleoprotein complex. The VP40 monomer consists of two domains, the N-terminal oligomerization domain and the C-terminal membrane-binding domain, connected by a flexible linker. Both the N- and C-terminal domains fold into beta sandwich structures of similar topology [
]. Within the N-terminal domain are two overlapping L-domains with the sequences PTAP and PPEY at residues 7 to 13, which are required for efficient budding []. L-domains are thought to mediate their function in budding through their interaction with specific host cellular proteins, such as tsg101 and vps-4 [].
This entry represents a group of Rab GTPase-activating proteins (RabGAPs), including TBC20 from humans and Gyp8 from Saccharomyces cerevisiae. They contain the TBC/rab GTPase-activating protein (GAP) domain. TBC20 can accelerate the intrinsic GTP hydrolysis rate by more than five orders of magnitude for Rab1 and Rab2 small GTPase families [
,
,
]. This protein has been linked to numerous diseases due to its critical role in cellular processes [,
,
]. It has also been shown to be essential for the replication and assembly of hepatitis C virus (HCV) []. Gyp8 is involved in the regulation of ER to Golgi vesicle transport in yeast, mainly by activating the Ypt/Rab-GTPase Ypt1 [
].
This entry includes chordin-like protein 1/2 (CHRDL1/2). CHRDL1 antagonizes the function of BMP4 by binding to it and preventing its interaction with receptors [
]. CHRDL2 inhibits the activity of bone morphogenetic proteins (BMPs) by blocking their interaction with their receptor []. CHRDL2 may play a role during myoblast and osteoblast differentiation and maturation []. Reports differ regarding its binding partners; according to [], CHRDL2 interacts with INHBA but not BMP2, BMP4 and BMP6. Bone is composed of an organic matrix that is principally collagenous and is mineralised with inorganic crystals of hydroxyapatite. Demineralisation of the bone results in a demineralised bone matrix [
]. In 1965, M.R. Urist demonstrated that decalcified bone matrix (DBM) can induce ectopic bone/cartilage formation []. The morphogens in DBM that induce such skeletogenesis were named bone morphogenetic proteins (BMPs) []. Since then, more than 20 BMPs have been identified. BMPs are potent effectors in almost all crucial biological events, such as the development, maintenance and regeneration of organs/tissues [].
This entry represents a group of plant SPX domain-containing proteins, including AtSPX1-6 from Arabidopsis. AtSPX1 and AtSPX2 have a cellular Pi-dependent inhibitory effect on Phosphate Starvation Response 1 (PHR1) [
]. Their rice homologue, OsSPX1, may play an important role in linking cold stress and Pi starvation signal transduction pathways [
].
This is a family of plant-specific transcription factors including protein indeterminate-domain 1-16 (IDD1-16) from Arabidopsis containing the conserved INDETERMINATE DOMAIN (IDD) with four zinc finger motifs. These proteins are involved in regulation of multiple developmental processes [
,
]. IDD8/NUTCRACKER (NUC), IDD3/MAGPIE (MAG), and IDD10/JACKDAW (JKD) regulate root development and patterning [,
,
]. IDD8/NUC also regulates photoperiodic flowering by modulating sugar transport and metabolism []. IDD15/SHOOT GRAVITROPISM 5 (SGR5) is involved in the gravitropic response of inflorescence stems. INDETERMINATE1 plays an important role in regulating the transition to flowering in maize []. This family also includes similar proteins from rice, such as EHD2, which are transcription activators that act as flowering master switches in both long and short days, independently of the circadian clock [,
].
This protein appears to be a member of a somewhat larger structural family conserved in Firmicutes. It displays structural similarity to the N-terminal domain of YycH, which plays a role in signal transduction and is found immediately downstream of the essential histidine kinase YycG. YycG forms a two-component system together with its cognate response regulator YycF. YycH functions as a modulator of YycG activity, possibly by interacting with YycI. All three molecules (YycG, YycH, and YycI) have been characterized as membrane proteins, and they may be able to form homodimers [
].
The Cin1 (cellulose-induced protein 1) repeat protein of Venturia inaequalis, the fungus responsible for scab disease of apple, includes eight cysteine-rich repeats and is greatly upregulated within the plant and on cellophane membranes. The crystal structure reveals a pair of disulfide bridges in each repeat. The repeats have been described as adopting a beads-on-a-string organization. The function of Cin1 is undetermined; however, the α-helical structure may be involved in protein-protein or protein-carbohydrate interactions in the extracellular matrix [].