SVIP, small VCP/p97-interacting protein, was identified by yeast two-hybrid screening as an interacting partner of VCP/p97. Mammalian VCP/p97 and its yeast counterpart Cdc48 participate in the formation of organelles, including the endoplasmic reticulum (ER), Golgi apparatus, and nuclear envelope. Over-expression of SVIP caused the formation of large vacuoles that seemed to be derived from the ER [
]. Proteins in this family have two putative coiled-coil regions and are approximately 80 amino acids in length.
Contact-dependent growth inhibition (CDI) is a mechanism of inter-cellular competition in which Gram-negative bacteria exchange polymorphic toxins using type V secretion systems. Structure analysis of the CDI toxin from Escherichia coli NC101 reveals that it has moderate structural homology to Whirly-like proteins found in plastids, but appears to lack the characteristic Whirly RNA-binding site [
].
This entry represents the predicted immunity protein with an alpha+beta fold and a conserved glutamate residue. This protein is often fused to one or more immunity domains in poly-immunity proteins [
].
This entry represents a predicted immunity protein with an alpha+beta fold and several conserved polar and hydrophobic residues. Proteins containing this domain are present in heterogeneous polyimmunity loci in polymorphic toxin systems [
].
The structures of the immunity protein Rap1a, responsible for the inhibition and neutralization of Ssp1 endopeptidase, revealed two distinct folds. The structure of the Ssp1-Rap1a complex revealed a tightly bound heteromeric assembly with two effector molecules flanking a Rap1a dimer. The Rap1a subunit displays a compact globular structure constructed from five α-helices that assemble to form the highly stable symmetric dimer [
].
The temperate bacteriophage P2 has four defined tail genes: V, J, W and I. Their order is the late gene promoter, VWJI, followed by the tail fibre genes H and G and then a transcription terminator. BAP V protein is the small spike at the tip of the tail and basal plate assembly protein J lies at the edge of the baseplate [
]. This family also includes a number of bacterial homologues, which are thought to have been horizontally transferred.
This is a family of Baculovirus proteins including protein AC142, which is expressed in the cytoplasm and nucleus throughout infection. It is required for nucleocapsid envelopment in the budding virus to form the occlusion-derived virus and subsequent embedding of virions into polyhedra [
].
Adhesin E plays a role in pathogenesis [
]. It binds to host proteins including plasminogen, vitronectin and laminin []. This entry also includes putative uncharacterised protein YjhD from E. coli.
Pinin (PNN) has been identified as central in the establishment and maintenance of corneal epithelial cell-cell adhesion. Three SR-rich proteins have been identified that interact with the C terminus of PNN; one of them is this 130 kDa nuclear protein, SRrp130, also known as PNISR [
]. SRrp130 is a protein of 805 amino acid residues with multiple arginine-serine (RS) repeats but no RNA recognition motif []. Its function is unknown.
C-ets transcription factors are highly conserved members of the ets gene family, with amino acid similarity to the v-ets oncogene of the avian leukemia virus, E26 [
]. They bind to unique DNA sequences, either alone or by association with other proteins [].
C-ets-2, a proto-oncogene and transcription factor, is a member of the ets gene family. It has amino acid similarity to the v-ets oncogene of the avian leukemia virus, E26 [
]. Human and mouse C-ets-2 has been shown to have a role in skeletal development [].
TNF-alpha-inducing protein (Tipalpha) is secreted from Helicobacter pylori as dimers and enters the gastric cells. It binds to DNA via the positively charged surface-patch formed between the two monomers of the crystal structure by the loop between helices alpha1 and alpha2. Each monomer consists of a helical domain and a mixed domain [
,
].
Members of this family are found in the parasite Babesia bigemina. Other rhoptry-associated proteins are found in Plasmodium falciparum but these do not belong to this family. Animal infection with B. bigemina may produce a pattern similar to human malaria [
]. Rhoptry organelles form part of the apical complex in apicomplexan parasites. Rhoptry-associated proteins are antigenic, and generate partially protective immune responses in infected mammals. Thus RAPs are among the targeted vaccine antigens for babesial (and malarial) parasites. However, RAP-1 proteins are encoded by a multigene family; thus RAP-1 proteins are polymorphic, with B and T cell epitopes that are conserved among strains, but not across species [
,
,
]. Antibodies to B. bigemina RAP-1 may also be helpful in the serological detection of B. bigemina infections [].
Pim is the immunity protein produced by Yersinia pestis and other Gammaproteobacteria to protect themselves against the bacteriostatic activity of the toxin pesticin [
].
This group represents a SPARC-like (Secreted Protein Acidic and Rich in Cysteine) protein 1 (SPARCL1; also known as Hevin). SPARCL1 is a secreted glycoprotein belonging to the SPARC family of matricellular proteins []. SPARCL1 is downregulated in various tumours and may have a tumour-suppressor function [,
].
TTLL10 is a polyglycylase, which modifies both tubulin and non-tubulin proteins, generating side chains of glycine on the γ-carboxyl groups of specific glutamate residues of target proteins [
]. It polyglycylates alpha tubulin and beta tubulin, but is not able to initiate glycylation and only has activity toward monoglycylated tubulin [
]. It has the ability to polyglycylate non-tubulin proteins such as NAP1; in this case it can initiate glycylation and does not require preliminary monoglycylation by another glycylase []. This family also includes
, which is inactive [
].
Proteins in this family contain the START domain. Their function is not clear. START (StAR-related lipid-transfer) is a lipid-binding domain in StAR, HD-ZIP and signalling proteins [
]. StAR (Steroidogenic Acute Regulatory protein) is a mitochondrial protein that is synthesised in response to luteinising hormone stimulation []. Expression of the protein in the absence of hormone stimulation is sufficient to induce steroid production, suggesting that this protein is required in the acute regulation of steroidogenesis. Representatives of the START domain family have been shown to bind different ligands such as sterols (StAR protein) and phosphatidylcholine (PC-TP). Ligand binding by the START domain can also regulate the activities of other domains that co-occur with the START domain in multidomain proteins such as Rho-GAP, the homeodomain, and the thioesterase domain [,
].
This is a group of uncharacterised multi-pass membrane proteins found in eukaryotes. Proteins in this family are typically between 150 and 211 amino acids in length.
Contact-dependent growth inhibition (CDI) is an important mechanism of inter-bacterial competition found in many Gram-negative pathogens. CDI+ cells express cell-surface CdiA proteins that bind neighbouring bacteria and deliver C-terminal toxin domains (CdiA-CT) to inhibit target-cell growth. CDI+ bacteria also produce CdiI immunity proteins, which specifically neutralize cognate CdiA-CT toxins to prevent self-inhibition. Structure analysis of CdiI immunity protein from Yersinia kristensenii shows that it is composed of eight α-helices packed together to form a nearly spherical structure with weak structural homology to a putative TetR family transcriptional repressor. The CdiI protein fits into the curved cavity of the CdiA-CTYkris toxin domain where it most likely neutralizes toxin activity by blocking access to RNA substrates [
]. This domain is mostly found in gammaproteobacteria.
The assembly of a macromolecular structure proceeds via a specific pathway of ordered events and involves changes in protein conformation as subunits join the assembly. The assembly process is aided by scaffolding proteins, which act as chaperones. In bacteriophages, scaffolding proteins B and D are responsible for procapsid formation. The external scaffold is formed by 240 copies of protein D, while 60 copies of protein B form the internal scaffold [
].
The avirulence protein ATR13 is expressed by the plant-pathogenic oomycete Hyaloperonospora. Phytopathogenic oomycetes such as Hyaloperonospora arabidopsidis (Hpa), which infects Arabidopsis, grow intercellularly, forming parasitic structures called haustoria. Haustoria play a role in feeding and in suppression of host defence systems. A whole range of pathogen proteins, called effectors, are secreted across the haustorial membrane, a subset of which are further translocated across the plant plasma membrane by an unknown mechanism that is present in both plants and animals. ATR13 is an RxLR effector from the downy mildew oomycete, and is a very dynamic protein. It contains two surface-exposed patches of polymorphism, one of which is involved in specific recognition by host R-gene products. The R-gene products detect the presence of the infection by recognising the effector proteins. Once an effector is detected, the R-gene products trigger apoptosis of the host cell. The effector proteins carry a specific motif, RxLR [].
This family consists of predicted immunity proteins with an alpha+beta fold. They are encoded by genes present in bacterial polymorphic toxin systems as an immediate gene neighbour of the toxin gene, which usually contains toxin domains of the Tox-ARC family. They are also found in heterogeneous polyimmunity loci [
].
This family includes Wadjet protein JetA, a component of the anti-plasmid transformation system Wadjet type I, composed of JetA, JetB, JetC and JetD. Expression of Wadjet type I in B. subtilis (strain BEST7003) reduces the transformation efficiency of plasmid pHCMC05 [
].
Keratinocyte differentiation-associated protein (KRTDAP) is secreted by keratinocytes and may serve as a soluble regulator of keratinocyte differentiation [
].
Proteins in this family are uncharacterised single-pass membrane proteins found in eukaryotes. Proteins in this family are typically between and 302 amino acids in length. There are two conserved sequence motifs: QDC and RLF. The function of this family is unknown.
This entry represents a predicted immunity protein with an alpha+beta fold and conserved tyrosine and tryptophan residues. Proteins in this entry are present in bacterial polymorphic toxin systems as an immediate gene neighbour of the toxin gene, which usually contains toxin domains of the Tox-REase-10 family [
].
This entry represents Sin3 binding proteins conserved in fungi. Sin3p does not bind DNA directly even though the yeast SIN3 gene functions as a transcriptional repressor. Sin3p is part of a large multiprotein complex [
]. Stb3 appears to bind directly to ribosomal RNA Processing Elements (RRPE) although there are no obvious domains which would accord with this, implying that Stb3 may be a novel RNA-binding protein [].
This is a family of Chordopoxvirus proteins composing one of the two subunits that make up VITF-3, a virally encoded complex necessary for intermediate stage transcription [
].
This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene []. The radial spoke head proteins are important in maintaining normal movement in motile, "9+2"-structure cilia and flagella. Mutations of the human RSPH9 and RSPH4A genes have been linked to primary ciliary dyskinesia, a genetically heterogeneous inherited disorder arising from dysmotility of motile cilia and sperm [
]. RSPH6A has also been shown to be required for sperm flagellum formation and male fertility in mice [].
The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products [
]. This family represents the VirD5 protein.
NleF (Non-LEE-encoded type III effector F) is an effector protein that alters host cell physiology and promotes bacterial survival in host tissues. It binds to and inhibits caspase-9, caspase-8 and caspase-4, thereby preventing caspase-induced apoptosis in the host cell [
,
].
This entry represents a conserved sequence region found in a family of poorly characterised fungal proteins, including YTP1 from S. cerevisiae. It appears to contain regions similar to mitochondrial electron transport proteins. The C-terminal domain is hydrophobic and negatively charged. There are consensus sites for both N-linked glycosylation and cAMP-dependent protein kinase phosphorylation [
].
This entry represents a conserved region found in fungal conidiation-specific protein 6 [
]. This protein is expressed approximately 6 hours after the induction of development and is induced just prior to major constriction-chain growth [].
This family consists of Vpr-like accessory proteins from maedi-visna and caprine/ovine lentivirus. This small open reading frame (ORF) in maedi-visna virus (MVV) and caprine arthritis encephalitis virus (CAEV) was initially named "tat" by analogy with a similarly placed ORF in the primate lentiviruses [
,
].
PTIP-associated protein 1 (PA1), also known as PAXIP1-associated-protein-1, is found in eukaryotes. PA1 and PTIP form a stable complex, which is recruited to DNA damage sites via the RNF8-dependent pathway and is required for cell survival in response to DNA damage [
].
This family consists of predicted immunity proteins with a mostly all-beta fold and a conserved GxS motif. These proteins are present in bacterial polymorphic toxin systems as an immediate gene neighbour of the toxin gene, usually containing a domain of the Ntox17 or Ntox7 families [
].
This entry represents proteins similar to the Escherichia coli effector protein NleD. Enteropathogenic E. coli infection can trigger an inflammatory response in host cells. This inflammatory response can then be inhibited by the injection into the host cells of several effectors that block the NF-kappa B pathway. One of these effectors from E. coli, NleD, cleaves and inactivates c-Jun N-terminal kinase (JNK), which is a serine/threonine kinase that affects the regulation of cellular proliferation, apoptosis, inflammation, and tumorigenesis [
]. NleD contains an HEXXH motif, typical of zinc metallopeptidases [].
This family consists of YgfN, whose function is not clear. YgfN was identified as a molybdenum-binding subunit of a putative molybdopterin-containing selenate reductase complex consisting of proteins YgfK, YgfM and YgfN [
,
]. However, another study showed that a mutation in the gene (b2881) encoding this protein conferred sensitivity to adenine, which is associated with impaired synthesis of guanine nucleotides from adenine during purine salvage []. YgfN is also known as XdhD, as it has been suggested to be a molybdenum-containing protein of the xanthine oxidase family [].
The antiactivator protein ExsD represses the transcriptional activator ExsA. ExsA activates expression of type III secretion system genes. ExsD inhibits the DNA-binding and self-association properties of ExsA [
]. Repression of ExsA by ExsD is relieved by the secretion chaperone ExsC [,
].
YdiH is a family of proteins found in bacteria. Proteins in this family are typically between 62 and 80 amino acids in length. The function is not known.
This entry represents the centromere protein T (CENP-T). CENP-T is a component of the CENPA-NAC (nucleosome-associated) complex, which plays a central role in assembly of kinetochore proteins, mitotic progression and chromosome segregation [
,
]. CENP-T and CENP-W form a complex that is directly involved in establishment of centromere chromatin structure coordinately with CENP-A []. Histone-fold-containing CENP-T-W and CENP-S-X complexes can form a stable CENP-T-W-S-X heterotetramer. Structural analysis of the CENP-T-W-S-X complex suggests that it can form a unique nucleosome-like structure to generate contacts with DNA []. The CENP-T homologue in S. pombe is also known as Cnp20 [
].
DMP12 is a DNA-mimic protein from Neisseria species. In its monomeric form DMP12 interacts with the Neisseria dimeric form of the bacterial histone-like protein HU. HU proteins promote the assembly of higher-order DNA-protein structures. The interaction between DMP12 and HU may be instrumental in controlling the stability of the nucleoid in Neisseria, as DMP12 prevents Neisseria HU protein from being digested by trypsin [
].
This entry represents packaging protein 3 (also known as L1) from adenovirus. It is involved in viral genome packaging through its interaction with packaging proteins 1 and 2 [
,
].
This entry includes uncharacterised sequences found in bacteria and viruses. The viral sequences are annotated as baseplate proteins, a structural component of the phage baseplate, mostly from Siphoviridae [
].
This is a family of proteins found in eukaryotes that consists of proline-rich 15 (PRR15) and proline-rich 15-like proteins. PRR15 is expressed almost exclusively in post-mitotic cells both during foetal development and in adult tissues, such as the intestinal epithelium and the testis. Its expression in mouse and human gastrointestinal tumours is linked, directly or indirectly, to the disruption of the Wnt signaling pathway [
].
This family consists of predicted immunity proteins with an alpha+beta fold and a conserved tryptophan, and WE and PGW motifs. Proteins containing this domain are present in bacterial polymorphic toxin systems as an immediate gene neighbour of the toxin gene, usually containing a domain of the Ntox24 or Ntox10 families [
].
SmbP (small metal-binding protein) is a histidine-rich protein that is capable of binding multiple equivalents of a variety of divalent and trivalent metals, including Cu2+ and Fe3+ but also Mn2+, Ni2+, Mg2+ and Zn2+. This protein has been found to bind up to six Cu(II) atoms. It may have a role in cellular copper management in the ammonia-oxidizing bacterium N. europaea [
].
This group represents the fimbrillin protein MatB [
]. It is also known as EcpA, and is part of the ecpRABCDE operon, which encodes the E. coli common pilus (ECP), an adhesive structure that is produced by all E. coli pathogroups [] and plays a dual role in early-stage biofilm development and host cell recognition [].
This family includes proliferation-associated protein 2G4 (PA2G4, also known as ErbB3-binding protein 1) and its homologues such as ARX1 from yeast [
,
]. PA2G4 is an RNA-binding protein involved in growth regulation. This protein is present in pre-ribosomal ribonucleoprotein complexes and may be involved in ribosome assembly and the regulation of intermediate and late steps of rRNA processing [,
]. This protein can interact with the cytoplasmic domain of the ErbB3 receptor and may contribute to transducing growth regulatory signals []. This protein is also a transcriptional co-repressor of androgen receptor-regulated genes and other cell cycle regulatory genes through its interactions with histone deacetylases [,
]. ARX1 from yeast is involved in proper assembly of pre-ribosomal particles during the biogenesis of the 60S ribosomal subunit []. Although these proteins belong to the peptidase M24 family, they do not contain metal cofactors and lack aminopeptidase activity.
Hop1 is a key structural component of the Saccharomyces cerevisiae synaptonemal complex, a structure that forms between homologous chromosomes at synapsis during meiosis, when sister chromatids condense upon axial elements [
,
]. Hop1 is involved in both gene conversion and crossing over between homologues, and also enforces meiotic recombination checkpoint control over the progression of recombination intermediates. It interacts with the Holliday junction, changes its global conformation and blocks the dissolution of the junction by a RecQ helicase in vitro []. Red1, Hop1 and Mek1 are three yeast meiosis-specific chromosomal proteins that uphold the interhomologue (IH) bias of meiotic recombination [
]. It has been suggested that Hop1 and Mek1 promote interactions between homologous chromosomes rather than inhibiting interactions between sister chromatids []. The localisation of Hop1 and Mek1 to axial elements is dependent on Red1 []. Hop1 contains an intrinsically disordered N-terminal domain and a protease-resistant C-terminal domain. The N-terminal domain is necessary for spore formation, while the C-terminal domain exhibits strong homotypic as well as heterotypic protein-protein interactions [
].
Protein Spindly is required for the localisation of dynein and dynactin to the mitotic kinetochore [
]. It is required for silencing the spindle assembly checkpoint [,
]. It localises to microtubule plus ends in interphase and to kinetochores during mitosis []. The localisation of human Spindly (hSpindly) to kinetochore-microtubules is controlled by the Rod/Zw10/Zwilch (RZZ) complex and Aurora B [,
].
Members of this family have a perfect 4Fe-4S binding motif C-x(2)-C-x(2)-C-x(3)-CP followed by either a perfect or imperfect (the first Cys replaced by Ser) second copy. Members probably bind two 4Fe-4S iron-sulphur clusters.
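The motif description above maps naturally onto a regular expression. The following is a minimal sketch, assuming that each x(n) stands for exactly n arbitrary residues; the function name `find_cluster_motifs` and the toy sequence are hypothetical and are not taken from any member of this family.

```python
import re

# C-x(2)-C-x(2)-C-x(3)-CP, allowing the first Cys to be replaced by Ser
# so that both the perfect and the imperfect copy are picked up.
RELAXED_MOTIF = re.compile(r"[CS].{2}C.{2}C.{3}CP")


def find_cluster_motifs(seq):
    """Return (start, matched_text, kind) for each motif copy in seq.

    kind is "perfect" when the copy starts with Cys and "imperfect"
    when the first Cys is replaced by Ser.
    """
    hits = []
    for match in RELAXED_MOTIF.finditer(seq):
        text = match.group()
        kind = "perfect" if text.startswith("C") else "imperfect"
        hits.append((match.start(), text, kind))
    return hits


# Hypothetical toy sequence with one perfect and one imperfect copy.
toy = "MA" + "CAACGGCKLMCP" + "AAA" + "STTCQQCRRRCP" + "G"
for start, text, kind in find_cluster_motifs(toy):
    print(start, text, kind)
```

Scanning with the relaxed pattern and classifying afterwards keeps the two cases in one pass; a stricter scan would use `C` instead of `[CS]` at the first position.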
This domain is found in chromosomal and plasmid partition proteins related to ParB, including Spo0J, RepB, and SopB. ParB is involved in chromosome partition [
]. It localises to both poles of the predivisional cell following completion of DNA replication. ParB binds to DNA sequences adjacent to the origin of replication suggesting that this region is tethered to the poles of the cell at a specific time in the cell cycle. Spo0J has been shown to bind a specific DNA sequence that, when introduced into a plasmid, can serve as partition site. Study of RepB, which has nicking-closing activity, suggests that it forms a transient protein-DNA covalent intermediate during the strand transfer reaction.
This entry contains proteins related to the phage/prophage-immunity proteins and includes the immunity (imm) gene of the Escherichia coli bacteriophage T4. The Imm protein affects the exclusion of phage superinfecting cells already infected with T4. Phage which were excluded upon infection of cells possessing a plasmid-encoded Imm protein ejected only about one-half of their DNA. Therefore, the Imm protein inhibited, directly or indirectly, DNA ejection [
].
This family of proteins are variously described as 'hypothetical protein yifB', 'competence protein', 'hypothetical protein' or 'Mg chelatase-related protein'. Sequence comparison shows that YifB is closest to the chelatase family [
]. This family includes ComM, a protein that is induced during competence development [].
An impressive property of mussels is their ability to stick to wet surfaces. Exactly how they do this is unclear, but they are known to exploit bundles of threads, each of which has a fibrous collagenous core coated with adhesive proteins []. These proteins are able to displace water from a wet surface and then set to form tight junctions. The adhesive protein of Mytilus coruscus (Sea mussel) contains 848 amino acids, including a 20-residue signal peptide, a 21-residue non-repetitive linker and a repetitive domain that constitutes the bulk of the protein. The representative repeat motif of this domain, YKPK(I/P)(S/T)YPP(T/S), is similar to that of Mytilus galloprovincialis (Mediterranean mussel). The codon usage patterns for the same amino acids differ in different positions of the decapeptide motif [
,
]. Almost identical nucleotide sequences appear several times in the repetitive region, suggesting that mussel adhesive protein genes have evolved through repeat duplication []. The repeat motif is reminiscent of repeat units found in extensins, a group of plant proteins involved in the strengthening of the cell wall in response to mechanical stress.
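The decapeptide repeat YKPK(I/P)(S/T)YPP(T/S) can be located in a protein sequence with a simple pattern search. This is a minimal sketch, assuming exact matches to the representative motif only (real repeats may deviate from it); `find_decapeptide_repeats` and the toy sequence are hypothetical names and data introduced for illustration.

```python
import re

# YKPK(I/P)(S/T)YPP(T/S): alternatives at positions 5, 6 and 10
# become character classes.
DECAPEPTIDE = re.compile(r"YKPK[IP][ST]YPP[TS]")


def find_decapeptide_repeats(seq):
    """Return (start, repeat) for each non-overlapping exact repeat."""
    return [(m.start(), m.group()) for m in DECAPEPTIDE.finditer(seq)]


# Hypothetical toy sequence: two exact repeats, then a degenerate stretch.
toy = "YKPKISYPPT" + "YKPKPTYPPS" + "AKPSYPPTYK"
repeats = find_decapeptide_repeats(toy)
print(len(repeats), repeats)
```

Counting exact matches gives only a lower bound on the number of repeat units, since diverged copies are skipped.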
This entry includes a group of basic-leucine zipper domain containing proteins, including CCAAT/enhancer-binding protein (C/EBP) from animals and BZIP23 (AT2G16770) and BZIP19 (At4g35040) from Arabidopsis. They are leucine-zipper transcription factors that bind DNA in a sequence-specific manner.
Drosophila C/EBP may be required for the expression of gene products mediating border cell migration [
]. Arabidopsis BZIP19 and BZIP23 regulate the adaptation to zinc deficiency []. Mammalian C/EBP consists of CEBPalpha/beta/delta/epsilon/gamma. C/EBP-alpha coordinates proliferation arrest and the differentiation of myeloid progenitors, adipocytes, hepatocytes, keratinocytes, and cells of the lung and placenta []. C/EBP-beta is important in the differentiation and maturation of adipocytes and is increased during ER stress and proinflammatory conditions [].
MauM ferredoxin-type protein is involved in methylamine utilization [
]. NapG ferredoxin-type protein is associated with nitrate reductase activity []. The two proteins are highly similar.
Periplasmic nitrate reductase (NapABC enzyme) is responsible for nitrate dissimilation [
]. NapF protein is the auxiliary protein of the Nap systems. It interacts with the catalytic subunit, NapA, and may be an accessory protein for NapA maturation [].
This group represents a phycobilisome rod linker polypeptide, phycocyanin-associated. They are linker polypeptides that determine the state of aggregation and the location of the disk-shaped phycobiliprotein units within the phycobilisome and modulate their spectroscopic properties in order to mediate a directed and optimal energy transfer [
,
].
Members of this protein family have a signal peptide, a strongly conserved SH3 domain, a variable region, and then a C-terminal hydrophobic transmembrane alpha helix region.
This entry includes yeast TEL2-interacting protein 1 (Tti1) and its homologues from animals. Budding yeast Tti1 is a subunit of the ASTRA complex, which is involved in chromatin remodelling [
]. The human Tti1 homologue is part of the TTT complex, which is required to stabilise protein levels of the phosphatidylinositol 3-kinase-related protein kinase (PIKK) family proteins. The TTT complex is involved in cellular resistance to DNA damage stresses, such as ionizing radiation (IR), ultraviolet (UV) and mitomycin C (MMC) [].
This entry includes vesicle-associated membrane proteins VAMP1/synaptobrevin-1, VAMP2/synaptobrevin-2, VAMP3/synaptobrevin-3 and VAMP8/endobrevin. VAMPs are a group of small, integral membrane proteins of synaptic vesicles that are mostly involved in vesicle fusion. The heterotrimeric SNARE complex is formed by syntaxin 1, synaptosomal-associated protein 25 kDa (SNAP25), and vesicle-associated membrane protein (VAMP)/synaptobrevin []. VAMP1 plays an essential, non-redundant role in Ca2+-triggered vesicle exocytosis at the mouse neuromuscular junction []. VAMP2 is extensively expressed in the central nervous system []. Neurotransmitter release involves the assembly of a heterotrimeric SNARE complex composed of SNAP25, VAMP2 and syntaxin 1 []. VAMP8 may be involved in regulated exocytosis of the exocrine system [,
].
A conserved heterotrimeric integral membrane protein complex--the Sec61 complex (eukaryotes) or SecY complex (prokaryotes)--forms a protein-conducting channel that allows polypeptides to be transferred across (or integrated into) the endoplasmic reticulum (eukaryotes) or across the cytoplasmic membrane (prokaryotes) [
,
]. This complex is itself a part of a larger translocase complex. The alpha subunits, called Sec61alpha in mammals, Sec61p in Saccharomyces cerevisiae (Baker's yeast), and SecY in prokaryotes, and the gamma subunits, called Sec61gamma in mammals, Sss1p in S. cerevisiae, and SecE in prokaryotes, show significant sequence conservation. Both subunits are required for cell viability in S. cerevisiae and Escherichia coli. The beta subunits, called Sec61beta in mammals, Sbh in S. cerevisiae, and SecG in archaea, are not essential for cell viability. They are similar in eukaryotes and archaea, but show no obvious homology to the corresponding SecG subunits in bacteria. This family includes Sec61 subunit beta, Sbh1 and Sbh2 from the eukaryotic Sec61 complex.
Multidrug resistance proteins are molecular pumps that participate in a low energy shock adaptive response in some bacteria. They mediate cellular resistance to toxicants such as cycloheximide (CYH), 4-nitroquinoline N-oxide (4-NQO), cadmium, and hydrogen peroxide in yeast [
].
Members of the ACR3 family of arsenite (As(III)) permeases confer resistance to arsenic by extrusion from cells [
]. They exist in prokaryotes and eukaryotes (lower plants and fungi) [,
]. The ACR3 permeases have ten-transmembrane span topology []. Corynebacterium glutamicum has three Acr3 proteins, CgAcr3-1, CgAcr3-2, and CgAcr3-3. CgAcr3-1 is thought to be an antiporter that catalyses arsenite-proton exchange []. The Shewanella oneidensis Acr3 is not able to transport As(III) and confers resistance only to arsenate (As(V)) [
], whereas the Acr3 orthologue from Synechocystis mediates tolerance to As(III), As(V) and antimonite (Sb(III)) []. In budding yeast, overexpression of the Acr3 gene confers an arsenite- but not an arsenate-resistance phenotype [
]. Saccharomyces cerevisiae Acr3 is a plasma membrane metalloid/H+ antiporter that transports arsenite and antimonite [].
Proteins in this group include a chemotactic transduction protein from Pseudomonas aeruginosa, the homoserine/homoserine lactone efflux protein from Escherichia coli, and a number of hypothetical proteins.
This entry includes proteins PIN-LIKES 2 and 6 from Arabidopsis. They regulate intracellular auxin accumulation at the endoplasmic reticulum and thus auxin availability for nuclear auxin signalling [
].
This entry includes proteins PIN-LIKES 1/3/4/5/7 from Arabidopsis. They regulate intracellular auxin accumulation at the endoplasmic reticulum and thus auxin availability for nuclear auxin signalling [
,
].
Repressor protein C1 is a sequence-specific DNA-binding protein required for the establishment and maintenance of lysogeny. The binding of C1 to operator DNA is sensitive to N-ethylmaleimide [
].
This family consists of drug antiporters that use the proton motive force for the active efflux of the drug. It includes tetracycline-resistance determinant (TetV) [
] and macrolide efflux protein A (MefA) [].
Peptidase family S49 includes signal peptide peptidases (SppAs). Site-directed mutagenesis and sequence analysis have shown these SppAs to be serine proteases. The predicted active site serine for members in this family occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic centre comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases [
]. Protein C from bacteriophage lambda (MEROPS identifier S49.003) is the peptidase responsible for the processing of several viral proteins. It activates itself, degrades the internal scaffold protein upon which the procapsid is assembled, and processes the portal complex protein gpB [
]. This entry also includes the scaffold protein from bacteriophage P21 and proteins from bacteria which are presumably the result of integration of viral DNA.
There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. This entry includes the membrane protein YqfB from Bacillus subtilis.
This entry describes a family of small cytosolic proteins, about 80 amino acids in length, in which the eight invariant residues include three His residues and two Cys residues. Two pairs of these invariant residues occur in motifs HxH (where x is A or G) and CxH, both of which suggest metal-binding activity. This protein family was identified by searching with a phylogenetic profile based on an anaerobic sulphatase-maturase enzyme, which contains multiple 4Fe-4S clusters. The linkages by phylogenetic profiling and by iron-sulphur cluster-related motifs together suggest this protein may be an accessory protein to certain maturases in sulphatase/maturase systems.
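The two putative metal-binding motifs described above can be searched for directly. This is a minimal sketch, assuming that x in CxH is any residue (the entry only restricts x to A or G for HxH); `metal_motif_positions` and the toy sequence are hypothetical and do not represent a real family member.

```python
import re

# HxH with x restricted to A or G, as stated in the entry.
HXH = re.compile(r"H[AG]H")
# CxH; x is assumed here to be any residue, since the entry does not say.
CXH = re.compile(r"C.H")


def metal_motif_positions(seq):
    """Return the 0-based start positions of each motif in seq."""
    return {
        "HxH": [m.start() for m in HXH.finditer(seq)],
        "CxH": [m.start() for m in CXH.finditer(seq)],
    }


# Hypothetical toy sequence containing one copy of each motif.
print(metal_motif_positions("MKHAHLLCQHEE"))
```

Finding both motifs is only a coarse filter; membership of the family would still depend on the full set of invariant residues.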
This family includes the brain-expressed X-linked proteins (which include human p75NTR-associated cell death executor), which may be a signalling adaptor molecule involved in p75NTR-mediated apoptosis induced by nerve growth factor. It may be important in neurogenetic diseases [
].
This entry represents a group of 3xHMG-box proteins (containing three copies of the HMG-box domain and a unique basic N-terminal domain), including 3xHMG-box1 (AT4G11080, also known as HMGB13) and 3xHMG-box2 (AT4G23800, also known as HMGB6) from Arabidopsis. Their three HMG-box domains and the basic N-terminal domain contribute to DNA binding [
]. Their expression is induced in late G2 phase of the cell cycle, and upon nuclear envelope breakdown in prophase they rapidly associate with the chromosomes. Shortly after mitosis they are degraded and an N-terminal destruction-box mediates the proteolysis. They may play a role in the organisation of plant mitotic chromosomes [].
This entry represents the RNA-binding pleiotropic regulator Hfq, a small, Sm-like protein of bacteria. It helps pair regulatory non-coding RNAs with complementary mRNA target regions. It enhances the elongation of poly(A) tails on mRNA. It also appears to protect RNase E recognition sites (A/U-rich sequences with adjacent stem-loop structures) from cleavage. Being pleiotropic, it differs in some of its activities in different species. Hfq binds the non-coding regulatory RNA DsrA (see Rfam:RF00014) in the few species known to have it: Escherichia coli, Shigella flexneri and Salmonella spp. In Azorhizobium caulinodans, an hfq mutant is unable to express nifA, and Hfq is called NrfA, for nif regulatory factor. The name Hfq reflects its identification as a host factor for phage Q-beta RNA replication.
The Hfq protein is conserved in a wide range of bacteria and varies in length from 70 to 100 amino acids. In all cases, a conserved Sm motif is located in the N-terminal half of the molecule. The Hfq protein of E. coli is an 11 kDa polypeptide that forms a hexameric ring-shaped structure. Structural studies have suggested that the beta-4 strand in one molecule dimerises with the beta-5 strand of a neighbouring subunit to form the hexamer. These two strands move with a concerted mobility, which may explain the stability of the entire structure [
]. The architecture of the Hfq-RNA complex suggests two mechanisms, not mutually exclusive, by which Hfq might modulate RNA-RNA interactions. First, when Hfq binds single-stranded RNA, the target site is unwound in a circular manner. This would greatly destabilise surrounding RNA structures that are located several nucleotides on either side of the binding site, thereby permitting new RNA-RNA interactions. Second, the repetition of identical binding pockets on the Hfq hexamer implies that the binding surface can accommodate more than a single RNA target. This would allow simultaneous binding of two RNA strands and could greatly enhance interaction between the strands [
].
This family represents the RNA-binding protein KhpA, a probable RNA chaperone. It forms a complex with KhpB (also known as Jag or EloR) which binds to cellular RNA and controls its expression. KhpA plays a role in peptidoglycan (PG) homeostasis and cell length regulation [
]. The KhpA-KhpB complex stimulates or controls elongasome-mediated lateral cell wall biosynthesis. The in vivo function of the KhpA homo-oligomer is unclear []. KhpA forms homodimers and also interacts with the KH-II domain of EloR, a key regulator of cell elongation in Streptococcus pneumoniae, forming an EloR/KhpA heterodimer [].
This entry represents the N-terminal domain of argonaute proteins from eukaryotes. This domain is composed of an antiparallel four-stranded β-sheet core that has two α-helices positioned along one face of the sheet and an extended β-strand towards its N terminus. The core fold of the N domain most closely resembles the catalytic domain of replication-initiator protein Rep. The N domain is linked to the PAZ domain via the linker 1 region, and together these three regions are designated the PAZ-containing lobe of argonaute [
].
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [
]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the first stage (the adaptation stage), the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [
]. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [
]. This entry represents the N-terminal region from a family of Cas proteins that includes TM1795 from Thermotoga maritima. The N-terminal half of these proteins is the region with the strongest level of conservation.
TMEM65 is an intercalated disc protein that interacts with connexin 43 (Cx43) and is required for correct localization of Cx43 to the intercalated disc. It is essential for cardiac function in zebrafish
[].
FecARI are responsible for induction of the ferric citrate uptake system. When the inducer, ferric citrate, binds FecA, which is also responsible for transport of ferric citrate across the outer membrane, a signal is transmitted across the outer membrane to the cytoplasmic membrane protein FecR. FecR transmits the signal across the cytoplasmic membrane and activates the sigma-70 family protein FecI. The periplasmic N terminus of FecA interacts with the periplasmic C-terminal portion of FecR [
].
This entry represents the N terminus of protein lines [
]. In Drosophila this protein is involved in embryonic segmentation and may function as a transcriptional regulator [,
].