Dynein regulatory complex protein 9, also known as IQ domain-containing protein G (IQCG), is an IQ motif-containing protein that is essential for sperm flagellum formation during spermiogenesis [
,
].
This entry describes a small, well-conserved bacterial protein family. Its sequence largely consists of a domain, HhH-GPD, found in a variety of related base excision DNA repair enzymes.
This entry represents a group of TPP-binding domain-containing proteins, including oxalyl-CoA decarboxylase from E. coli, IorA from Thermococcus kodakarensis and HACL1 from animals. In general, they are thiamine pyrophosphate-dependent enzymes that catalyse carbon-carbon bond cleavage [
]. In humans, HACL1 is a peroxisomal 2-OH acyl-CoA lyase involved in the cleavage (C1 removal) reaction in the fatty acid alpha-oxidation in a thiamine pyrophosphate (TPP)-dependent manner [,
,
]. It is also involved in the degradation of 3-methyl-branched fatty acids like phytanic acid and the shortening of 2-hydroxy long-chain fatty acids [,
,
].
Members of this protein family are found only in archaeal methanogens. They show sequence similarity to
AIR synthase related protein N- and C-terminal domains (and
). The functions of proteins in this family are unknown, but their role is likely one essential to methanogenesis.
Members of this protein family represent a distinct clade among the larger set of proteins that belong to
. Proteins from this clade are found in genome sequence if and only if the species sequenced is one of the methanogens. All methanogens belong to the archaea; some but not all of those sequenced are hyperthermophiles. This protein family was detected by the method of partial phylogenetic profiling [
].
This entry represents a domain found at the C-terminal region of the type VI secretion system (T6SS) immunity protein Tdi1 from Agrobacterium tumefaciens (Atu4351,
) and similar bacterial proteins. Tdi1 is organised into an N-terminal GAD-related domain (
) and a C-terminal domain likely to exist as an insertion in the GAD-like domain. This domain, previously known as DUF1851, shows a two-stranded antiparallel β-sheet and four helices. A positive groove that extends to both domains may be associated with nucleotide binding [
].
This entry includes a group of bacterial proteins, including EipB from Brucella. EipB is a periplasmic protein that functions as part of a system required for cell envelope homeostasis. It adopts a β-spiral fold, consisting of 14 β-strands and 2 α-helices [
].
This entry includes the mammalian mitochondrial 39S ribosomal protein L46 [
] and the mitochondrial 54S ribosomal protein L17 from the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe [].
This entry represents Sm-like protein Lsm3. It can be found in the nuclear Lsm2-8 complex or in the cytoplasmic Lsm1-7 complex. The Lsm2-8 complex associates with multiple snRNP complexes containing the U6 snRNA (U4/U6 snRNP, U4/U6.U5 snRNP, and free U6 snRNP). It binds and stabilizes the 3'-terminal poly(U) tract of U6 snRNA and facilitates the assembly of U4-U6 di-snRNP and U4-U6-U5 tri-snRNP [
,
,
]. The Lsm1-7 complex associates with deadenylated mRNA and promotes decapping in the 5'-3' mRNA decay pathway [,
]. The Sm and the Lsm proteins, characterised by the Sm-domain, have RNA-related functions. The Sm heptamer ring associates with four (U1, U2, U4, U5) snRNPs, while Lsm2-8 heptamer is part of the U6 snRNP. Another Lsm heptameric complex, Lsm1-7, which differs from Lsm2-8 by one Lsm protein, functions in mRNA decapping, a crucial step in the mRNA degradation pathway [
].
This family of proteins is functionally uncharacterised. It is found in mammals and includes two human proteins: C12orf60 and C12orf69.
ATP synthase (F1F0 ATPase), also called complex V, is an enzyme that uses a flow of protons passing through the inner mitochondrial membrane to synthesize ATP. It consists of a membrane-embedded part (F0) that contains a proton channel, and a catalytic component (F1) that is connected to F0 and located on the matrix side of the membrane [
]. The F0 component contains 3-9 protein subunits (9 in humans) including subunits 6 and 8, encoded by the mtDNA genes ATP6 and ATP8, respectively []. This entry represents ATP8 from mammals.
Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport [
,
]. The different types include:
F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts).
V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane [
]. They are also found in bacteria [].
A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases [
,
].
P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes.
E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.
F-ATPases (also known as ATP synthases, F1F0-ATPase, or H(+)-transporting two-sector ATPase) (
) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), with additional subunits in mitochondria. Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis [
]. These ATPases can also work in reverse in bacteria, hydrolysing ATP to create a proton gradient. This entry represents subunit 8 found in the F0 complex of mitochondrial F-ATPases from Metazoa. This subunit appears to be an integral component of the stator stalk in yeast mitochondrial F-ATPases [
]. The stator stalk is anchored in the membrane, and acts to prevent futile rotation of the ATPase subunits relative to the rotor during coupled ATP synthesis/hydrolysis. This subunit may have an analogous function in Metazoa. Subunit 8 differs in sequence between Metazoa, plants () and fungi (
).
This entry represents Protein YfbM from Escherichia coli (strain K12) and similar proteins mainly found in bacteria. The structure of YfbM has been solved, showing alpha/beta/alpha layers with an antiparallel β-sheet. Although this protein has been suggested to be a binding site for peptide nucleic acids (PNAs, species-selective antibacterials, [
]), its function is yet to be characterised.
This entry represents the eukaryotic 60S acidic ribosomal protein P0, which is the functional equivalent of E. coli protein L10. It is involved in the interaction between translational elongation factors and the ribosome [
].
This is the TIR-like domain of ThsB proteins, which adopts a Rossmann-like fold [
]. ThsB is responsible for recognizing phage infection []. Thoeris is a bacterial antiphage defense system that consists of two genes, thsA and thsB, and acts via NAD+ degradation [
,
,
]. ThsA has robust NAD+ cleavage activity and a two-domain architecture containing an N-terminal NAD-binding domain (denoted as sirtuin-like or Macro) and a C-terminal SLOG-like domain. In some instances, such as in B. amyloliquefaciens, ThsA has an N-terminal transmembrane domain []. ThsB (also referred to as TIR1 and TIR2) is structurally similar to TIR domain proteins but lacks enzymatic activity.
This entry represents a group of uncharacterised plant proteins similar to the putative BPI/LBP family protein At3g20270 from Arabidopsis thaliana. Proteins in this family show protein sequence similarity with the bactericidal permeability-increasing protein (BPI) and the lipopolysaccharide-binding protein (LBP), which may represent a common gene family of lipid-binding proteins [
,
].
The structural and functional relationships among independently cloned segments of the plasmid ColE1 region that regulates and codes for colicin E1
(cea), immunity (imm) and the mitomycin C-induced lethality function (lys) have been analysed [
]. A model for the structure and expression of the colicin E1 operon has been proposed in which the cea and lys genes are
expressed from a single inducible promoter that is controlled by the lexA repressor in response to the SOS system of Escherichia coli [
]. The imm gene lies between the cea and lys genes and is expressed by transcription
in the opposite direction from a promoter located within the lys gene []. This arrangement indicates that the transcriptional units for all three
genes overlap. It is proposed that the formation of anti-sense RNA may be an important element in the coordinate regulation of gene expression
in this system []. Hydropathy analysis of the imm gene products suggests that they have
hydrophobic domains characteristic of membrane-associated proteins []. The microcin E1 immunity protein is able to protect a cell that harbours
the plasmid ColE1 encoding colicin E1 against colicin E1; it is thus essential both for autonomous replication and colicin E1 immunity [
].
Cilia- and flagella-associated protein 221 (Cfap221, also known as PCDP1) plays an important role in ciliary and flagellar biogenesis and motility [
]. It is expressed in spermatogenic cells and motile ciliated epithelial cells. Mutations in the PCDP1 gene cause primary ciliary dyskinesia (PCD), which is characterised by sinusitis, male infertility, hydrocephalus, and situs inversus [].
Molybdenum storage protein subunits alpha and beta (MosA/MosB) are involved in intracellular storage of molybdenum. Each protein molecule can store at least 90 Mo atoms. The Mo storage protein from the nitrogen-fixing bacterium Azotobacter vinelandii is characterized as an alpha4-beta4 octamer containing a polynuclear molybdenum-oxide cluster; Mo binding is ATP-dependent and Mo release is pH-dependent [
]. MosA and MosB are related to uridine monophosphate kinase (UMPK) enzymes.
This entry represents circadian clock protein kinase KaiC from bacteria and some uncharacterised KaiC-like proteins from archaea. The circadian clock protein KaiC is encoded in the kaiABC operon that controls circadian rhythms and may be universal in
Cyanobacteria. Each member contains two copies of the KaiC domain, which is also found in other proteins. KaiC performs autophosphorylation and acts as its own transcriptional repressor. Kai proteins (KaiA and KaiC) appear to positively and negatively regulate kaiBC transcription, which is consistent with a transcription/translation oscillatory (TTO) feedback model, believed to be at the core of all self-sustained circadian timers. However, the cyanobacterial circadian clock is able to function without de novo synthesis of clock gene mRNAs and clock proteins: the period is accurately determined without TTO feedback, and the system is also temperature-compensated. It has been demonstrated that the three purified Kai proteins (KaiA, KaiB and KaiC) form a temperature-compensated molecular oscillator in vitro that exhibits rhythmic phosphorylation and dephosphorylation of KaiC [
]. A negative-stain electron microscopy study of Synechococcus elongatus (Thermosynechococcus elongatus) and Thermosynechococcus elongatus BP-1 KaiA-KaiC complexes in combination with site-directed mutagenesis reveals that KaiA binds exclusively to the CII half of the KaiC hexamer. The EM-based model of the KaiA-KaiC complex reveals protein-protein interactions at two sites: the known interaction of the flexible C-terminal KaiC peptide with KaiA, and a second postulated interaction between the apical region of KaiA and the ATP binding cleft on KaiC. This model brings KaiA mutation sites that alter clock period or abolish rhythmicity into contact with KaiC and suggests how KaiA might regulate KaiC phosphorylation [
].
In the twenty or so anoxygenic photosynthetic alpha-Proteobacteria known so far, a gene for a member of this protein family is present and is found in the vicinity of puhA, which encodes a component of the photosynthetic reaction centre, and other genes associated with photosynthesis. This protein family is consequently suggested to be a probable assembly factor for the photosynthetic reaction centre, but its actual function has not yet been demonstrated.
Members of this entry belong to the alpha/beta fold family of hydrolases. Members are found in bacterial genomes if and only if those genomes encode anoxygenic photosynthetic systems similar to that found in Rhodobacter capsulatus (Rhodopseudomonas capsulata) and other alpha-Proteobacteria.
Proteins in this entry are also often encoded in the same operon as subunits of the protoporphyrin IX magnesium chelatase, and were once designated BchO. No literature supports a role as an actual subunit of magnesium chelatase, but an accessory role is possible, as suggested by the genomic context and the probable hydrolase activity.
This FliL protein controls the rotational direction of the flagella during chemotaxis [
]. FliL is a cytoplasmic membrane protein associated with the basal body [].
Conserved hypothetical protein CHP03083, actinobacterial-type
Type:
Family
Description:
This protein family pulls together several groups of proteins, each very different from the others. They share in common three conserved regions. The first is a region of about 38 amino acids, nearly always at the N terminus of a protein. This region has a bulky hydrophobic residue, usually Trp, at position 29, and a His residue at position 37 that is invariant, so far, in over 150 instances. The second conserved region has a motif [DE]xxxHxxD. The third conserved region contains a hydrophobic patch and a well-conserved Arg residue. Most examples are found in the Actinobacteria, including the genera Mycobacterium, Corynebacterium, Streptomyces, Nocardia, Frankia, etc. The pattern of near-invariant residues against a backdrop of extreme sequence divergence suggests enzymatic activity and conservation of active site residues.
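The second conserved region above is given as the motif [DE]xxxHxxD, which can be read as a simple pattern: Asp or Glu, any three residues, His, any two residues, Asp. A minimal sketch of scanning a sequence for it follows, assuming plain one-letter protein strings; the function name and the toy sequence are hypothetical and are not taken from any family member.

```python
import re

# [DE]xxxHxxD: D or E, three arbitrary residues, H, two arbitrary residues, D.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([DE].{3}H.{2}D))")


def find_motif(seq):
    """Return (1-based start, matched text) for every occurrence of the motif."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Hypothetical toy sequence used only to exercise the pattern.
toy = "MKTAWLLDAAAHGGDPLR"
print(find_motif(toy))
```

A match on this motif alone is weak evidence of family membership; the description also relies on the N-terminal region and the third conserved region.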
FtsQ is an essential cell division protein. It may link together the upstream cell division proteins, which are predominantly cytoplasmic, with the downstream cell division proteins, which are predominantly periplasmic [
]. FtsQ may control the correct divisome assembly []. DivIB is a cell division protein from Gram-positive bacteria, probably homologous to Escherichia coli FtsQ. DivIB interacts with FtsL, DivIC and PBP-2B [
,
]. DivIB plays an essential role in division at high temperatures, possibly by protecting FtsL from degradation or by promoting formation of the FtsL-DivIC complex []. It is also required for efficient sporulation at all temperatures []. FtsQ and DivIB have a short N-terminal cytoplasmic domain and a larger C-terminal periplasmic domain [,
]. This entry represents the C-terminal region.
This protein family is restricted to a subset of endospore-forming bacteria, such as Bacillus subtilis, all of which are in the Firmicutes (low-GC Gram-positive) lineage. It is a minor SASP (small, acid-soluble spore protein) designated SspI. The gene in B. subtilis was previously designated ysfA.
This entry represents a group of tandem repeats, found in Arabidopsis species, whose sequence is distantly related to the FARP (FMRFamide) group of neuropeptides (
). The function of these repeats is not known; they are mostly found in uncharacterised proteins, but they are also present in the nuclear migration protein NUM1 [
].
Conserved hypothetical protein CHP03067, planctomycetes
Type:
Domain
Description:
This domain occurs in several species, mostly from the Planctomycetes division of the bacteria. It has expanded into a paralogous family of at least twenty-five members in Gemmata obscuriglobus UQM 2246. This family appears related to
, which has also expanded into a large paralogous family in G. obscuriglobus.
Sigma-E is important for the induction of proteins involved in heat shock response. RseA binds sigma-E via its N-terminal domain, sequestering sigma-E and preventing transcription from heat-shock promoters [
]. The C-terminal domain is located in the periplasm, and may interact with other proteins that signal periplasmic stress.
Non-structural protein NSP3, N-terminal, rotavirus
Type:
Homologous_superfamily
Description:
The NSP3 protein has been shown to bind viral RNA. It consists of 3 conserved functional domains: a basic region which binds ssRNA, a region containing heptapeptide repeats mediating oligomerisation, and a leucine zipper motif [
]. NSP3 may play a central role in replication and assembly of genomic RNA structures []. Rotaviruses have a dsRNA genome and are a major cause of acute gastroenteritis in the young of many species []. The NSP3 homodimer fold has two domains, one all-alpha and the other alpha beta. It is asymmetric, with each domain intertwining with its counterpart. This entry represents the N-terminal domain of NSP3 from rotavirus. It consists of 5 alpha helices.
This presumed family is about 160 residues long. In
it is associated with a helix-turn-helix domain. This suggests that this may be a ligand-binding family.
The proteins in this family are around 140-170 residues in length. The proteins contain many conserved residues, with the most conserved motifs found in the central and C-terminal region. The function of these proteins is unknown.
This TPR repeat-containing protein is the CcmI protein (also called CycH) of c-type cytochrome biogenesis. CcmI is thought to act as an apo-cytochrome c chaperone. This entry describes the N-terminal region.
Within mitochondria and bacteria, a family of related proteins is involved in the assembly of periplasmic c-type cytochromes: these include CycK [
], CcmF [,
], NrfE [] and CcbS []. These proteins may play a role in guidance of apocytochromes and haem groups for their covalent linkage by the cytochrome-c-haem lyase. Members of the family are probably integral membrane proteins, with up to 16 predicted transmembrane (TM) helices. The gene products of the hel and ccl loci have been shown to be required specifically for the biogenesis of c-type cytochromes in the Gram-negative photosynthetic bacterium Rhodobacter capsulatus. The ccl locus contains two genes, ccl1 and ccl2, each of which possesses typical signal sequences to direct them to the periplasm [
]. Ccl1 is similar to proteins encoded by chloroplast and mitochondrial genes, suggesting analogous functions in these organelles. It is believed that the hel-encoded proteins are required for the export of haem to the periplasm, where it is subsequently ligated to the c-type apocytochromes []. The CycK and CycL proteins of Bradyrhizobium japonicum share up to 53% amino acid sequence identity with the R. capsulatus proteins Ccl1 and Ccl2, respectively. CycK and CycL, which are encoded by the cycHJKL cluster, may form part of a cytochrome c-haem lyase complex whose active site faces the periplasm [
].
This family represents proteins involved in cytochrome c assembly in mitochondria, chloroplast and bacteria; including CcmC from Escherichia coli [
] and CcsA from Chlamydomonas []. CcsA is called ResC in Bacillus (where there is additional N-terminal sequence) [], while chloroplast proteins are consistently named CcsA. CcmC interacts directly with heme, and it is the only protein of the ccm operon that is strictly required for heme transfer [
]. CcsA is required during biogenesis of c-type cytochromes at the step of heme attachment [].
This entry is a small cysteine-rich repeat. The cysteines mostly follow a C-X(2)-C-X(3)-C-X(2)-C-X(3) pattern, though they often appear at other positions in the repeat as well [
].
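The cysteine spacing quoted above, C-X(2)-C-X(3)-C-X(2)-C-X(3), translates directly into a regular expression. The sketch below assumes one-letter protein strings and reports where non-overlapping copies of the repeat start; the function name and the toy sequence are hypothetical, and the pattern ignores the additional cysteines that the entry says may occur at other positions.

```python
import re

# C-X(2)-C-X(3)-C-X(2)-C-X(3): four cysteines with 2, 3, 2 and 3 arbitrary
# residues after them, 14 residues per repeat unit in total.
CYS_REPEAT = re.compile(r"C.{2}C.{3}C.{2}C.{3}")


def repeat_starts(seq):
    """Return 1-based start positions of non-overlapping repeat units."""
    return [m.start() + 1 for m in CYS_REPEAT.finditer(seq.upper())]


# Hypothetical toy sequence: a two-residue leader followed by two repeat units.
toy = "MK" + "CAACGGGCAACGGG" * 2
print(repeat_starts(toy))
```

Because the entry says the cysteines only mostly follow this spacing, a profile-based search would normally be preferred over a strict regular expression for real annotation.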
This entry represents the Csf3 family of Cas proteins. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. (strain EbN1), and Rhodoferax ferrireducens (strain DSM 15236/ATCC BAA-621/T118). In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of cas gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf3 (CRISPR/cas Subtype as in A. ferrooxidans protein 3), as it lies third closest to the repeats. The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [
]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [
,
,
]. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [
].
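The repeat-spacer layout described above (direct repeats separated by regularly sized spacers) can be illustrated with a toy example. This is only a sketch under strong assumptions: the repeat sequence is already known and occurs exactly, with no mismatches; the function name, the repeat and the array are all hypothetical. Real CRISPR arrays are usually detected with dedicated tools that tolerate repeat variation.

```python
def extract_spacers(array, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`.

    Sequence before the first repeat and after the last repeat is treated as
    flanking DNA and is not reported.
    """
    parts = array.split(repeat)
    # parts[0] is the leader side, parts[-1] the trailer side; the rest are spacers.
    return parts[1:-1]


# Hypothetical toy array: flank, then repeat-spacer-repeat-spacer-repeat, then flank.
repeat = "GTTCC"
array = "AA" + repeat + "ACGTAC" + repeat + "TTGACA" + repeat + "CC"

spacers = extract_spacers(array, repeat)
print(spacers)
print([len(s) for s in spacers])
```

In this toy case both spacers have the same length, mirroring the "regularly sized" spacers in the description; in the adaptation stage a new spacer would correspond to one more repeat-spacer unit added to such an array.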
This entry represents the Csf2 family of Cas proteins. Members of this family show up near CRISPR repeats in Acidithiobacillus ferrooxidans ATCC 23270, Azoarcus sp. (strain EbN1), and Rhodoferax ferrireducens (strain DSM 15236/ATCC BAA-621/T118). In the latter two species, the CRISPR/cas locus is found on a plasmid. This family is one of several characteristic of a type of cas gene cluster we designate Aferr after A. ferrooxidans, where it is both chromosomal and the only type of cas gene cluster found. The gene is designated csf2 (CRISPR/cas Subtype as in A. ferrooxidans protein 2), as it lies second closest to the repeats. The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [
]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [,
,
]. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [
].
StAR-related lipid transfer protein 4 (StarD4) contains a steroidogenic acute regulatory-related lipid transfer (START) domain, a lipid binding domain which appears in a wide range of proteins involved in several cellular functions [
]. StarD4 is an intracellular cholesterol transporter that plays an important role in the maintenance of cellular cholesterol homeostasis. It binds free cholesterol and increases cholesteryl ester formation []. It is also involved in translocation of 7-hydroperoxycholesterol to isolated mitochondria []. The expression of StarD4 can be sterol-repressed through the SREBP-2 pathway [].
Kelch-like protein 3 (KLHL3) is a substrate adaptor protein in the CUL3-KLHL3 E3 ubiquitin ligase complex. It targets WNK4 kinase 1 for ubiquitination and degradation, which in turn regulates electrolyte homeostasis [
]. Mutations in the KLHL3 gene cause pseudohypoaldosteronism type II (PHAII), which is a rare Mendelian syndrome featuring hypertension and hyperkalemia resulting from constitutive renal salt reabsorption and impaired K(+) secretion [,
]. The CUL3-KLHL3 E3 ligase complex may regulate blood pressure via its ability to interact with and ubiquitylate WNK isoforms [,
]. The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding [
], while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins []. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins []. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases [,
]. This entry represents the BACK (BTB and C-terminal Kelch) domain of KLHL3.
The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding [
], while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins []. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins []. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases [,
]. Kelch-like protein 22 (KLHL22) is a substrate-specific adaptor of the BCR (BTB-CUL3-RBX1) E3 ubiquitin ligase complex. It targets Polo-like kinase 1 (PLK1) and regulates its kinetochore localisation during mitosis [
]. The BCR(KLHL22) ubiquitin ligase complex mediates monoubiquitination of PLK1, leading to PLK1 dissociation from phosphoreceptor proteins and subsequent removal from kinetochores, allowing silencing of the spindle assembly checkpoint (SAC) and chromosome segregation [,
,
].
This entry describes the DndD protein encoded by an operon associated with a sulphur-containing modification to DNA [
]. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndD is described as a putative ATPase. The small number of examples known so far include species from among the Firmicutes, Actinomycetes, Proteobacteria, and Cyanobacteria.
Kelch-like protein 38 (KLHL38) belongs to the KLHL family [
]. Its function is not clear. The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding [
], while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins []. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins []. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases [,
].
The atp operon of most prokaryotes contains the structural genes for the F-ATPase (ATP synthase), which are preceded by an atpI gene that encodes
a membrane protein of unknown function. AtpI is thought to support optimal ATP synthase assembly and stability [,
]. A role in magnesium uptake has also been suggested []. Proteins in this entry are found only in Gram-positive bacteria from the phylum Firmicutes.
The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This superfamily includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs [
]. Rad52 was identified in Saccharomyces cerevisiae (Baker's yeast) as a component of the homologous recombination repair pathway that plays an important role in both meiotic and mitotic recombination. The human protein is highly homologous in both structure and function. In the presence or absence of DNA, Rad52 forms ring-shaped oligomers which bind both single- and double-stranded DNA, stimulating annealing of complementary DNA strands and promoting ligation of both cohesive and blunt-end fragments. Rad52 may act as a recombination mediator, optimising catalysis of strand exchange by the Rad51 protein. A C-terminal self-association domain has been identified that mediates formation of higher order oligomers of Rad52 rings. Formation of these oligomers may be important for interaction with more than one DNA molecule [
].
Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Atg proteins coordinate the formation of autophagosomes. The pre-autophagosomal structure contains at least five Atg proteins: Atg1p, Atg2p, Atg5p, Aut7p/Atg8p and Atg16p, found in the vacuole [
,
]. The C-terminal glycine of Atg12p is conjugated to a lysine residue of Atg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Atg16) has been shown to bind to Atg5 and is required for the function of the Atg12p-Atg5p conjugate []. Autophagy protein 5 (Atg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway []. Atg5 comprises two ubiquitin-like domains that flank a helix-rich domain. The N- and C-terminal ubiquitin-like domains are called UblA and UblB, respectively, and the helix-rich domain between UblA and UblB is called HR. Both UblA and UblB comprise a five-stranded β-sheet and two α-helices, which is a conserved feature in all ubiquitin superfamily proteins. The HR domain consists of three alpha helices [
]. This superfamily represents the UblA domain.
Tac2-N (tandem C2 domains nuclear protein) represents a novel class of C-type tandem C2 proteins. Tac2-N proteins are almost exclusively localised in the nucleus [
].
Pop1 is a component of ribonuclease P, a ribonucleoprotein enzyme that cleaves precursor tRNA transcripts to give mature 5' ends. It is also a component of RNase MRP [
,
,
].
Bifunctional adenosylcobalamin biosynthesis protein CobU/CobP
Type:
Family
Description:
This family includes bifunctional adenosylcobalamin biosynthesis protein CobU, bifunctional adenosylcobalamin biosynthesis protein CobP and similar proteins mainly found in bacteria. This group of bifunctional cobalamin biosynthesis enzymes displays cobinamide kinase and cobinamide phosphate guanylyltransferase activity. The crystal structure of the enzyme reveals the molecule to be a trimer with a propeller-like shape [
].
Hopanoid biosynthesis-associated membrane protein HpnM
Type:
Family
Description:
The genes encoding the proteins in this entry are all found in bacteria containing the machinery necessary for biosynthesis of hopanoid lipids. Furthermore, these genes are usually located proximal to other components of this biological process. The proteins are members of a family of putative transporters known as "toluene tolerance protein Ttg2D", although it is unlikely that the proteins in this entry have anything to do with toluene per se.
In many bacterial species, ribosomal protein S12 is posttranslationally modified by the methylthiolation of the aspartate residue at position 88. The enzyme responsible for this modification is RimO, a radical S-adenosylmethionine protein [
].
This model identifies the generic virulence translocation proteins in bacteria. It derives its name, 'Yop', from Yersinia enterocolitica, the species in which this virulence protein was identified. In bacterial pathogenesis, Yop effector proteins are translocated into eukaryotic cells.
Transcription termination/antitermination protein NusA, bacterial
Type:
Family
Description:
NusA, or N utilisation substance protein A, is a multidomain regulator of transcript elongation in bacteria and archaea. This entry represents bacterial NusA, which interacts with elongating complexes and the nascent RNA transcript in ways that stimulate pausing and termination, but that can be switched to antipausing and antitermination by other accessory proteins [
,
]. It is also involved in both DNA repair and damage tolerance pathways [].
Dynein regulatory complex protein 10 (DRC10), also known as IQCD, is a component of the nexin-dynein regulatory complex (N-DRC), a key regulator of ciliary/flagellar motility which maintains the alignment and integrity of the distal axoneme and regulates microtubule sliding in motile axonemes [
,
]. It interacts with the mammalian homologue of C. elegans uncoordinated gene 13 (Munc13) in spermatozoa and may participate in acrosome exocytosis [].
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [
]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop, Walker A, and a magnesium binding site, Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif, or Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [,
,
]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [
,
,
,
,
,
]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [
]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [,
]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [,
,
]. This entry represents the ATP binding subunit of the multisubunit cobalt transporter in bacteria and its equivalents in archaea. This superfamily includes two groups: one that catalyses the uptake of small molecules, including ions, from the external milieu, and another that is engaged in the efflux of small molecular weight compounds and ions from within the cell. Energy derived from the hydrolysis of ATP drives both uptake and efflux.
This entry represents a set of hypothetical bacterial proteins containing a core of six α-helices, where one central helix is surrounded by the other five. The exact function of this family has not, as yet, been determined [
].
This domain is found in a wide range of outer membrane proteins and assumes a membrane-bound β-barrel fold. It is part of a Pfam clan that includes other outer membrane protein β-barrel domains.
Members of this family of Mycoplasma hypothetical proteins adopt a helical structure, with one central α-helix surrounded by five others, in a NusB-like fold. Their function has not, as yet, been determined [
].
This family adopts a secondary structure consisting of six alpha helices, with four long helices (alpha1, alpha2, alpha5, alpha6) forming a left-handed, antiparallel alpha helical bundle. The function of this family of archaeal hypothetical proteins has not, as yet, been defined [
].
Multiple myeloma tumor-associated protein 2 (MMTAG2) is an uncharacterized protein initially identified in the human multiple myeloma cell line ARH-77. The protein contains phosphorylation sites, N-myristoylation sites and nuclear localization signals [
]. Proteins in this entry contain a kinase phosphorylation domain.
C2 calcium-dependent domain-containing protein 4A (NLF1) and 4B (NLF2) are nuclear factors highly expressed in endothelial cells and induced by acute inflammation. They may have a role in regulating genes that control cellular architecture and adhesion [
].
Ribonuclease P (Rnp) is a ubiquitous ribozyme that catalyzes a Mg2+-dependent hydrolysis to remove the 5'-leader sequence of precursor tRNA (pre-tRNA) in all three domains of life [
]. In bacteria, the catalytic RNA (typically ~120 kDa) is aided by a small protein cofactor (~14 kDa) []. Archaeal and eukaryotic RNase P each contain a single RNA; archaeal RNase P has four or five proteins, while eukaryotic RNase P has 9 or 10 proteins. Eukaryotic and archaeal RNase P RNAs cooperatively function with protein subunits in catalysis []. Eukaryotic nuclear RNase P shares most of its protein components with another essential RNP enzyme, nucleolar RNase MRP [
]. RNase MRP (mitochondrial RNA processing) is an rRNA processing enzyme that cleaves various RNAs, including ribosomal, messenger, and mitochondrial RNAs. It can cleave a specific site within precursor rRNA to generate the mature 5'-end of 5.8S rRNA []. Despite its name, the vast majority of RNase MRP is localized in the nucleolus []. RNase MRP has been shown to cleave primers for mitochondrial DNA replication and CLB2 mRNA. In yeast, RNase MRP possesses one putatively catalytic RNA and at least 9 protein subunits (Pop1, Pop3-Pop8, Rpp1, Snm1 and Rmp1) []. The RNA of the human RNase MRP complex consists of 267 nucleotides and supports the interaction with and among at least seven protein components (hPop1, hPop5, Rpp20, Rpp25, Rpp30, Rpp38 and Rpp40); three additional proteins, hPop4, Rpp21 and Rpp14, have been reported to be associated with at least a subset of RNase MRP complexes []. This entry includes animal Rpp38, which is a component of the Rnp and the MRP ribonuclease complexes [
].
This entry represents adsorption protein P2 (synonym: receptor-binding protein P2) from the bacteriophage PRD1. Adsorption protein P2 is a multi-β-sheet protein whose complicated topology forms an elongated seahorse-shaped molecule with a distinct head, containing a pseudo-β-propeller structure with approximate six-fold symmetry, and a tail (β-sandwich). It is required for the attachment of the phage to the host conjugative DNA transfer complex. This is a poorly understood large transmembrane complex of unknown architecture, with at least 11 different proteins [
].
The eukaryotic Sm and Sm-like (Lsm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel β-sheet [,
,
]. Lsm11 is an SmD2-like subunit which binds U7 snRNA along with Lsm10 and five other Sm subunits to form a 7-membered ring structure. Lsm11 and the U7 snRNP of which it is a part are thought to play an important role in histone mRNA 3' processing [
,
]. This family also includes small nuclear ribonucleoprotein Sm D-like protein, which is likely a homologue of Lsm11.
This entry represents a group of plant proteins, including protein ENHANCED DISEASE RESISTANCE 4 (EDR4) from Arabidopsis. EDR4 plays a negative role in resistance to powdery mildew. It localizes at the plasma membrane and endosomal compartments. It interacts with EDR1 and recruits EDR1 to the fungal penetration site, where it regulates defense responses. It also interacts with CLATHRIN HEAVY CHAIN2 (CHC2) and edr4 mutants show reduced endocytosis [
].
This entry includes TPRG1 and TPRG1L. TPRG1L, also known as protein Mover, is a presynaptic protein that is differentially expressed across brain areas and synapse types [
,
].
This entry consists of Mycoplasma hypothetical proteins that adopt a multi-helical structure containing a buried central helix. Their function has not been determined yet.
PAR basic leucine zipper (bZIP) factors constitute a group of circadian transcription factors that have conserved basic regions flanked by proline- and acidic amino acid-rich (PAR) domains and functionally compatible leucine zipper dimerization domains [
]. PARbZip proteins are transcriptionally controlled by the circadian molecular oscillator, and in turn control expression of transcription factors and enzymes involved in metabolism and xenobiotic detoxification []. This entry includes the vertebrate PARbZip proteins thyrotroph embryonic factor (TEF/VBP), D site-binding protein (DBP) and hepatic leukemia factor (HLF), and the Drosophila giant bZIP factor. Giant regulates the expression of the Krüppel and knirps segmentation gap genes [
].
Colicin D is a bacteriocin that kills target cells by cleaving tRNA(Arg). Colicin D immunity protein (ImmD) inhibits the bactericidal activity of colicin D by binding to its tRNase catalytic domain [
]. This entry represents the structural domain of ImmD and related klebicin and microcin immunity proteins.
Members of this family are found in hypothetical proteins synthesised by the archaeal genus Sulfolobus. Their exact function has not, as yet, been determined.
This entry describes the coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein PqqC. Pyrroloquinoline quinone (PQQ) is the prosthetic group of several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria [
]. PqqC is an oxidase whose reaction involves a ring closure and eight-electron oxidation of its substrate (AHQQ) to produce PQQ []. This entry does not include related proteins likely to be functionally distinct from PqqC, such as homologues found in the Chlamydias.
Selenocysteine insertion sequence-binding protein 2
Type:
Family
Description:
Selenocysteine insertion sequence-binding protein 2 (SECISBP2) is required for the incorporation of the amino acid selenocysteine (Sec) into proteins during translation. The mRNA triplet UGA is usually a stop codon, but the presence of a Sec insertion sequence element (SECIS) in the 3'-untranslated region of an mRNA, and its binding to SECISBP2, leads to insertion of Sec into the protein instead. Mutations in the SECISBP2 gene that alter the amino acid sequence or cause splicing defects lead to abnormal thyroid hormone metabolism. A splice variant of SECISBP2 is localized to the mitochondrion rather than the nucleus [
]. This entry also includes selenocysteine insertion sequence-binding protein 2-like (SECISBP2L), which also binds SECIS, but in mammals does not lead to incorporation of Sec into proteins, because it lacks the SECIS-dependent domain association that is found in SECISBP2. An invertebrate orthologue, however, is fully competent for Sec incorporation [
]. Homologues are also known from bacteria.
This entry includes subunits of the mitochondrial ribosome, such as 28S ribosomal protein S25 (MRPS25) from mammals and 54S ribosomal protein Mrp49 from the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe. MRPS25 is a component of the mitochondrial small ribosomal subunit [
]. Mrp49 is a component of the mitochondrial large ribosomal subunit [].