This domain family is found in viruses and is approximately 90 amino acids in length. It is found in association with other domains and contains a conserved GLG sequence motif. The 1a protein is the major virulence factor of Cucumber mosaic virus (CMV). The Ns strain of CMV causes necrotic lesions on Nicotiana spp., while other strains cause systemic mosaic. The determinant of the pathogenesis of these different strains is the amino acid residue at position 461 of the 1a protein.
FeMo cofactor biosynthesis protein NifB, C-terminal
Type:
Domain
Description:
NifB belongs to a family of iron-molybdenum cluster-binding proteins that includes NifX and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme as part of nitrogen fixation in bacteria. This domain is sometimes found fused to an N-terminal domain (the Radical SAM domain) in nifB-like proteins.
CBS domains are evolutionarily conserved structural domains found in a variety of functionally unrelated proteins from all kingdoms of life. These domains pair together to form an intramolecular dimeric structure (CBS pair), termed a Bateman domain. CBS domains have been shown to bind mainly ligands with an adenosyl group, such as AMP, ATP and S-AdoMet, but may also bind metal ions or nucleic acids. Hence, they play an essential role in the regulation of the activities of numerous proteins, and mutations in them are associated with several hereditary diseases. CBS domains are found attached to a wide range of other protein domains, suggesting that they may play a regulatory role, making proteins sensitive to adenosyl-carrying ligands. The region containing the CBS domains in cystathionine-beta synthase is involved in regulation by S-AdoMet. CBS domain pairs from AMPK bind AMP or ATP. The CBS domains from IMPDH, which bind ATP, have been shown to have a role in the regulation of adenylate nucleotide synthesis.
This entry represents the CBS domain found in bacterial and plant proteins, including mitochondrial CBSX3 from Arabidopsis. CBSX3 interacts with and activates o-type thioredoxin (Trx-o2), which is known to play regulatory roles in the electron transport chain (ETC) complex II. This interaction regulates ROS generation in mitochondria and plays a key role in the modulation of plant development and growth.
Bacteriophage P4, Psu, polarity suppression protein
Type:
Family
Description:
This family contains a number of phage polarity suppression proteins (Psu) (approximately 190 residues long). The Psu protein of Bacteriophage P4 causes suppression of transcriptional polarity in Escherichia coli by overcoming Rho termination factor activity. It has the structure of a golf stick composed of seven helices. Psu is described as having a complicated knotted dimeric conformation. It binds to the hexameric capsomere on the P4 capsid to prevent DNA leakage.
Conserved hypothetical protein CHP02300, FYDLN acid
Type:
Family
Description:
Members of this family are bacterial proteins with a conserved motif [KR]FYDLN, sometimes flanked by a pair of CXXC motifs, followed by a long region of low complexity sequence in which roughly half the residues are Asp and Glu, including multiple runs of five or more acidic residues. The function of members of this family is unknown.
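The sequence features described above can be checked with simple pattern matching. The following is a minimal sketch, assuming a plain one-letter amino acid string as input; the function name, the returned keys and the example sequence are hypothetical and are not part of the family definition, which is based on a curated model rather than on regular expressions.

```python
import re

# Patterns taken from the description above: the [KR]FYDLN motif,
# CXXC motifs, and runs of five or more acidic residues (Asp/Glu).
FYDLN_MOTIF = re.compile(r"[KR]FYDLN")
CXXC_MOTIF = re.compile(r"C..C")
ACIDIC_RUN = re.compile(r"[DE]{5,}")


def describe_fydln_features(sequence: str) -> dict:
    """Report the motif position, CXXC count and acidic content after the motif."""
    seq = sequence.upper()
    motif = FYDLN_MOTIF.search(seq)
    # The low-complexity acidic region is expected after the motif.
    tail = seq[motif.end():] if motif else ""
    acidic_fraction = sum(res in "DE" for res in tail) / len(tail) if tail else 0.0
    return {
        "motif_start": motif.start() if motif else None,
        "cxxc_count": len(CXXC_MOTIF.findall(seq)),
        "acidic_runs": [m.group() for m in ACIDIC_RUN.finditer(tail)],
        "tail_acidic_fraction": acidic_fraction,
    }


# Invented toy sequence, used only to exercise the patterns.
example = "MCAKCGKFYDLNCPKCGAEDEDEDSGEEEEEDK"
features = describe_fydln_features(example)
```

A hit from this sketch only indicates that a sequence shows the listed features; it does not establish membership of the family.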
GspG is the major pseudopilin of the type 2 secretion systems (T2SSs). The N-terminal hydrophobic helices of the GspG subunits arrange within the core of the pseudopilus, with the C-terminal domains and the Ca2+-binding sites located at the surface. The structure of GspG (also known as PulG) has been determined.
The type II secretion system (T2SS) is one of several extracellular secretion systems in Gram-negative bacteria. It delivers toxins and a range of hydrolytic enzymes, including proteases, lipases and carbohydrate-active enzymes, to the cell surface or extracellular space. T2SSs are composed of 11 to 15 different proteins, which are generally called GspA to GspO and GspS. The T2SS spans the two bacterial membranes and ensures secretion of folded proteins across the outer membrane pore formed by GspD. The inner membrane complex contains GspC, GspL, GspM, and GspF. The cytoplasmic domains of GspL and GspF interact with an ATPase, GspE. GspE is thought to energize the formation of a short pseudopilus by several pilin-like proteins, GspG to GspK. GspD has been shown to interact with the inner membrane component GspC. The T2SS pseudopilus is a periplasmic filament composed of the major pseudopilin, EpsG, and four minor pseudopilins, EpsH, EpsI, EpsJ and EpsK. The pseudopilus is assembled by the polymerization of GspG (also known as PulG) subunits. Pseudopilin proteins have a conserved N-terminal hydrophobic segment followed by a more variable C-terminal periplasmic and globular domain.
GspI is a pseudopilin component of the type II secretion system (T2SS). It contains the prepilin signal sequence. The Pseudomonas aeruginosa GspI homologue, known as XcpV, has been suggested to be the central component and initiator of pseudopilus formation.
The type II secretion system (T2SS) is one of several extracellular secretion systems in Gram-negative bacteria. It delivers toxins and a range of hydrolytic enzymes, including proteases, lipases and carbohydrate-active enzymes, to the cell surface or extracellular space. T2SSs are composed of 11 to 15 different proteins, which are generally called GspA to GspO and GspS. The T2SS spans the two bacterial membranes and ensures secretion of folded proteins across the outer membrane pore formed by GspD. The inner membrane complex contains GspC, GspL, GspM, and GspF. The cytoplasmic domains of GspL and GspF interact with an ATPase, GspE. GspE is thought to energize the formation of a short pseudopilus by several pilin-like proteins, GspG to GspK. GspD has been shown to interact with the inner membrane component GspC. The T2SS pseudopilus is a periplasmic filament composed of the major pseudopilin, EpsG, and four minor pseudopilins, EpsH, EpsI, EpsJ and EpsK. The pseudopilus is assembled by the polymerization of GspG (also known as PulG) subunits. Pseudopilin proteins have a conserved N-terminal hydrophobic segment followed by a more variable C-terminal periplasmic and globular domain.
This family of proteins represents GspJ, which is targeted to the membrane of Escherichia coli. GspJ forms a complex with GspI and GspK, which is part of the type 2 secretion system, involved in the translocation of proteins across the outer membrane of E. coli. The GspK-GspI-GspJ complex has quasi-helical characteristics.
The type II secretion system (T2SS) is one of several extracellular secretion systems in Gram-negative bacteria. It delivers toxins and a range of hydrolytic enzymes, including proteases, lipases and carbohydrate-active enzymes, to the cell surface or extracellular space. T2SSs are composed of 11 to 15 different proteins, which are generally called GspA to GspO and GspS. The T2SS spans the two bacterial membranes and ensures secretion of folded proteins across the outer membrane pore formed by GspD. The inner membrane complex contains GspC, GspL, GspM, and GspF. The cytoplasmic domains of GspL and GspF interact with an ATPase, GspE. GspE is thought to energize the formation of a short pseudopilus by several pilin-like proteins, GspG to GspK. GspD has been shown to interact with the inner membrane component GspC. The T2SS pseudopilus is a periplasmic filament composed of the major pseudopilin, EpsG, and four minor pseudopilins, EpsH, EpsI, EpsJ and EpsK. The pseudopilus is assembled by the polymerization of GspG (also known as PulG) subunits. Pseudopilin proteins have a conserved N-terminal hydrophobic segment followed by a more variable C-terminal periplasmic and globular domain.
Conserved hypothetical protein CHP02304, F390 synthetase-related
Type:
Family
Description:
Members of this family form a distinct clade within a larger family of proteins that also includes coenzyme F390 synthetase, an enzyme known in Methanobacterium thermoautotrophicum and a few other methanogenic archaea. The enzyme adenylates coenzyme F420 to F390, a reversible process, during oxygen stress. Other informative homologies include domains of the non-ribosomal peptide synthetases involved in activation by adenylation. This family is likely to be an adenylate-forming enzyme related to but distinct from coenzyme F390 synthetase.
Taurine ABC transporter, substrate-binding protein TauA
Type:
Family
Description:
This entry represents the taurine-binding periplasmic protein TauA. TauA is part of the tauABCD gene cluster, which is involved in sulfonate transport under sulphate starvation conditions. TauA plays a major role in the ABC transport system and could be an ideal candidate to serve as a taurine catcher in biological fluids. The most closely related proteins outside this family are putative aliphatic sulphonate binding proteins.
This family of proteins is found in eukaryotes. Proteins in this family are typically between 298 and 416 amino acids in length. IFT46 is a flagellar protein of complex B. Like all IFT (Intraflagellar transport) proteins, it is required for transport of IFT particles into the flagella [
].
Putative 7-cyano-7-deazaguanosine (preQ0) biosynthesis protein QueE
Type:
Family
Description:
Members of this radical SAM domain protein family appear to be a form of the queuosine biosynthesis protein QueE. QueE is involved in making preQ0 (7-cyano-7-deazaguanine), a precursor of both the bacterial/eukaryotic modified tRNA base queuosine and the archaeal modified base archaeosine. Members occur in preQ0 operons in species that lack members of a related protein family.
Ribosomal protein L9, N-terminal domain superfamily
Type:
Homologous_superfamily
Description:
Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.
The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an α-helix and a three-stranded mixed parallel, anti-parallel β-sheet packed against the central α-helix. The long central α-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
The SH2-containing Shc adapter proteins are targets of activated tyrosine kinases and are implicated in the transmission of activation signals to the Ras/mitogen-activated protein kinase (MAPK) pathway. Three Shc genes were originally identified in mammals that encode proteins characterised by an amino-terminal phosphotyrosine binding (PTB) domain and a carboxy-terminal Src homology 2 domain. Shc1 (ShcA) is ubiquitously expressed, whereas expression of Shc2 (ShcB) and Shc3 (ShcC) appears to be limited to neuronal cells.
SHC is composed of an N-terminal domain that interacts with proteins containing phosphorylated tyrosines, a (glycine/proline)-rich collagen-homology domain that contains the phosphorylated binding site, and a C-terminal SH2 domain. SH2 has been shown to interact with the tyrosine-phosphorylated receptors of EGF and PDGF and with the tyrosine-phosphorylated C chain of the T-cell receptor, providing one of the mechanisms of T-cell-mediated Ras activation. In general, SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine-phosphorylated sites.
Ganglioside-induced differentiation-associated protein 1, C-terminal domain
Type:
Domain
Description:
GDAP1 belongs to a subfamily of the glutathione S-transferase (GST) family. Mutations of the GDAP1 gene cause Charcot-Marie-Tooth disease (CMT), one of the most frequent inherited peripheral neuropathies in humans. GDAP1 is an integral, tail-anchored protein of the mitochondrial outer membrane (MOM) and the peroxisomal membrane, predominantly expressed in neural cells. Recombinant human GDAP1 has been shown to have specific GSH-conjugating activity in vitro; this activity is regulated by its hydrophobic domain 1 (HD1). This entry also includes GDAP1L1 (ganglioside-induced differentiation-associated protein 1-like 1), which is a paralogue of GDAP1. GDAP1L1 is capable of substituting for the loss of GDAP1 in the central nervous system of GDAP1-deficient mice.
Proteins in this entry contain the GST-N and GST-C domains, and an extended interdomain linker which may adopt two additional α-helices. This is the C-terminal domain of GDAP1.
Uncharacterised protein family, inner membrane CreD
Type:
Family
Description:
This family consists of several bacterial CreD or Cet inner membrane proteins. Dominant mutations of the cet gene of Escherichia coli result in tolerance to colicin E2 and increased amounts of an inner membrane protein with an Mr of 42,000. The cet gene is in the same operon as the phoM gene, which is required in a phoR background for expression of the structural gene for alkaline phosphatase, phoA. Although the Cet protein is not required for phoA expression, it has been suggested that the Cet protein has an enhancing effect on the transcription of phoA.
This entry is represented by Bacteriophage 69, Orf23, the major tail protein. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
Gene transfer agents (GTAs) belong to a group of unusual genetic exchange elements. GTAs are unusual in the sense that they have the structure of a small tailed phage but do not possess typical phage traits such as host cell lysis and infectious transmission of the GTA genes. In the Rhodobacter capsulatus GTA, the particles contain random 4.5 kb DNA fragments of the R. capsulatus genome. These DNA fragments can be transmitted to other cells, where allelic conversion may occur via homologous recombination. The genes coding for the GTA particles are of two distinct types: the first is a cluster of genes reminiscent of a cryptic prophage, where a number of the genes have similarity to known phage structural genes; the second type consists of two genes coding for a cellular two-component signal transduction system, which regulates the transcription of the GTA structural gene cluster in a growth phase-dependent manner.
This entry is represented by ORFg9 (RCAP_rcc01691) of the Gene Transfer Agent (GTA) of Rhodobacter capsulatus. ORFg9 has sequence homology to the major tail protein of bacteriophage TP901-1 (Lactococcus phage TP901-1).
Notch cell surface receptors are large, single-pass type-1 transmembrane proteins found in a diverse range of metazoan species, from human to Caenorhabditis species. The fruit fly, Drosophila melanogaster, possesses only one Notch protein, whereas in C. elegans, two receptors have been found; by contrast, four Notch paralogues (designated N1-4) have been identified in mammals, playing both unique and redundant roles. The hetero-oligomer Notch comprises a large extracellular domain (ECD), containing 10-36 tandem Epidermal Growth Factor (EGF)-like repeats, which are involved in ligand interactions; a negative regulatory region, including three cysteine-rich Lin12-Notch Repeats (LNR); a single trans-membrane domain (TM); a small intracellular domain (ICD), which includes a RAM (RBPjk-association module) domain; six ankyrin repeats (ANK), which are involved in protein-protein interactions; and a PEST domain. Drosophila Notch also contains an OPA domain. Notch signalling is an evolutionarily conserved pathway involved in a wide variety of developmental processes, including adult homeostasis and stem cell maintenance, cell proliferation and apoptosis.
Notch is activated by a range of ligands, the so-called DSL ligands (Delta/Serrate/LAG-2). Activation is also mediated by a sequence of proteolytic events: ligand binding leads to cleavage of Notch by ADAM proteases at site 2 (S2) and by presenilin-1/γ-secretase at sites 3 (S3) and 4 (S4). The last cleavage releases the Notch intracellular part of the protein (NICD) from the membrane and, upon release, the NICD translocates to the nucleus, where it associates with a CBF1/RBPJk/Su(H)/Lag1 (CSL) family of DNA-binding proteins. The subsequent recruitment of a co-activator mastermind-like (MAML1) protein promotes transcriptional activation of Notch target genes: well-established Notch targets are the Hes and Hey gene families. Aberrant Notch function and signalling have been associated with a number of human disorders, including Alagille syndrome, spondylocostal dysostosis, aortic valve disease, CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy), and T-cell Acute Lymphoblastic Leukemia (T-ALL); it has also been implicated in various human carcinomas.
This entry represents Neurogenic locus notch homologue protein 2 from humans and similar proteins found in chordates. Notch 2 has a widespread expression pattern. It is expressed in a number of tissues, including brain, heart, kidney, lung, teeth, skeletal muscle and liver. This protein functions as a receptor for the membrane-bound ligands Jagged-1 (JAG1), Jagged-2 (JAG2) and Delta-1 (DLL1) to regulate cell-fate determination. Upon ligand activation, through the released notch intracellular domain (NICD), it forms a transcriptional activator complex with RBPJ/RBPSUH and activates genes of the enhancer of split locus. It affects the implementation of differentiation, proliferation and apoptotic programs and is involved in bone remodelling and homeostasis. It positively regulates self-renewal of liver cancer cells.
Conserved hypothetical protein CHP03773, ABC transporter-like
Type:
Family
Description:
Members of this protein family occur in genomes that contain a three-gene ABC transporter operon associated with the presence of domain
. That domain occurs as a single-copy insert in the substrate-binding protein, and occurs in two or more copies in members of this protein family. Members of this family typically are encoded adjacent to the said transporter operon and may serve as a substrate receptor.
Sperm acrosome membrane-associated protein 4/Protein Bouncer
Type:
Family
Description:
This entry includes proteins from chordates that contain a UPAR/Ly6 domain, including Sperm acrosome membrane-associated protein 4 (SPACA4) from humans and Protein Bouncer (Bncr) from Danio rerio (Zebrafish). SPACA4 (also known as Sperm acrosomal membrane-associated protein 14) is a sperm surface membrane protein that may be involved in sperm-egg plasma membrane adhesion and fusion during fertilization.
Bncr is an oocyte-expressed fertilization factor that mediates sperm-egg binding and is essential for sperm entry into the egg. It also mediates species-specific gamete recognition and fertilization, which is vital for vertebrate species performing external fertilization.
This family includes U3 small nucleolar RNA-associated protein 4 (UTP4), a ribosome biogenesis factor that is involved in nucleolar processing of pre-18S ribosomal RNA. In yeast, it is required for optimal pre-ribosomal RNA transcription by RNA polymerase I, while in humans it is required for pre-rRNA processing.
This entry represents the ATP-binding domain of DEAD box protein 17 (DDX17, also known as p72) found in chordates. This protein is a member of the DEAD-box helicase family, a diverse family of proteins involved in ATP-dependent RNA unwinding, needed in a variety of cellular processes including splicing, ribosome biogenesis and RNA degradation. DDX17 has a wide variety of functions, including regulating the alternative splicing of exons exhibiting specific features, such as the inclusion of AC-rich alternative exons in CD44 transcripts, playing a role in innate immunity, and promoting mRNA degradation mediated by the antiviral zinc-finger protein ZC3HAV1 in an ATPase-dependent manner.
Notch cell surface receptors are large, single-pass type-1 transmembrane proteins found in a diverse range of metazoan species, from human to Caenorhabditis species. The fruit fly, Drosophila melanogaster, possesses only one Notch protein, whereas in C. elegans, two receptors have been found; by contrast, four Notch paralogues (designated N1-4) have been identified in mammals, playing both unique and redundant roles. The hetero-oligomer Notch comprises a large extracellular domain (ECD), containing 10-36 tandem Epidermal Growth Factor (EGF)-like repeats, which are involved in ligand interactions; a negative regulatory region, including three cysteine-rich Lin12-Notch Repeats (LNR); a single trans-membrane domain (TM); a small intracellular domain (ICD), which includes a RAM (RBPjk-association module) domain; six ankyrin repeats (ANK), which are involved in protein-protein interactions; and a PEST domain. Drosophila Notch also contains an OPA domain. Notch signalling is an evolutionarily conserved pathway involved in a wide variety of developmental processes, including adult homeostasis and stem cell maintenance, cell proliferation and apoptosis.
Notch is activated by a range of ligands, the so-called DSL ligands (Delta/Serrate/LAG-2). Activation is also mediated by a sequence of proteolytic events: ligand binding leads to cleavage of Notch by ADAM proteases at site 2 (S2) and by presenilin-1/γ-secretase at sites 3 (S3) and 4 (S4). The last cleavage releases the Notch intracellular part of the protein (NICD) from the membrane and, upon release, the NICD translocates to the nucleus, where it associates with a CBF1/RBPJk/Su(H)/Lag1 (CSL) family of DNA-binding proteins. The subsequent recruitment of a co-activator mastermind-like (MAML1) protein promotes transcriptional activation of Notch target genes: well-established Notch targets are the Hes and Hey gene families. Aberrant Notch function and signalling have been associated with a number of human disorders, including Alagille syndrome, spondylocostal dysostosis, aortic valve disease, CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy), and T-cell Acute Lymphoblastic Leukemia (T-ALL); it has also been implicated in various human carcinomas.
Notch 1 has a widespread pattern of expression and has been demonstrated to be involved in several processes such as mesenchymal cell differentiation, spermatogenesis, and osteoblastic cell differentiation. The pattern of expression of Notch 1 in adult human tissues reveals that this receptor may play multiple roles in several cell types, including neurons, lymphoid cells and skeletal muscle cells, as well as suprabasal cells in stratified epithelia.
This family includes Protein YecM from Escherichia coli and similar proteins predominantly found in gammaproteobacteria. YecM shows a pseudo-twofold axis with eight β-strands forming a curved sheet that wraps around a C-terminal α-helix and a presumed active site, forming a deep groove, and two α-helices on each side of the sheet. It may be a metal-binding protein and function as an enzyme, but its specific function is still unknown. Many members of this entry are predicted to belong to the vicinal oxygen chelate (VOC) superfamily.
This entry represents the three redox active TRX (a) domains found in protein disulfide-isomerase A5 (PDIA5, also known as PDIR). PDIR is composed of three redox active TRX (a) domains and an N-terminal redox inactive TRX-like (b) domain. Similar to PDI, it is involved in oxidative protein folding in the endoplasmic reticulum (ER) through its isomerase and chaperone activities. These activities are lower compared to PDI, probably due to PDIR acting only on a subset of proteins. PDIR is preferentially expressed in cells actively secreting proteins and its expression is induced by stress [
]. Similar to PDI, the isomerase and chaperone activities of PDIR are independent; CXXC mutants lacking isomerase activity retain chaperone activity. The TRX-like b domain of PDIR is critical for its chaperone activity [].
Probable bifunctional tRNA threonylcarbamoyladenosine biosynthesis protein
Type:
Family
Description:
This entry describes a probable bifunctional tRNA threonylcarbamoyladenosine biosynthesis protein. Its N-terminal domain is homologous to the Kae1/YgjD family, which is involved in tRNA threonylcarbamoyladenosine biosynthesis. Its C-terminal region contains a serine/threonine protein kinase domain (STYKS) and is homologous to Bud32. Kae1 and Bud32 are two components of the KEOPS/EKC complex, which has been implicated in transcription, telomere maintenance and chromosome segregation. In this archaeal family, it seems that the Kae1 and Bud32 orthologues are fused.
Flagellar biosynthesis protein FlhF, GTPase domain
Type:
Domain
Description:
This entry represents the GTPase domain found at the C terminus of the flagellar biosynthetic protein FlhF. The assembly of flagella is a multi-step process and relies on a complex type III export machinery located in the cytoplasmic membrane. The FlhF protein is essential for the placement and assembly of polar flagella and has been classified as a signal-recognition particle (SRP)-type GTPase. It is similar to the 54 kDa subunit (SRP54) of the SRP that mediates the transport to or across the plasma membrane in bacteria and the endoplasmic reticulum in eukaryotes. SRP recognises N-terminal signal sequences of newly synthesized polypeptides at the ribosome. The SRP-polypeptide complex is then targeted to the membrane by an interaction between SRP and its cognate receptor (SR). FlhF activities and the net effect of FlhF on flagellation phenotypes appear to be different among polar flagellates.
Probable transcription termination protein NusA, archaeal
Type:
Family
Description:
This entry represents a family of archaeal proteins found in a single copy per genome. It contains two KH domains and is most closely related to the central region of bacterial NusA, a transcription termination factor named for its interaction with phage lambda protein N in Escherichia coli. The proteins required for antitermination by N include NusA, NusB, NusE (ribosomal protein S10), and NusG. This system, on the whole, appears not to be present in the archaea.
This family consists of the type III secretion system (T3SS) stator protein, previously known as nodulation protein NolV, from different Rhizobium species. The function of this family is unclear; however, it has been suggested that the T3SS stator protein is a component of the T3SS, which is used to inject bacterial effector proteins into eukaryotic host cells.
Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include:
F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts).
V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria.
A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases.
P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes.
E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP.
F-ATPases (also known as ATP synthases, F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), with additional subunits in mitochondria. Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse in bacteria, hydrolysing ATP to create a proton gradient.
In yeast, the F0 complex is composed of at least nine polypeptides which are nucleus-encoded: b, OSCP, d, e, f, g, h, i/j and k, together with three subunits, 6, 8 and 9, which are mitochondrion-encoded. The bovine enzyme also includes subunit F6, for which no homologue has been found in yeast.
This entry represents subunit 8 found in the F0 complex of mitochondrial F-ATPases from fungi. This subunit appears to be an integral component of the stator stalk in yeast mitochondrial F-ATPases. The stator stalk is anchored in the membrane, and acts to prevent futile rotation of the ATPase subunits relative to the rotor during coupled ATP synthesis/hydrolysis. This subunit differs in sequence between fungi, Metazoa and plants.
This region of the APC family of proteins is known as the basic domain. It contains a high proportion of positively charged amino acids and interacts with microtubules [
].
Conserved hypothetical protein CHP02171, Fibrobacter succinogenes
Type:
Family
Description:
This entry represents a paralogous family of proteins found in the rumen bacterium Fibrobacter succinogenes. The proteins average over 900 amino acids in length and are of unknown function, though more than half are predicted lipoproteins.
Integrating conjugative element protein PilL, PFGI-1
Type:
Domain
Description:
This entry describes the conserved N-terminal region of a variable-length protein family associated with laterally transferred regions flanked by markers of conjugative plasmid integration and/or transposition. Most members of the family have the lipoprotein signal peptide motif. A member of the family from a pathogenicity island in Salmonella enterica subsp. enterica serovar Dublin was designated PilL for nomenclature consistency with a neighbouring gene for the pilin structural protein PilS. However, the species distribution of this protein family tracks much better with markers of conjugal transfer than with markers of PilS-like pilin structure.
Conserved hypothetical protein CHP03806, HNE0200 family
Type:
Domain
Description:
This entry represents an uncharacterised protein which is associated with another uncharacterised protein containing β-helix repeats (
). These two proteins are usually encoded adjacently, either as separate genes or in a fusion. Sometimes two proteins in this entry occur with a single member of
. The function of these proteins is unknown.
PHD finger protein 20 (PHF20) is a methyllysine-binding protein and a component of the MOF histone acetyltransferase protein complex. It consists of tandem Tudor domains at the N terminus, an AT hook, a C2H2-type zinc finger, and a plant homeodomain (PHD) finger [
,
,
]. PHF20L1 binds to monomethylated lysine 142 on DNA (cytosine-5) methyltransferase 1 (DNMT1). It has been shown to antagonize DNMT1 proteasomal degradation []. This is the AT-hook domain found in PHD finger protein 20 (PHF20) and PHD finger protein 20-like (PHF20L) [
,
,
].
This entry represents Tail tip protein M from Escherichia phage lambda (Bacteriophage lambda). Members of this protein family are found in tailed bacteriophages (Caudovirales) and in bacterial prophages, mostly among Proteobacteria. TipM is part of the distal tail tip, which plays a role in DNA ejection during entry and in tail assembly initiation during exit. It may bind the tail tip complex associated with the tape measure protein and allow tail tube protein polymerization on top of the tail tip [
].
Cytosolic carboxypeptidase-like protein 5 catalytic domain
Type:
Domain
Description:
This entry contains the M14 carboxypeptidase-like domain of cytosolic carboxypeptidase-like protein 5 (CCP5, ATP/GTP binding protein-like 5 or AGBL-5; MEROPS identifier M14.025), and related proteins. CCP5 is part of the cytosolic carboxypeptidase (CCP) family, which also includes enzymes CCP1/Nna1, CCP4, and CCP6 [
], and belongs to subfamily M14B of peptidase family M14 []. CCP5 removes alpha- and gamma-linked glutamates from tubulin [].
V-set and immunoglobulin domain-containing protein 1
Type:
Family
Description:
V-set and immunoglobulin domain-containing protein 1 (VSIG1, also known as A34) belongs to the immunoglobulin superfamily (IgSF), whose members have one or more Ig-like domains in the extracellular region that is implicated in cell-cell adhesion, a transmembrane domain, and one cytoplasmic C-terminal region [
]. VSIG1 is required for the proper differentiation of glandular gastric epithelia [].
Movement proteins (MPs) encoded by many virus genera are specialised proteins essential for the transport of plant viral genomes or virions within and between cells. There are several models of virus movement, such as the Tobacco mosaic virus (TMV) model or the one described in several families of the icosahedral RNA viruses and pararetroviruses. Additionally, proteins involved in replication or encapsidation are also required for cell-to-cell movement in some viruses [
]. This entry represents the movement protein 6 (p6) from Beet yellows virus and related Closterovirus proteins. P6 is a small protein (6 kDa) localised in the endoplasmic reticulum. It has a single-span N-terminal transmembrane domain and a C-terminal hydrophilic domain which faces the cytosol. Together with Hsp70h, CP, CPm, and P64, it is essential for cell-to-cell movement of the viral genome, which occurs without any budding. The mechanism of action of this protein is not clear. It is suggested that it also plays a role in virion formation [
].
This region is found in viruses, and is approximately 20 amino acids in length. The region is found C-terminal to
. There is a single completely conserved residue Y that may be functionally important.
Chloroplast division requires the formation of an FtsZ division ring, its positioning being regulated by the proteins MinD and MinE. MCD1 is a plant-specific protein that directly interacts with MinD and is required for MinD localization to regulate FtsZ ring formation [
].
Myb-related protein P/transcription factor Y1, C-terminal
Type:
Domain
Description:
This entry represents the C terminus of plant P/Y1 proteins. Members of this entry are transcriptional regulators of genes encoding enzymes for flavonoid biosynthesis [
,
]. P protein plays a role in the pathway leading to the production of a red phlobaphene pigment [], and P proteins are homologous to the DNA-binding domain of myb-like transcription factors []. This domain is associated with domain.
GTPase activating protein complex, subunit Bfa1/Byr4
Type:
Family
Description:
Bfa1 is required for the spindle assembly checkpoint in budding yeast. It functions in the same pathway with Bub2 in one of the branches of the spindle assembly checkpoint which prevents cytokinesis before the completion of chromosome segregation [
,
]. Bub2 and Bfa1 are also required for the maintenance of G2/M arrest in response to DNA damage and to spindle misorientation []. The homologue of Bfa1 in fission yeast is Byr4, which together with Cdc16 (homologue of Bub2) forms a two-component GTPase-activating protein for Spg1 GTPase [,
]. Spg1 GTPase positively regulates septation and constriction of the actomyosin ring for cell division. Bfa1 and Bub2 may function as a universal checkpoint in response to various checkpoint signals to avoid improper mitotic exit [
].
tRNA threonylcarbamoyl adenosine modification protein TsaB
Type:
Family
Description:
TsaB, previously known as YeaZ, has been shown to be involved in N6-threonylcarbamoyladenosine (t(6)A) biosynthesis, together with YgjD (TsaD), YrdC (TsaC), and YjeE (TsaE) [
,
].
Tungstate ABC transporter, substrate-binding protein WtpA
Type:
Family
Description:
Members of this protein family are tungstate (and, more weakly, molybdate) binding proteins of tungstate(/molybdate) ABC transporters, as first characterised in Pyrococcus furiosus. Note that this family is homologous to molybdate transporters, and that at least one other family of tungstate transporter binding protein, TupA, also exists. Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. Most of the bacterial ABC (ATP-binding cassette) importers are composed of one or two transmembrane permease proteins, one or two nucleotide-binding proteins and a highly specific periplasmic solute-binding protein. In Gram-negative bacteria the solute-binding proteins are dissolved in the periplasm, while in archaea and Gram-positive bacteria they are membrane-anchored lipoproteins [
,
].
Conserved hypothetical protein CHP04072, B12-binding/radical SAM-type
Type:
Family
Description:
Members of this protein family occur in conserved genomic contexts highly suggestive of lipid biosynthesis, including an island shared between Kuenenia stuttgartiensis, which produces ladderanes [
], and Desulfotalea psychrophila, which produces a different kind of unusual polyunsaturated hydrocarbon.
Conserved hypothetical protein CHP04014, B12-binding/radical SAM-type
Type:
Family
Description:
Members of this family, which are found in methanogenic archaea, have both a B12 binding homology domain (
) and a radical SAM domain (
). They occur only once per genome. Some species that express proteins of this family also express a related protein with a similar domain architecture (see
).
Translationally controlled tumour protein (TCTP) domain
Type:
Domain
Description:
The translationally controlled tumor proteins (TCTPs, such as p21, p23 and histamine releasing factor (HRF)) are a highly conserved and abundantly expressed family of eukaryotic proteins that are implicated in a variety of cellular functions, including microtubule stabilization, cell cycle, apoptosis, and cytokine release. TCTP is ubiquitously expressed in all eukaryotic organisms from protozoa such as Plasmodium sp. to plants and mammals [
,
,
,
]. The TCTP domain structure comprises four β-sheets, designated A-D, and three main helices, designated H1-H3, connected in a complex topology. A central feature of the structure is the four-stranded sheet A, against one face of which packs the three-stranded sheet B and the small helix H1. Helix H3 packs against part of the opposite face of sheet A. Helix H2 packs against helix H3 (forming an α-helical hairpin). The final major structural feature is the two-stranded sheet C, which protrudes from the core globular structure formed by the rest of the domain [
]. This entry represents the TCTP domain.
This family consists of several Rab5-interacting proteins (RIP5 or Rab5ip) including ER membrane protein complex subunit 6 (EMC6) and RCAF1. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5 [
]. EMC6 interacts with Rab5A and BECN1/Beclin 1 and regulates autophagosome formation []. EMC6 is part of the EMC complex, required for efficient folding of proteins in the ER [] and for post-translational membrane insertion of tail-anchored (TA) proteins [,
]. RCAF1 acts as an assembly factor for mitochondrial respiratory complexes [].
GspN is a cytoplasmic membrane component of the type II secretion system (T2SS). It is present in T2SS operons of a subset of Gram-negative species, such as Aeromonas hydrophila (gene exeN); Erwinia carotovora (gene outN); Klebsiella pneumoniae (gene pulN); or Vibrio cholerae (gene epsN) [
]. The size of the 'N' protein is around 250 amino acids. It apparently contains a single transmembrane domain located in the N-terminal section. The short N-terminal domain is predicted to be cytoplasmic and the large C-terminal domain periplasmic.
The type II secretion system (T2SS) is one of several extracellular secretion systems in Gram-negative bacteria. It delivers toxins and a range of hydrolytic enzymes including proteases, lipases and carbohydrate-active enzymes to the cell surface or extracellular space [
]. T2SS systems are composed of 11 to 15 different proteins, which are generally called GspA to GspO and GspS. The T2SS spans the two bacterial membranes and ensures secretion of folded proteins across the outer membrane pore formed by GspD. The inner membrane complex contains GspC, GspL, GspM, and GspF. The cytoplasmic domains of GspL and GspF interact with an ATPase, GspE. GspE is thought to energize the formation of a short pseudopilus by several pilin-like proteins, GspG to GspK []. GspD has been shown to interact with the inner membrane component GspC []. The T2SS pseudopilus is a periplasmic filament composed of the major pseudopilin, EpsG, and four minor pseudopilins, EpsH, EpsI, EpsJ and EpsK. Pseudopilus is assembled by the polymerization of GspG (also known as PulG) subunits. Pseudopilin proteins have a conserved N-terminal hydrophobic segment followed by a more variable C-terminal periplasmic and globular domain [
].
Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesized by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesize cellulose in higher plants differ greatly from the well-characterised genes found in Acetobacter and Agrobacterium spp. More correctly designated as "cellulose synthase catalytic subunits", plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity [
]. An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity [
]. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinus.
The calculated molecular mass of the protein encoded by bcsD is 17.3 kDa [
]. The function of BcsD is unknown. This entry represents Cellulose synthase operon protein D from Komagataeibacter xylinus and similar proteins from Proteobacteria. BcsD has been associated with the level of crystalline structure of cellulose [
]. The protein has an octameric pore-like structure with four inner passageways [].
Torsin-1A-interacting protein 1/2, AAA+ activator domain
Type:
Domain
Description:
This entry represents the C-terminal AAA+ activator domain of Torsin-1A-interacting proteins 1 and 2 (TOIP1/2) also known as LAP1 proteins (Lamina-associated polypeptide 1), which are type 2 integral membrane proteins with a single membrane-spanning region of the inner nuclear membrane [
,
,
]. These proteins interact with and activate Torsin A, an AAA+ ATPase localized to the endoplasmic reticulum (ER), through a perinuclear domain and form a heterohexameric (LAP1-Torsin)3 ring that targets Torsin to the nuclear envelope. LAP1 has an atypical AAA+ fold and provides an arginine finger to the Torsin A active site to promote its ATPase activity [,
]. A single mutation in Torsin A causes early onset primary dystonia, a painful and severely disabling neuromuscular disease [,
].
GEMIN8 proteins are found in the nuclear bodies called gems (Gemini of Cajal bodies) that are often in proximity to Cajal (coiled) bodies themselves. They are also found in the cytoplasm [
]. The family is part of the SMN (survival motor neurone) complex that plays an essential role in spliceosomal snRNP assembly in the cytoplasm and is required for pre-mRNA splicing in the nucleus. GEMIN8 binds directly to SMN1 and mediates the interaction of the GEMIN6-GEMIN7 heterodimer [].
Microcystin LR degradation protein MlrC, C-terminal
Type:
Domain
Description:
Proteins in this entry are involved in degradation of the cyanobacterial heptapeptide hepatotoxin microcystin LR, and are encoded in the mlr gene cluster [
]. MlrC from Sphingomonas wittichii (strain RW1 / DSM 6014 / JCM 10273) is believed to mediate the last step of peptidolytic degradation of the tetrapeptide. It is suspected to be a metallopeptidase based on homology to known peptidases and its inhibition by metal chelators. The proteins encoded by the mlr cluster may be involved in cell wall peptidoglycan cycling and subsequently act fortuitously in hydrolysis of microcystin LR. This entry represents the C-terminal region of these proteins.
Cell cycle exit/neuronal differentiation protein 1
Type:
Family
Description:
Cell cycle exit and neuronal differentiation protein 1 (Cend1 or BM88) is a homodimer of 140 amino acids. It has a putative transmembrane domain at the C terminus. The mRNA of this protein is expressed only in neural tissues, where it is restricted to neurons. It is involved in neuroblastoma cell differentiation [
,
], neuronal differentiation [], and development of the cerebellum [].
This is the C-terminal domain of DNA mismatch repair protein Mlh1, which belongs to the MutL family. This domain forms part of the endonuclease active site [
].
Pesticidal crystal protein Cry22Aa, Ig-like domain
Type:
Domain
Description:
This domain can be found in Pesticidal crystal protein Cry22Aa from Bacillus thuringiensis, a protein with a toxic effect on several insect larvae [
]. It is also present in Chitinase 60 from Moritella marina (), responsible for degradation of krill chitin. BT_2262 from Bacteroides thetaiotaomicron (
) contains this domain at the N-terminal end [
]. This entry represents the bacterial immunoglobulin-like domain [,
].
This entry represents a group of plant universal stress protein (USP)-like proteins, including AT3G01520 from Arabidopsis. AT3G01520 binds to AMP and contains the ATP-binding loop, which suggests that it belongs to the ATP-binding USP subfamily [
].
Chemotaxis protein CheA, P2 response regulator-binding
Type:
Domain
Description:
Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions []. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [
,
]. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms [
,
]. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water []. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases [
,
]. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK []. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. The response regulators for CheA bind to the P2 domain, which is found between
and
as either one or two copies. Highly flexible linkers connect P2 to the rest of CheA and impart remarkable mobility to the P2 domain. This feature is thought to enhance the inter CheA dimer phosphotransfer reactions within the signalling complex, thereby amplifying the phosphorylation signal [
].
Uncharacterised protein family UPF0758, YicR subfamily
Type:
Family
Description:
UPF0758 was previously known as the radC family. The name was assigned according to the radC102 mutant of E. coli which was later demonstrated to be an allele of the transcription-repair-coupling factor recG [
,
]. It has been described as a putative JAMM-family deubiquitinating enzyme, but its function remains to be determined []. This entry represents the YicR subfamily, found in the Enterobacteriaceae.
The synaptonemal complex (SC) is a tripartite structure that physically links homologous chromosomes during prophase I. The SC has a zipper-like structure composed of two lateral elements (LEs) and a central element (CE). Transverse filaments, composed of SYCP1 molecules, bridge the gap between one LE and the CE. Syce2/CESC1 localises to the CE and interacts with the transverse filament protein SYCP1. Its localization to the CE is dependent on recruitment by SYCP1 [
]. Syce2/CESC1 is required for synaptonemal complex assembly and also for double-strand break repair and homologous recombination [].
This family of phage proteins includes Bacteriophage T7 gene 5.5 (also known as Protein suppressor of silencing), which plays a role in the inhibition of gene silencing by the host nucleoid-associated protein H-NS/hns. It disrupts higher-order H-NS-DNA complexes [
].
The baseplate of Enterobacteria phage T4 controls host cell recognition, attachment, tail sheath contraction and viral DNA ejection. The structure of the baseplate suggests a mechanism of baseplate structural transition during the initial stages of T4 infection. The baseplate is assembled from six identical wedges that surround the central hub. Gp53 combines sequentially with other T4 gene products to assemble a wedge [
].
PKCs are classified into three groups (classical, atypical, and novel) depending on their mode of activation and the structural characteristics of their N-terminal regulatory domain [
,
]. Atypical protein kinases C (aPKCs) have a PB1 and an atypical C1 domain, which only accepts phosphatidylserine []. In mammals there are two aPKC isoforms, zeta and iota/lambda (iota is the human orthologue and lambda the mouse orthologue) [
]. aPKCs are involved in many cellular functions including proliferation, migration, apoptosis, polarity maintenance and cytoskeletal regulation [,
]. They also play a critical role in the regulation of glucose metabolism and in the pathogenesis of type 2 diabetes [,
]. This entry consists mainly of invertebrate aPKCs. In Drosophila, aPKC is also involved in polarity maintenance, cytoskeletal regulation, cell cycle progression, and proliferation [,
,
,
,
].
Conserved hypothetical protein CHP03097, O-antigen ligase-related
Type:
Family
Description:
The proteins in this entry all have multiple transmembrane domains and are related to or are members of the O-antigen ligase family. This group is associated with genomes, and usually genomic contexts, containing elements of the exosortase/PEP-CTERM protein export system [
], specifically the type 1 variety of this system described by .
Retinoic acid receptor responder protein 1 (RARRES1) is also known as TIG-1 (tazarotene-induced gene-1) as it is up-regulated by the synthetic retinoid tazarotene [
,
,
]. TIG-1 is a tumor suppressor gene, its expression being frequently downregulated through promoter hypermethylation in various carcinomas [,
]. It inhibits the cytoplasmic carboxypeptidase AGBL2, and may regulate the alpha-tubulin tyrosination cycle [].
Protein-export membrane protein SecD/SecF/SecDF, conserved site
Type:
Conserved_site
Description:
Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component [
]. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome. The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF) []. The chaperone protein SecB [] is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion [
]. Together with SecY and SecG, SecE forms a multimeric channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains []. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices. The SecD and SecF equivalents of the Gram-positive bacterium Bacillus subtilis are jointly present in one polypeptide, denoted SecDF, that is required to maintain a high capacity for protein secretion. Unlike the SecD subunit of the pre-protein translocase of E. coli, SecDF of B. subtilis was not required for the release of a mature secretory protein from the membrane, indicating that SecDF is involved in earlier translocation steps []. Comparison with SecD and SecF proteins from other organisms revealed the presence of 10 conserved regions in SecDF, some of which appear to be important for SecDF function. Interestingly, the SecDF protein of B. subtilis has 12 putative transmembrane domains. Thus, SecDF shows not only sequence similarity but also structural similarity to secondary solute transporters []. This entry represents a GG-containing domain found in the N-terminal region of prokaryotic SecD and SecF protein export membrane proteins. It is found in association with
. SecD and SecF proteins are part of the multimeric protein export complex comprising SecA, D, E, F, G, Y, and YajC [
]. SecD and SecF are required to maintain a proton motive force [].
Sterol regulatory element-binding protein 1, C-terminal
Type:
Domain
Description:
This entry represents the C-terminal domain of the sterol regulatory element-binding protein 1 (Sre1). Sre1 is a transcriptional activator required for transcription of genes needed for adaptation to anaerobic growth, such as those implicated in the nonrespiratory oxygen-consumptive biosynthetic pathways of sterol, heme, sphingolipid, and ubiquinone [
,
].
Protein kinase B beta (PKB-beta), also known as AKT2, is one of three closely related serine/threonine-protein kinases: AKT1/alpha, AKT2/beta and AKT3/gamma. PKBs contain an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain [
]. PKB-beta is the predominant PKB isoform expressed in insulin-responsive tissues. It plays a critical role in the regulation of glucose homeostasis []. It is also implicated in muscle cell differentiation []. Mice deficient in PKB-beta display normal growth weights but exhibit severe insulin resistance and diabetes, accompanied by lipoatrophy and β-cell failure [].
NudC-like 2/NudC domain-containing protein 2 (NudCL2/NUDCD2) regulates the LIS1/dynein pathway by stabilizing LIS1 with Hsp90 chaperone. LIS1 is a key regulator of cytoplasmic dynein, and is critical for cell proliferation, survival, and neuronal migration [
]. Like other members of the NudC family, NUDCD2 has a conserved p23 domain, which possesses chaperone activity both in conjunction with and independently of Hsp90 [,
]. This entry represents the p23 domain of NUDCD2.
NudC domain-containing protein 3 (NUDCD3) has not been characterised. It is a member of the NudC family. All members of the NudC family share a conserved p23 domain, which possesses chaperone activity both in conjunction with and independently of heat shock protein 90 (Hsp90) [
]. NudC proteins play multiple roles in cell cycle progression, cell migration, inflammatory response, platelet production, and carcinogenesis [].
Orsellinic acid/F9775 biosynthesis cluster protein D
Type:
Family
Description:
This family of proteins includes orsellinic acid/F9775 biosynthesis cluster protein D (orsD) from Emericella nidulans. The orsD gene is part of the cluster that encodes components for the biosynthesis of orsellinic acid, as well as biosynthesis of the cathepsin K inhibitors F9775 A and F9775 B [
,
], but the function of orsD is unknown. OrsD contains two segments that are likely to be C2H2 zinc binding domains.
Protein kinase B gamma (PKB-gamma), also known as AKT3, is one of three closely related serine/threonine-protein kinases: AKT1/alpha, AKT2/beta and AKT3/gamma. PKBs contain an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain [
]. PKB-gamma is predominantly expressed in neuronal tissues. Mice deficient in PKB-gamma show a reduction in brain weight due to the decreases in cell size and cell number []. PKB-gamma has also been shown to be upregulated in estrogen-deficient breast cancer cells, androgen-independent prostate cancer cells, and primary ovarian tumours []. It acts as a key mediator in the genesis of ovarian cancer [].
Protein kinase B alpha (PKB-alpha), also known as AKT1, is one of three closely related serine/threonine-protein kinases: AKT1/alpha, AKT2/beta and AKT3/gamma. PKBs contain an N-terminal pleckstrin homology (PH) domain and a C-terminal catalytic domain [
]. PKB-alpha is predominantly expressed in endothelial cells. It is critical for the regulation of angiogenesis and the maintenance of vascular integrity []. It also plays a role in adipocyte differentiation []. Mice deficient in PKB-alpha exhibit perinatal morbidity, growth retardation, reduction in body weight accompanied by reduced sizes of multiple organs, and enhanced apoptosis in some cell types. PKB-alpha activity has been reported to be frequently elevated in breast and prostate cancers [,
]. In some cancer cells, PKB-alpha may act as a suppressor of metastasis [].
Nuclear transition protein 1 (TP1) is one of the spermatid-specific proteins []. TP1 is a basic protein well conserved in mammalian species. In mammals, the second stage of spermatogenesis is characterised by the conversion of nucleosomal chromatin to the compact, non-nucleosomal and transcriptionally inactive form found in the sperm nucleus. This condensation is associated with a double-protein transition. The first transition corresponds to the replacement of histones by several spermatid-specific proteins (also called transition proteins) which are themselves replaced by protamines during the second transition.
This entry represents a heptapeptide located in positions 28 to 34, which contains five basic residues (Arg or Lys) as well as a tyrosine that could be essential for the destabilisation of chromatin by intercalating between the bases of DNA.
D-galactose-binding periplasmic protein MglB-like, PBP domain
Type:
Domain
Description:
This entry represents the PBP (periplasmic binding protein) domain found in a group of glucose/galactose-binding proteins, including MglB (a D-galactose-binding periplasmic protein) from E. coli [
]. They belong to the periplasmic binding protein superfamily which consists of two alpha/beta globular domains connected by a three-stranded hinge. This Venus flytrap-like domain undergoes transition from an open to a closed conformational state upon ligand binding [,
]. Moreover, this group of proteins is homologous to the ligand-binding domain of eukaryotic receptors such as glutamate receptor (GluR) and DNA-binding transcriptional repressors such as LacI and GalR [,
].
This family represents plant E3-ligase proteins and includes PP2CA INTERACTING RING FINGER PROTEIN 2 (PIR2, also known as MND1-interacting protein 1) and its paralogues RF4 and RF298 from Arabidopsis. PIR2 is the closest homologue of PIR1 (PP2CA interacting RING finger protein 1) and both positively modulate ABA signaling by targeting PP2CA for degradation [
].
Uncharacterised protein family, zinc metallopeptidase putative
Type:
Family
Description:
Members of this family of bacterial proteins are described as hypothetical proteins or zinc metallopeptidases. The majority have a HExxH zinc-binding motif characteristic of neutral zinc metallopeptidases; however, there is no evidence to support their function as metallopeptidases.
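The HExxH motif mentioned above (His-Glu, any two residues, His) can be located in a sequence with a simple pattern scan. The following is a minimal sketch, not a curated detection method: the function name find_hexxh and the example sequence are hypothetical, and a match only indicates the presence of the pentapeptide pattern, not metallopeptidase activity.

```python
import re

# Hypothetical helper for illustration: scan a protein sequence for the
# HExxH pattern (His-Glu-any-any-His). A lookahead is used so that
# overlapping occurrences are also reported.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each HExxH hit."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence, used only to demonstrate the scan.
    example = "MKTAHELGHSLGL"
    for position, motif in find_hexxh(example):
        print(position, motif)
```

A pattern hit of this kind is only a starting point; as the description notes, the presence of the motif alone is not evidence of peptidase function.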
Ribosome maturation protein SDO1/SBDS, C-terminal domain
Type:
Domain
Description:
This entry represents the C-terminal domain of proteins that are highly conserved in species ranging from archaea to vertebrates and plants [
]. This entry contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome (OMIM 260400) is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterised by bone marrow failure and leukemia predisposition. Members of this entry play a role in RNA metabolism [
,
]. In yeast, SBDS orthologue SDO1 is involved in the biogenesis of the 60S ribosomal subunit and translational activation of ribosomes. Together with the EF-2-like GTPase RIA1 (EfI1), it triggers the GTP-dependent release of TIF6 from 60S pre-ribosomes in the cytoplasm, thereby activating ribosomes for translation competence by allowing 80S ribosome assembly and facilitating TIF6 recycling to the nucleus, where it is required for 60S rRNA processing and nuclear export. This data links defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition [].The SBDS protein is composed of three domains. The N-terminal (FYSH,
) domain is the most frequent target for disease mutations and contains a novel mixed α/β-fold, the central domain (
) consists of a three-helical bundle and the C-terminal domain (represented in this entry) has a ferredoxin-like fold [
,
].
Arf GTPase-activating protein GIT1/2, coiled-coil domain
Type:
Domain
Description:
This entry represents the coiled-coil (CC) region of GIT (G protein-coupled receptor kinase-interacting) proteins. This coiled-coil region is the surface that associates with the equivalent binding region on beta-PIX, or p21-activated kinase-interacting exchange factor, proteins. GIT and PIX associate to form a scaffold for the assembly of multi-protein complexes. On its own, the GIT-CC region assembles into a parallel two-stranded coiled coil in the asymmetric unit. Similarly, the PIX coiled-coil region assembles into a trimer. At least in vitro, the two regions associate into a stable heteropentameric complex that consists of one PIX trimer and one GIT dimer.
PACS-1 is a cytosolic sorting protein that directs the localisation of membrane proteins in the trans-Golgi network (TGN)/endosomal system. PACS-1 connects the clathrin adaptor AP-1 to acidic cluster sorting motifs contained in the cytoplasmic domain of cargo proteins such as furin, the cation-independent mannose-6-phosphate receptor and in viral proteins such as human immunodeficiency virus type 1 Nef.
This entry represents comG operon protein 7, ComGG. It is required for DNA binding during transformation of competent bacterial cells. About half of pre-ComGG is present as a peripheral membrane protein and the other half as an integral membrane protein. Upon partial processing, ComGG is translocated to a position outside the membrane. The comG operon of Bacillus subtilis encodes seven membrane-associated proteins that function in the binding of transforming DNA to the competent cell surface. ComGC, GD, GE and GG have N-terminal sequence motifs typical of type 4 pre-pilins and are processed by a pathway that requires the product of comC, also an essential competence gene. They form pilin-like structures that are localised to the cytoplasmic membrane and cell wall. The comG operon also encodes ComGF, a small integral membrane protein, and ComGA and ComGB, which are predicted to be a nucleotide-binding protein and an integral membrane protein, respectively. Strains lacking any one of the seven proteins were all nontransformable and failed to bind transforming DNA to the cell surface. Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. DNA usually becomes available through the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions.
Nonstructural protein 13, zinc-binding domain, coronavirus-like
Type:
Domain
Description:
Nidoviruses (Coronaviridae, Arteriviridae, and Roniviridae) feature the most complex genetic organization among plus-strand RNA viruses. Their replicase genes encode an exceptionally large number of nonstructural protein domains, which mediate the key functions required for genomic RNA synthesis (replication) and subgenomic RNA (sgRNA) synthesis (transcription). They encode a nonstructural protein, called NSP10 in arteriviruses (Av) and NSP13 in coronaviruses (CoV), that is composed of a C-terminal nucleoside triphosphate-binding/helicase (Hel) motif and an N-terminal cysteine-rich zinc-binding domain (ZBD). The ZBD is critically involved in nidovirus replication and transcription, modulating the ATPase/helicase activity in cis. In SARS-CoV, it has been shown that NSP12 directly interacts with NSP13 and enhances its helicase activity. The ZBD is composed of about 80 to 100 residues, including 12 to 13 conserved Cys/His residues. It consists of a RING-like module and a treble-cleft zinc finger, together coordinating three Zn ions. The N-terminal RING-like module has a notable binuclear structure with a cross-brace topology involving 6 Cys and 2 His residues that coordinate two Zn ions. The C-terminal zinc finger of the ZBD adopts a treble-cleft fold distinct from that of the RING module. It coordinates one Zn ion with a C[H/C]C[C/H] pattern. This entry represents the ZBD of coronavirus helicase.
Accessory protein NS7, porcine deltacoronavirus (PDCoV)
Type:
Domain
Description:
This entry includes the accessory protein NS7 found in Porcine coronavirus HKU15. Porcine deltacoronavirus (PDCoV) encodes three accessory proteins, NS6, NS7 and NS7a. NS7a is a 100-amino-acid polypeptide identical to the C-terminal region of NS7; it remains unclear whether their functions are redundant. PDCoV HKU15, an emerging swine enteric coronavirus that causes diarrhea in neonatal piglets, has also been found in the respiratory tract of pigs and may be able to cause respiratory infections, thus possibly spreading through the respiratory route. NS7-specific mAbs that recognised cells transfected with an NS7 expression construct or infected with PDCoV also recognised NS7a, which is encoded by a separate subgenomic mRNA with a non-canonical transcription regulatory sequence. The NS7 protein is extensively distributed in the mitochondria and may be involved in various cellular processes such as cytoskeleton networks and cell communication, metabolism, and protein biosynthesis. NS7-expressing and PDCoV-infected cells also show a substantial down-regulation of alpha-actinin-4.
Methyl-CpG-binding domain protein 2/3, p55-binding region
Type:
Domain
Description:
This entry represents a second MBD domain of methyl-CpG-binding domain proteins 2 and 3. This region has been implicated in binding the RbAp46/48 (retinoblastoma protein-associated protein) homologue p55, which is one of the components of the MBD2-NuRD complex. The MBD2-NuRD complex is a nucleosome remodelling and deacetylation complex.
This entry represents FHF complex subunit HOOK interacting proteins (FHIP, also known as FAM160) found in metazoa and fungi, which are components of the FTS/Hook/FHIP complex (FHF complex). The FHF complex may function to promote vesicle trafficking and/or fusion via the homotypic vesicular protein sorting complex (the HOPS complex). The FHF complex promotes the distribution of the AP-4 complex to the perinuclear area of the cell. FHIP2A (FAM160B1) is required for proper functioning of the nervous system. FHIP2B, previously described as retinoic acid-induced protein 16, is able to activate the MAPK/ERK and TGFB signalling pathways and may play a role in cell proliferation.
Glutathione-regulated potassium-efflux system ancillary protein KefG/KefF
Type:
Family
Description:
This family includes the glutathione-regulated potassium-efflux system ancillary proteins KefG and KefF, which belong to the NAD(P)H dehydrogenase (quinone) family. The potassium efflux system (Kef) is used by many bacteria to protect themselves from the toxic effects of electrophilic compounds. This family also includes poorly characterised proteins, such as YrkL and YdeQ.
Glutathione-regulated potassium-efflux system ancillary protein KefG
Type:
Family
Description:
KefG is required for full activity of KefB. It is unlikely that KefG has oxidoreductase activity; it has probably evolved from its function as an oxidoreductase to be a regulator of KefB. KefG belongs to the NAD(P)H dehydrogenase (quinone) family (KefG subfamily).
Glutathione-regulated potassium-efflux system ancillary protein KefF
Type:
Family
Description:
KefF is a stabilising subunit of the KefC K(+) efflux system. It is required for full activity of KefC. KefF has oxidoreductase activity, and probably evolved from its function as an oxidoreductase to be a regulator of KefC. KefF belongs to the NAD(P)H dehydrogenase (quinone) family.
This entry represents the accessory protein 4b, ORF4b (also called NS3c protein) of Pipistrellus bat coronavirus HKU5. ORF4b/NS3c plays a role in the inhibition of host innate immunity by inhibiting the interaction between host IkappaB kinase epsilon (IKBKE or IKKE) and mitochondrial antiviral-signalling protein (MAVS). In turn, this inhibition prevents the production of host interferon beta. Additionally, it may also interfere with host antiviral response within the nucleus. ORF4b/NS3c proteins in this subgroup are similar to the MERS-CoV ORF4b (also known as MERS-CoV 4b) which has been shown to interfere with the NF-kappaB-dependent innate immune response during infection, as well as antagonizing the early antiviral alpha/beta interferon (IFN-alpha/beta) response, which may significantly contribute to MERS-CoV pathogenesis.
The tripartite ATP-independent periplasmic (TRAP) transporters are substrate-binding protein (SBP)-dependent secondary transporters ubiquitous in prokaryotes but absent from eukaryotes. They are composed of an SBP of the DctP or TAXI families and two integral membrane proteins of unequal sizes that form the DctQ and DctM protein families (the small and large membrane components, respectively). The TRAP transporter for sialic acid consists of the SBP SiaP and the fused integral membrane protein encoded by siaQM (termed siaT in some cases). This family consists of DctQ homologues found in TRAP transporters.