This family of bacterial proteins is functionally uncharacterised. Members of this family contain the conserved [S/T]GA[S/T] motif and are ApbE substrates, which suggests that they may be involved in extracytosolic redox activities.
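As a rough illustration, the [S/T]GA[S/T] motif can be read as a regular expression: Ser or Thr, then Gly, then Ala, then Ser or Thr. The sketch below is not the method used to build this family; the function name `find_stga_motif` and the example peptide are hypothetical.

```python
import re

# [S/T]GA[S/T] written as a regex character-class pattern.
# The lookahead lets overlapping hits (e.g. "SGASGAT") be reported separately.
STGA_MOTIF = re.compile(r"(?=[ST]GA[ST])")


def find_stga_motif(sequence):
    """Return 0-based start positions of every [S/T]GA[S/T] occurrence.

    `sequence` is assumed to be a one-letter amino acid string.
    """
    return [match.start() for match in STGA_MOTIF.finditer(sequence.upper())]
```

A hit only shows that the four-residue pattern is present; it says nothing about whether the protein is actually an ApbE substrate.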
This entry represents a family of phage (and bacteriocin) proteins related to the phage P2 V gene product (GpV), which forms the small spikes on the baseplate that bind to the host cell and penetrate through the host membrane. This entry also includes Gp45, which is a component of the baseplate that forms a central needle-like spike used to puncture the host cell membrane for tube insertion during virus entry. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches.
The DND (DNA degradation) system produces a phosphorothioation modification to DNA, replacing a non-bridging oxygen of a phosphate group with sulfur. The modification causes DNA degradation during electrophoresis in Tris buffer. This protein, like DndB, contains a DGQHR domain, which also occurs in several contexts that suggest lateral transfer rather than DNA phosphorothioation-dependent restriction.
This is a family of bacterial proteins likely to be necessary for binding to DNA and recognising the modification sites. Members are found in bacteria, archaea and on viral plasmids, and are typically between 354 and 474 amino acids in length. There is a conserved DGQHR sequence motif. Included in this family is the DndB protein encoded by an operon associated with a sulphur-containing modification to DNA. DndB is described as a putative ATPase.
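The two properties given above (a typical length of 354 to 474 residues and a conserved DGQHR motif) can be combined into a crude pre-filter. This is only a sketch: the function name `looks_like_dndb_family` and its defaults are hypothetical, and because the length range is described as typical rather than strict, a real family assignment would rely on a profile search rather than this check.

```python
def looks_like_dndb_family(sequence, min_len=354, max_len=474, motif="DGQHR"):
    """Crude screen: length within the typical range and motif present.

    `sequence` is assumed to be a one-letter amino acid string.
    The defaults mirror the numbers quoted in the entry text.
    """
    seq = sequence.upper()
    return min_len <= len(seq) <= max_len and motif in seq
```

Passing this screen is neither necessary nor sufficient for membership; it just narrows a candidate list before a proper search.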
AvrL567 is a protein from the fungal pathogen Melampsora lini which induces plant disease resistance in flax plants. Avirulence proteins trigger the resistance response in plants by interacting with plant disease-resistance proteins. The protein has a novel β-barrel-like fold.
Periplasmic metal-binding protein Tp34-type superfamily
Type:
Homologous_superfamily
Description:
This entry represents metal-binding periplasmic proteins. Tp34 has been classified together with other proteins in a group of uncharacterised proteins probably involved in high-affinity Fe2+ transport. However, the structural and functional aspects of this group of proteins remain undetermined. Tp34 may bind zinc and the iron-sequestering lactoferrin and may have a role in metal ion homeostasis.
Members of this protein family are found exclusively in CRISPR-associated (cas) type I system gene clusters of the Dpsyc subtype. Markers for that type include a variant form of cas3 and the GSU0054-like protein family. This family occurs in fewer than half of known Dpsyc clusters.
This superfamily represents the C-terminal soluble domain characterised in MmpS4 from Mycobacterium tuberculosis, but is also present in other transport accessory proteins, including MmpS1-5 from Mycobacterium tuberculosis. These proteins are part of an export system required for biosynthesis and secretion of siderophores, and are essential for virulence of Mycobacterium tuberculosis. MmpS4 possesses an N-terminal transmembrane (TM) helix and a C-terminal soluble domain.
Fcf1 (also known as Utp24) is an essential protein involved in pre-rRNA processing and 40S ribosomal subunit assembly. As a component of the small subunit (SSU) processome, Fcf1 is an essential nucleolar protein required for processing of the 18S pre-rRNA at sites A0-A2. The PIN (PilT N terminus) domain is a compact RNA-binding protein domain found in all domains of life. Typically, the PIN domain contains three or four conserved acidic residues (putative metal-binding, active site residues). The Fcf1 PIN domain has four of these putative active site residues and the Fcf1-Utp23 homologue PIN domain has three of them. Point mutation studies of the conserved acidic residues in the putative active site of Saccharomyces cerevisiae Fcf1 determined they were essential for pre-rRNA processing at sites A1 and A2, whereas the presence of the Fcf1 protein itself is also required for cleavage at site A0.
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry describes the Cas1 variant of the NMENI subtype of CRISPR/Cas system.
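The repeat-spacer layout described above (direct repeats separated by regularly sized, non-repetitive spacers) can be illustrated with a toy parser. This is a minimal sketch assuming the repeat sequence is already known and matches exactly; the function name `extract_spacers` and the short sequences in the usage note are made up, and real CRISPR detection has to tolerate degenerate repeats.

```python
def extract_spacers(array_sequence, repeat):
    """Return the spacers lying between consecutive exact copies of `repeat`.

    Assumes `array_sequence` is a DNA string and `repeat` occurs verbatim.
    Flanking sequence before the first repeat and after the last one is dropped.
    """
    pieces = array_sequence.upper().split(repeat.upper())
    # pieces[0] and pieces[-1] are the flanks; everything in between is a spacer.
    return [piece for piece in pieces[1:-1] if piece]
```

For example, with the hypothetical repeat `GTTCC`, the string `AAAGTTCCACGTACGTGTTCCTTGACCAAGTTCCCC` yields the two spacers `ACGTACGT` and `TTGACCAA`, both eight bases long, matching the "regularly sized" description.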
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). In addition, there are many protein families known as CRISPR-associated sequences (Cas), which are encoded in the vicinity of CRISPR loci. CRISPR/cas gene regions can be quite large, with up to 20 different, tandem-arranged cas genes next to a CRISPR cluster or filling the region between two repeat clusters. Cas genes and CRISPRs are found on mobile genetic elements such as plasmids, and have undergone extensive horizontal transfer. Cas proteins are thought to be involved in the propagation and functioning of CRISPRs. Some Cas proteins show similarity to helicases and repair proteins, although the functions of most are unknown. Cas families can be divided into subtypes according to operon organisation and phylogeny. This entry describes the Cas1 protein particular to the DVULG subtype of CRISPR/Cas system. This is also known as Cas1 Type I-C.
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry describes the Cas1 protein particular to the YPEST subtype of CRISPR/Cas system. This is also known as Cas1 Type I-F.
CRISPR-associated protein Cas1, HMARI/TNEAP subtype
Type:
Family
Description:
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry describes a Cas1 subgroup that includes Cas1 proteins of the related HMARI and TNEAP subtypes of CRISPR/Cas system.
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry represents the MYXAN subtype of the Cas1 endonuclease. Species with this type of CRISPR system include Myxococcus xanthus, Cyanothece sp., Leptospira interrogans, Sorangium cellulosum, and Anabaena variabilis ATCC 29413. The entry also detects Cas4/Cas1 fusion proteins.
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA. Cas1 is the most universal CRISPR system protein and is thought to be involved in spacer integration. Cas1 is a metal-dependent deoxyribonuclease, also binds RNA, and has been shown to possess a unique fold consisting of an N-terminal β-strand domain and a C-terminal α-helical domain. This entry represents CRISPR-Cas subtype I-E. In E. coli, the CRISPR-Cas I-E system consists of up to three CRISPR-spacer arrays (i.e., CRISPR2.1, CRISPR2.2 and CRISPR2.3) made of type 2 repeats.
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry describes the Cas1 protein particular to the ECOLI subtype of CRISPR/Cas system.
Ribonuclease P (Rnp) is a ubiquitous ribozyme that catalyzes a Mg2+-dependent hydrolysis to remove the 5'-leader sequence of precursor tRNA (pre-tRNA) in all three domains of life. In bacteria, the catalytic RNA (typically ~120 kDa) is aided by a small protein cofactor (~14 kDa). Archaeal and eukaryotic RNase P each contain a single RNA; archaeal RNase P has four or five protein subunits, while eukaryotic RNase P has 9 or 10. Eukaryotic and archaeal RNase P RNAs function cooperatively with protein subunits in catalysis. This entry represents RNP1 from archaea and its homologues from eukaryotes.
Armadillo-type fold containing protein ARMC8/Vid28
Type:
Family
Description:
Proteins in this family, such as ARMC8 from humans and Vid28 from budding yeasts, contain the armadillo-type fold. Vid28 is a subunit of the GID Complex, a multisubunit ubiquitin ligase, involved in catabolite-induced degradation of gluconeogenic enzymes. ARMC8 plays an important role in regulating cell migration and proliferation, but its exact function is not clear.
Nucleolar and spindle-associated protein 1 (NuSAP) is a microtubule-associated protein with the capacity to bundle and stabilise microtubules. When overexpressed, it causes profound cytoplasmic microtubule bundling in interphase cells. It localises to the spindle during mitosis. NuSAP depletion by RNA interference results in defects in spindle midzone formation and aberrant anaphase and cytokinesis. NuSAP may be regulated by phosphorylation during mitosis. NuSAP is indispensable for mitosis and may play an important role in cancer progression and aggressiveness.
This AAA domain is found in a group of uncharacterised proteins, including YhaN from Bacillus subtilis and DNA double-strand break repair Rad50 ATPase from Archaeoglobus fulgidus. AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which exert their activity through the energy-dependent unfolding of macromolecules.
The plsX gene is part of the bacterial fab gene cluster, which encodes several key fatty acid biosynthetic enzymes. The plsX gene encodes a poorly understood enzyme of phospholipid metabolism.
Equatorial segment protein (ESP) has been localised to the equatorial segment of the acrosome in sperm. It defines a discrete acrosomal subcompartment that persists throughout acrosomal biogenesis and is required for fully fertile sperm.
This domain is found in the eukaryotic 60S ribosomal proteins P1 and P2, as well as in archaebacterial 50S ribosomal protein L12. It is involved in dimerization. These proteins play an important role in the elongation step of protein synthesis.
This entry represents a family of proteins from cellular organisms. Structural analysis suggests that members of this group are likely to have a heavy-metal binding domain. The protein oligomerises as a pentamer.
CRISPR-associated protein Cas7/Cst2/DevR, subtype I-a/Apern
Type:
Family
Description:
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry represents the Csa2 (CRISPR/Cas subtype protein 2) family of proteins, which includes MJ0381 from Methanocaldococcus jannaschii (Methanococcus jannaschii). This archaeal clade is a member of the DevR family, which includes the DevR protein of Myxococcus xanthus, a protein whose expression appears to be regulated through a number of means, including both location and autorepression. DevR is a key regulator of development, and mutants in DevR are incapable of fruiting body development.
Cbl (Casitas B-lineage lymphoma) is an adaptor protein that functions as a negative regulator of many signalling pathways that start from receptors at the cell surface. The N-terminal region of Cbl contains a Cbl-type phosphotyrosine-binding (Cbl-PTB) domain, which is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle (4H) domain, an EF hand-like calcium-binding domain, and a divergent SH2-like domain. The calcium-bound EF-hand wedges between the 4H and SH2 domains, and roughly determines their relative orientation. The Cbl-PTB domain has also been named Cbl N-terminal (Cbl-N) or tyrosine kinase binding (TKB) domain. The N-terminal 4H domain contains four long α-helices. The C and D helices in this domain pack against the adjacent EF-hand-like domain, and a highly conserved loop connecting the A and B helices contacts the SH2-like domain. The EF-hand motif is similar to classical EF-hand proteins. The SH2-like domain retains the general helix-sheet-helix architecture of the SH2 fold, but lacks the secondary β-sheet, comprising β-strands D', E and F, and also a prominent BG loop. This entry represents the EF hand-like domain.
Cbl (Casitas B-lineage lymphoma) is an adaptor protein that functions as a negative regulator of many signalling pathways that start from receptors at the cell surface. The N-terminal region of Cbl contains a Cbl-type phosphotyrosine-binding (Cbl-PTB) domain, which is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle (4H) domain, an EF hand-like calcium-binding domain, and a divergent SH2-like domain. The calcium-bound EF-hand wedges between the 4H and SH2 domains, and roughly determines their relative orientation. The Cbl-PTB domain has also been named Cbl N-terminal (Cbl-N) or tyrosine kinase binding (TKB) domain. The N-terminal 4H domain contains four long α-helices. The C and D helices in this domain pack against the adjacent EF-hand-like domain, and a highly conserved loop connecting the A and B helices contacts the SH2-like domain. The EF-hand motif is similar to classical EF-hand proteins. The SH2-like domain retains the general helix-sheet-helix architecture of the SH2 fold, but lacks the secondary β-sheet, comprising β-strands D', E and F, and also a prominent BG loop. This entry represents the SH2-like domain.
This entry represents the conjugal transfer protein TraQ found in Bacteroides species. The Bacteroides thetaiotaomicron gene coding for this protein is located in a conjugative transposon and appears to be upregulated in the presence of host or other bacterial species compared to growth in pure culture.
CFAP91 (also known as AMY-1-associating protein expressed in testis 1 or AAT-1) is a component of a spoke-associated complex and regulates flagellar dynein activity by mediating regulatory signals between the radial spokes and dynein arms. It binds to AMY1 and may play a role in spermatogenesis.
Copper resistance protein CopC/internalin, immunoglobulin-like
Type:
Homologous_superfamily
Description:
This superfamily represents an immunoglobulin E-set-like β-barrel domain found in the following proteins:
- Copper-resistance proteins CopC and PcoC. CopC is a bacterial copper protein involved in copper homeostasis that binds 1 equivalent of copper(II). Its immunoglobulin-like fold is similar to that of the unrelated blue copper proteins. PcoC is a plasmid-encoded, soluble, periplasmic protein that is essential for copper resistance in Escherichia coli.
- Internalin proteins InlA, InlB, InlI and InlJ. These proteins are members of a family of listerial cell surface proteins from the opportunistic pathogen Listeria monocytogenes. Their N-terminal regions consist of a central LRR region flanked by an EF-hand-like cap on one end, and an immunoglobulin-like fold on the other. Together these regions form a domain that directs host cell-specific invasion.
- Pullulanase A, involved in the degradation of glycogen of the mammalian host cells.
The orbivirus VP5 protein is one of the two proteins (with VP2) which make up the virus particle outer capsid. Cryoelectron microscopy indicates that VP5 is a trimer, suggesting that there are 360 copies of VP5 per virion.
Plasma membrane fusion protein PRM1 is a fungal protein involved in cell fusion during mating by stabilising the plasma membrane fusion event.
Calcium/calmodulin-dependent protein kinase II inhibitor
Type:
Family
Description:
This family includes calcium/calmodulin-dependent protein kinase II inhibitor 1 and 2 (CAMK2N1 and CAMK2N2). They are potent and specific inhibitors of CaM-kinase II (CAMK2).
This family represents a putative integral membrane protein that is likely to be the membrane component of an ABC transport system. This family is found in bacteria and archaea.
Spindle and kinetochore-associated protein 2 (Ska2) is a component of the SKA1 complex (consisting of Ska1, Ska2, and Ska3/Rama1), a microtubule-binding subcomplex of the outer kinetochore that is essential for proper chromosome segregation. It is required for timely anaphase onset during mitosis, when chromosomes undergo bipolar attachment on spindle microtubules leading to silencing of the spindle checkpoint. The SKA1 complex is a direct component of the kinetochore-microtubule interface and directly associates with microtubules as oligomeric assemblies. The complex facilitates the processive movement of microspheres along a microtubule in a depolymerisation-coupled manner. Within the complex, Ska2 is required for SKA1 localisation.
Disulphide bonds contribute to folding, maturation, stability, and regulation of proteins, in particular those localized outside the cytosol. Oxidation of selected pairs of cysteines to disulphide in vivo requires cellular factors present in the bacterial periplasmic space or in the endoplasmic reticulum of eukaryotic cells. This family represents disulphide bond formation protein C (BdbC) from Bacillus subtilis, which functionally corresponds to the well-characterised E. coli DsbB.
Disulphide bonds contribute to folding, maturation, stability, and regulation of proteins, in particular those localized outside the cytosol. Oxidation of selected pairs of cysteines to disulphide in vivo requires cellular factors present in the bacterial periplasmic space or in the endoplasmic reticulum of eukaryotic cells. The disulphide bond formation protein B (DsbB) is a component of the pathway that leads to disulphide bond formation in periplasmic proteins of Escherichia coli and other bacteria. The DsbB protein oxidises the periplasmic protein DsbA, which in turn oxidises cysteines in other periplasmic proteins in order to make disulphide bonds. DsbB acts as a redox potential transducer across the cytoplasmic membrane. It is a membrane protein which spans the membrane four times, with both the N- and C-termini of the protein in the cytoplasm. Each of the periplasmic domains of the protein has two essential cysteines. The two cysteines in the first periplasmic domain are in a Cys-X-Y-Cys configuration that is characteristic of the active site of other proteins involved in disulphide bond formation, including DsbA and protein disulphide isomerase. This entry also includes disulphide bond formation protein BdbC from Bacillus subtilis, which functionally corresponds to the well-characterised E. coli DsbB.
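The Cys-X-Y-Cys configuration described above is two cysteines separated by any two residues, which can be written as the pattern `C..C`. The following is a minimal sketch for locating such pairs in a sequence; the function name `find_cxxc` and the test peptide are hypothetical, and a match alone does not show that the cysteines are redox-active or periplasmic.

```python
import re

# Two cysteines separated by exactly two arbitrary residues.
# The lookahead with a capture group reports overlapping hits as well.
CXXC = re.compile(r"(?=(C..C))")


def find_cxxc(sequence):
    """Return (0-based start, matched tetrapeptide) for each Cys-X-Y-Cys hit.

    `sequence` is assumed to be a single-line, one-letter amino acid string.
    """
    return [(m.start(), m.group(1)) for m in CXXC.finditer(sequence.upper())]
```

Capturing the matched tetrapeptide keeps the two intervening residues visible, which is what distinguishes one such site from another.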
Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process.
Protein kinases fall into three broad classes, characterised with respect to substrate specificity:
- Serine/threonine-protein kinases
- Tyrosine-protein kinases
- Dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins)
Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.
In the absence of cAMP, protein kinase A (PKA) exists as an equimolar tetramer of regulatory (R) and catalytic (C) subunits. In addition to its role as an inhibitor of the C subunit, the R subunit anchors the holoenzyme to specific intracellular locations and prevents the C subunit from entering the nucleus. Typical R subunits have a conserved domain structure, consisting of the N-terminal dimerisation domain, inhibitory region, cAMP-binding domain A and cAMP-binding domain B. R subunits interact with C subunits primarily through the inhibitory site. The cAMP-binding domains show extensive sequence similarity and bind cAMP cooperatively.
On the basis of phylogenetic trees generated from multiple sequence alignment of complete sequences, this family was divided into four sub-families, types I to IV. Types I and II, found in animals, differ in molecular weight, sequence, autophosphorylation capability, cellular location and tissue distribution. Types I and II are further sub-divided into alpha and beta subtypes, based mainly on sequence similarity. Type III are from fungi and type IV are from alveolates.
Cbl (Casitas B-lineage lymphoma) is an adaptor protein that functions as a negative regulator of many signalling pathways that start from receptors at the cell surface. The N-terminal region of Cbl contains a Cbl-type phosphotyrosine-binding (Cbl-PTB) domain, which is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle (4H) domain, an EF hand-like calcium-binding domain, and a divergent SH2-like domain. The calcium-bound EF-hand wedges between the 4H and SH2 domains, and roughly determines their relative orientation. The Cbl-PTB domain has also been named Cbl N-terminal (Cbl-N) or tyrosine kinase binding (TKB) domain. The N-terminal 4H domain contains four long α-helices. The C and D helices in this domain pack against the adjacent EF-hand-like domain, and a highly conserved loop connecting the A and B helices contacts the SH2-like domain. The EF-hand motif is similar to classical EF-hand proteins. The SH2-like domain retains the general helix-sheet-helix architecture of the SH2 fold, but lacks the secondary β-sheet, comprising β-strands D', E and F, and also a prominent BG loop. This entry represents the Cbl-PTB domain.
Drosophila Hemingway (Hmw) is required for motile cilia function and sperm flagella assembly [
]. The human orthologue, known as cilia- and flagella-associated protein 97 (CFAP97) or KIAA1430, localizes to the primary cilium [].
Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process.
Protein kinases fall into three broad classes, characterised with respect to substrate specificity []:
Serine/threonine-protein kinases
Tyrosine-protein kinases
Dual specificity protein kinases (e.g. MEK - phosphorylates both Thr and Tyr on target proteins)
Protein kinase function is evolutionarily conserved from Escherichia coli to human [
]. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation [
]. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [], leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases []. MAP (Mitogen Activated Protein) kinases participate in kinase cascades, whereby at least 3 protein kinases act in series, culminating in activation of MAP kinase []. MAP kinases are activated by dual phosphorylation on both tyrosine and threonine residues of a conserved TXY motif.
p38 proteins belong to the MAP kinase family and were discovered in 3 different contexts independently: first, as tyrosine phosphoproteins found in extracts of cells treated with inflammatory cytokines; second, as targets of a pyridinyl imidazole drug that blocks production of TNFalpha; and third, as reactivating kinases for MAP kinase-activated protein (MAPKAP) []. The proteins are activated by cytokines, hormones, GPCRs, osmotic shock, heat shock and other stresses [
].
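The TXY dual-phosphorylation motif mentioned above can be located in a sequence with a simple pattern search. The following is a minimal sketch only: the function name `find_txy` and the toy sequence are hypothetical illustrations, and a match does not by itself indicate a functional activation loop.

```python
import re

# T-x-Y: threonine, any single residue, tyrosine.
# The lookahead lets overlapping occurrences be reported as well.
TXY = re.compile(r"(?=(T.Y))")

def find_txy(seq):
    """Return (1-based position, matched motif) for every T-x-Y occurrence."""
    return [(m.start() + 1, m.group(1)) for m in TXY.finditer(seq.upper())]

# Hypothetical toy fragment, not taken from any real kinase entry.
toy_fragment = "DDEMTGYVATRW"
print(find_txy(toy_fragment))
```

A plain regular expression is used here because the motif is short and fixed in length; profile-based methods would be needed to judge whether a hit lies in a genuine kinase activation loop.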
This entry represents a structural domain found in the cell division protein ZapA, as well as in related proteins. This domain has a core structure consisting of two layers alpha/beta, and has a long C-terminal helix that forms dimeric parallel and tetrameric antiparallel coiled coils []. ZapA interacts with FtsZ, where FtsZ is part of a mid-cell cytokinetic structure termed the Z-ring that recruits a hierarchy of fission related proteins early in the bacterial cell cycle. ZapA drives the polymerisation and filament bundling of FtsZ, thereby contributing to the spatio-temporal tuning of the Z-ring. The ZapA structure has a beta(2)-alpha(2) fold with two layers (alpha/beta). It has a long C-terminal helix which forms dimeric parallel and tetrameric antiparallel coiled coils.
Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [,
]. The PTP superfamily can be divided into four subfamilies []:
(1) pTyr-specific phosphatases
(2) dual specificity phosphatases (dTyr and dSer/dThr)
(3) Cdc25 phosphatases (dTyr and/or dThr)
(4) LMW (low molecular weight) phosphatases
Based on their cellular localisation, PTPases are also classified as:
Receptor-like, which are transmembrane receptors that contain PTPase domains [
]
Non-receptor (intracellular) PTPases [
]
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [
]. Functional diversity between PTPases is endowed by regulatory domains and subunits. This entry represents the low molecular weight (LMW) protein-tyrosine phosphatases (or acid phosphatase), which act on tyrosine phosphorylated proteins, low-MW aryl phosphates and natural and synthetic acyl phosphates [
,
]. The structure of an LMW PTPase has been solved by X-ray crystallography [] and is found to form a single structural domain. It belongs to the alpha/beta class, with 6 α-helices and 4 β-strands forming a 3-layer α-β-α sandwich architecture.
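The C(X)5R signature motif described in this entry (cysteine, any five residues, arginine) translates directly into a pattern search. The sketch below is illustrative only: `find_ptp_signature` and the toy sequence are hypothetical names introduced here, and a match is a necessary but not sufficient indication of a PTPase active site.

```python
import re

# C(X)5R: cysteine, exactly five residues of any kind, then arginine.
# The lookahead reports overlapping candidates too.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")

def find_ptp_signature(seq):
    """Return (1-based position, matched motif) for every C(X)5R occurrence."""
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq.upper())]

# Hypothetical toy sequence used only to exercise the pattern.
toy_sequence = "MKVLFVCLGNICRSPIAE"
print(find_ptp_signature(toy_sequence))
```

Because the motif is so short, hits are expected by chance in unrelated proteins, so such a scan is best treated as a first filter before structural or profile-based confirmation.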
Arenaviridae produce four gene products: RNA-directed RNA polymerase, Z (zinc finger) protein, nucleocapsid protein (NP), and envelope glycoprotein precursor (which gives rise to mature proteins GP1 and GP2). The smallest protein is the Z protein (also called p11), which has a molecular size of 11 kDa. The Z protein has a zinc-binding RING-finger motif. It has been suggested that the Z-protein is a structural protein and is a component of the nucleocapsid. The arenavirus RING-finger Z protein is the main driving force of arenavirus budding, and myristoylation of its N terminus plays a key role [
]. Z proteins also possess, near their C-termini, PPxY and/or P(T/S)AP motifs. These same motifs are found in the M proteins of Ebola (and other negative stranded RNA) viruses, as well as in Gag proteins of retroviruses. They are called "late budding domains" or "L domains" (despite their short length), and are essential for viral budding. Arenaviruses do not possess the M (or matrix) protein, but the Z protein plays the same important role [
]. The Z protein is hydrophobic and is associated with the nucleocapsid of the virion core [
]. It has been shown to interact with several cellular proteins, including the promyelocytic leukemia protein and the eukaryote translation initiation factor 4E. The former has been proposed to contribute to the non-cytolytic nature of LCMV infection, whereas the latter has been proposed to repress cap-dependent translation. Early studies suggested a role of Z in arenavirus transcriptional regulation. However, it has been shown that Z is not required for virus RNA replication and transcription; rather, it exhibits a dose-dependent inhibitory effect on RNA synthesis mediated by the arenavirus polymerase.
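The PPxY and P(T/S)AP late-domain motifs named in this entry can be expressed as two short patterns. This is a minimal sketch under stated assumptions: the function `find_late_domains`, the dictionary name and the toy sequence are hypothetical, and the search does not restrict hits to the C-terminal region.

```python
import re

# Patterns for the two late-domain motifs; x is any residue.
LATE_DOMAIN_MOTIFS = {
    "PPxY": re.compile(r"(?=(PP.Y))"),
    "P(T/S)AP": re.compile(r"(?=(P[TS]AP))"),
}

def find_late_domains(seq):
    """Return (motif name, 1-based position, matched text), sorted by position."""
    hits = []
    for name, pattern in LATE_DOMAIN_MOTIFS.items():
        for m in pattern.finditer(seq.upper()):
            hits.append((name, m.start() + 1, m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical toy C-terminal fragment, not a real Z protein sequence.
toy_tail = "GSRPTAPEELPPPYQ"
for hit in find_late_domains(toy_tail):
    print(hit)
```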
This family consists of various hypothetical proteins from cyanobacteria, none of which are functionally described. It includes a small protein of unknown function, PDB:3fcn, which has a novel all-alpha fold. The family has several highly conserved sequence motifs, including YD/ExD, DxxNVxEEIE, and CPY/F/W, as well as conserved tryptophans.
This entry includes fission yeast ribonucleases P/MRP protein subunit Pop7 and its homologue, Rpp20, from animals. Pop7/Rpp20 is a component of ribonuclease P, a protein complex that generates mature tRNA molecules by cleaving their 5'-ends. It is also a component of the RNase MRP complex, which cleaves pre-rRNA sequences [
,
].
Synaptotagmin-like proteins (Slps) contain an N-terminal RabBD (Rab-binding) domain and two C-terminal C2 domains, C2A and C2B [
]. The characteristic feature of the Slp family is the N-terminal domain (referred to as SHD for Slp Homology Domain), which is not found in other C-type tandem C2 proteins []. SHD functions as a specific Rab27A/B-binding domain []. The C2B domain of Slp4 (also known as Granuphilin) interacts with the plasma membrane lipid phosphatidylinositol-(4,5)-bisphosphate [PI(4,5)P2] [
]. C2 domains fold into an 8-stranded β-sandwich that can adopt 2 structural arrangements, type I and type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions [
,
,
,
,
,
,
,
].
Protein kinase C (PKC) is a member of a family of Ser/Thr phosphotransferases that are involved in many cellular signaling pathways. Fungi have only one or two PKCs in contrast to mammals, which have at least 9 [
]. Saccharomyces cerevisiae contains a single PKC isozyme, Pkc1p, which contains all of the regulatory motifs found in mammalian PKCs []. In addition to their main function in maintaining cell integrity, fungal PKCs have been implicated in the regulation of diverse processes such as the organization of the actin cytoskeleton, autophagy and apoptosis, cell cycle control, cytokinesis and genetic stability [,
]. PKC has two antiparallel coiled-coil regions (ACC finger domain) known as HR1 (PKC homology region 1/ Rho binding domain) upstream of the C2 domain and two C1 domains downstream. The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded β-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains, like those of PKC, are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions [
,
,
,
,
]. This entry represents the C2 domain of fungal PKC-like proteins.
Saccharomyces cerevisiae Inn1 associates with the contractile actomyosin ring at the end of mitosis and is needed for cytokinesis [
]. The C2 domain of Inn1, located at the N terminus, is required for ingression of the plasma membrane. The C terminus is relatively unstructured and contains eight PXXP motifs that are thought to mediate interaction of Inn1 with SH3 domains in the cytokinesis proteins Hof1 (an F-BAR protein) and Cyk3 (whose overexpression can restore primary septum formation in Inn1Delta cells) as well as recruiting Inn1 to the bud-neck by binding to Cyk3 [,
]. Inn1 and Cyk3 appear to cooperate in activating chitin synthase Chs2 for primary septum formation, which allows coordination of actomyosin ring contraction with ingression of the cleavage furrow []. It is thought that the C2 domain of Inn1 helps to preserve the link between the actomyosin ring and the plasma membrane, contributing both to membrane ingression and to stability of the contracting ring. Additionally, Inn1 might induce curvature of the plasma membrane adjacent to the contracting ring, thereby promoting ingression of the membrane []. S. pombe Inn1 is also involved in the ingression of the plasma membrane during cytokinesis. However, it does not play an essential role, probably because the actomyosin ring is connected to the cell cortex by many more proteins [
].
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. This entry represents 50S ribosomal protein L32 from bacteria. It also includes 60S ribosomal protein L32 from the protozoan Reclinomonas americana.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [
,
]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [
,
]. This entry includes 50S ribosomal protein L32-1 from plants and cyanobacteria.
Patatin-like phospholipase domain-containing protein 2
Type:
Domain
Description:
Patatin-like phospholipase domain-containing protein 2 (PNPLA2) plays a key role in hydrolysis of stored triacylglycerols and is also known as adipose triglyceride lipase (ATGL) [
]. Members of this family share a patatin domain, initially discovered in potato tubers [,
]. ATGL is expressed in white and brown adipose tissue in high mRNA levels. Mutations in PNPLA2 encoding adipose triglyceride lipase (ATGL) lead to neutral lipid storage disease (NLSD), which is characterized by the accumulation of triglycerides in multiple tissues []. ATGL mutations are also commonly associated with severe forms of skeletal and cardiac myopathy. ATGL is regulated by insulin []. PNPLA2/ATGL is also known as TTS-2.2 (transport-secretion protein 2.2), iPLA2-zeta (calcium-independent phospholipase A2), and desnutrin [,
,
].
Flagellum-associated coiled-coil domain-containing protein 1
Type:
Family
Description:
Juvenile amyotrophic lateral sclerosis (ALS) is a form of chronic motor neuron disease characterised by combined upper and lower motor neuron symptoms. Amyotrophic lateral sclerosis 2 (ALS2) is an autosomal recessive form of juvenile ALS and has been mapped to human chromosome 2q33 [
]. Amyotrophic lateral sclerosis 2 chromosomal region candidate gene 12 protein (ALS2CR12), also known as FLACC1, is a putative GTPase regulator, and its mutation is linked to familial amyotrophic lateral sclerosis 2 [].
Members of this group are predicted to be metal-dependent hydrolases based on sequence analysis. They are related to Mg-dependent DNases and contain a TatD DNase domain. However, the similarity is not strong enough to confidently predict that these proteins are necessarily DNases and not some other type of metal-dependent hydrolase. Another related group is the TatD deoxyribonuclease family. Members of this group may be distantly related to a large 3D fold-based domain superfamily of metalloenzymes [
]. The description of this fold superfamily was based on an analysis of conservation patterns in three dimensions, and the discovery that the same active-site architecture occurs in a large set of enzymes involved primarily in nucleotide metabolism. The group is thought to include urease, dihydroorotase, allantoinase, hydantoinase, AMP-, adenine- and cytosine- deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolase, formylmethanofuran dehydrogenase, and other enzymes [].
Conserved hypothetical protein CHP04046, FMN-dependent
Type:
Family
Description:
Members of this protein family belong to a conserved seven-gene biosynthetic cluster found sparsely in Cyanobacteria, Proteobacteria, and Actinobacteria. Distant homologies to characterised proteins suggest that members are enzymes dependent on a flavinoid cofactor.
Many bacteria are covered in a layer of surface-associated polysaccharide called the capsule. These capsules can be divided into four groups depending upon the organisation of genes responsible for capsule assembly, the assembly pathway and regulation [
]. This superfamily plays a role in group 1 capsule biosynthesis. It is likely to be involved in the later stages of capsule assembly. Structurally, Wzi consists of an 18-stranded β-barrel with a periplasmic helical bundle that has a role in the recognition of the capsular polysaccharide. It is predicted that the Wzi-polysaccharide interaction is critical in the initialisation step of the functional capsule biosynthesis and the later steps of the capsule assembly [
,
].
In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure [
]. In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and base pair rearrangements of Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB.
Uncharacterised protein family, CYTH/CHAD/HD-like domain-containing
Type:
Family
Description:
This group includes uncharacterised proteins from Methanosarcina spp. with CYTH and CHAD domains, and a central HD-like domain. CYTH domain proteins may play a central role in the interface between nucleotide and polyphosphate metabolism [
]. Based on the conservation of catalytic residues, CYTH domains are likely to chelate two divalent cations and exhibit a reaction mechanism that is dependent on two metal ions, analogous to nucleotide cyclases, polymerases, and certain phosphoesterases []. It has also been suggested that the experimentally characterised members of the CYTH domain superfamily, namely, adenylyl cyclase and thiamine triphosphatase, are secondary derivatives of proteins that performed an ancient role in polyphosphate and nucleotide metabolism []. The C-terminal CHAD domain is an α-helical domain that is found fused to the CYTH domain or is encoded by genes occurring in the same operon as those encoding CYTH domains. Therefore, it is predicted to be functionally associated with CYTH adenylate cyclases [
]. While there is no experimental evidence as to the function of the CHAD domain, and no clear functional prediction can be made for it, the conserved histidines and other charged residues could form a strongly polar surface that could either participate in metal chelation, or act as phosphoacceptors []. Apart from these two domains, members of this group have a unique central domain. It is distantly related to the HD-type metal dependent phosphohydrolase domain, but there is no indication as to its function.
Matrix protein (M1) of influenza virus is a bifunctional membrane/RNA-binding protein that mediates the encapsidation of RNA-nucleoprotein cores into the membrane envelope. It is therefore required that M1 binds both membrane and RNA simultaneously [
]. M1 is comprised of two domains connected by a linker sequence. The N-terminal domain has a multi-helical structure that can be divided into two subdomains []. The C-terminal domain also contains an α-helical structure. This entry represents the N-terminal domain of M1.
The flagellar motor switch in Escherichia coli and Salmonella typhimurium regulates the direction of flagellar rotation and hence controls swimming behaviour []. The switch is a complex apparatus that responds to signals transduced by the chemotaxis sensory signalling system during chemotactic behaviour []. CheY, the chemotaxis response regulator, is believed to act directly on the switch to induce tumbles in the swimming pattern, but no physical interactions of CheY and switch proteins have yet been demonstrated.
The switch complex comprises at least three proteins - FliG, FliM and FliN. It has been shown that FliG interacts with FliM, FliM interacts with itself, and FliM interacts with FliN [
]. Several residues within the middle third of FliG appear to be strongly involved in the FliG-FliM interaction, with residues near the N- or C-termini being less important []. Such clustering suggests that FliG-FliM interaction plays a central role in switching.
Analysis of the FliG, FliM and FliN sequences shows that none are especially hydrophobic or appear to be integral membrane proteins []. This result is consistent with other evidence suggesting that the proteins may be peripheral to the membrane, possibly mounted on the basal body M ring [,
]. FliG is present in about 25 copies per flagellum.
PmrA/PmrB and PhoP/PhoQ are a pair of two-component systems (TCSs) that allow the Gram-negative bacteria to survive the cationic antimicrobial peptide polymyxin B. The two TCSs are linked by the polymyxin resistance protein, PmrD [
]. This entry represents a domain found in PmrD. This domain can also be found in anti-adapter protein IraM, a protein which inhibits rpoS proteolysis by regulating rssB activity, thereby increasing the stability of the sigma stress factor rpoS during magnesium starvation []. The Salmonella PmrA/PmrB two-component system is required for resistance to the cationic peptide antibiotic polymyxin B, resistance to Fe(3+)-mediated killing, growth in soil, virulence in mice, and infection of chicken macrophages. PmrA-activated genes encode periplasmic and integral membrane proteins as well as cytoplasmic products mediating the modification of the lipopolysaccharide, suggesting a role for the PmrA/PmrB system in remodeling of the Gram-negative envelope [
]. The PmrA/PmrB two-component system of Salmonella enterica is activated by Fe(3+), which is sensed by the PmrB protein, and by low Mg(2+), which is sensed by the PhoQ protein. The low Mg(2+) activation requires pmrD, a PhoP/PhoQ-activated gene that activates the response regulator PmrA at a posttranscriptional level. However, under conditions that activate the PmrA protein independently of pmrD, such as exposure to Fe(3+), lower levels of pmrD transcription occur. It has been demonstrated that PmrA binds to the pmrD promoter, suppressing transcription. Negative regulation of the PhoP/PhoQ-activated pmrD gene by the PmrA/PmrB system closes a regulatory circuit designed to maintain proper cellular levels of activated PmrA protein, and constitutes a singular example of a multicomponent feedback loop [
,
].
Proline-rich AKT1 substrate 1 protein (AKT1S1, PRAS40) is part of the mammalian target of rapamycin complex 1 (mTORC1, contains MTOR, MLST8, RPTOR, AKT1S1/PRAS40 and DEPTOR), which regulates cell growth and survival in response to nutrient and hormonal signals [
]. Within mTORC1, AKT1S1 negatively regulates mTOR activity in a manner that is dependent on its phosphorylation state and binding to 14-3-3 proteins. AKT1S1 is a substrate for AKT1 phosphorylation, but can also be activated by AKT1-independent mechanisms. It may also play a role in nerve growth factor-mediated neuroprotection [,
].
This entry represents the HMGN family, whose members promote chromatin unfolding, enhance access to nucleosomes, and modulate transcription from chromatin templates. HMGNs are expressed only in vertebrates [
]. The high mobility group (HMG) proteins are the most abundant and ubiquitous nonhistone chromosomal proteins. They bind to DNA and to nucleosomes and are involved in the regulation of DNA-dependent processes such as transcription, replication, recombination, and DNA repair. They can be grouped into three families: HMGB (HMG 1/2), HMGN (HMG 14/17) and HMGA (HMG I/Y). The characteristic domains are: AT-hook for the HMGA family, the HMG Box for the HMGB family, and the nucleosome-binding domain (NBD) for the members of the HMGN family [
].
This is a serine-rich domain found in the docking protein p130(cas) (Crk-associated substrate). The domain folds into a four-helix bundle, which is associated with protein-protein interactions [
].
This is a family of uncharacterised proteins. The structure of one of the members in this family, Rv1873 from Mycobacterium tuberculosis, has been solved and it adopts a mainly α-helical structure [
].
This entry consists of bacterial uncharacterised proteins. The structure of one of the proteins has been solved and it adopts a beta barrel-like structure.
Proteins of this entry include phage tail proteins. They probably include bacterial Ig-like domains related to those of an entry that also includes a number of phage tail invasin proteins.
This entry describes the DndE protein encoded by an operon associated with a sulphur-containing modification to DNA (phosphorothioation) [
]. The operon is sporadically distributed in bacteria, much like some restriction enzyme operons. DndE is part of a protein complex that also includes IscS, DndC, and DndD, involved in phosphorothioation (PT) []. DndE binds to DNA in vitro, with a preference for nicked dsDNA [], but its nicked dsDNA-binding capacity does not seem essential for PT modification [].
Protein G-related albumin-binding (GA) modules occur on the surface of numerous Gram-positive bacterial pathogens. Protein G of group C and G Streptococci interacts with the constant region of IgG and with human serum albumin. The GA module is composed of a left-handed three-helix bundle and is found in a range of bacterial cell surface proteins [
,
]. GA modules may promote bacterial growth and virulence in mammalian hosts by scavenging albumin-bound nutrients and camouflaging the bacteria. Variations in sequence give rise to differences in structure and function between GA modules in different proteins, which could alter pathogenesis and host specificity due to their varied affinities for different species of albumin []. Proteins containing a GA module include PAB from Peptostreptococcus magnus [].
Anti-bacteriophage protein A/HamA, C-terminal domain
Type:
Domain
Description:
The Hachiman antiphage defense system has been described in the Ralstonia solanacearum species complex (RSSC) and in Escherichia coli, and is composed of two genes, hamAB, which encode Anti-bacteriophage protein A (also known as HamA, represented in this entry) and a helicase (HamB/AbpB). These proteins confer temperature-dependent resistance to phages T2, T4, T7 and lambda but not RB32 or RB69 [
,
,
]. This entry represents a domain found at the C terminus of AbpA and in similar proteins predominantly from bacteria.
Transforming growth factor-beta (TGF-beta) forms a family with other growth factors. The receptors for most of the members of this growth factor family are related. These proteins are receptor-type kinases of Ser/Thr type, which have a single transmembrane domain and a specific hydrophilic Cys-rich ligand-binding domain [
,
,
]. The C-terminal part of the extracellular domain is conserved. Some of the receptors of this family contain subclass-specific N-terminal extensions of this homology domain. The type I receptors also possess 7 extracellular residues preceding the cysteine box.
This entry includes a group of bacterial proteins, including EipB from Brucella. EipB is a periplasmic protein that functions as part of a system required for cell envelope homeostasis. It adopts a β-spiral fold, consisting of 14 β-strands and 2 α-helices [
].
HYPK, also called Huntingtin yeast partner K or Huntingtin yeast two-hybrid protein K, is an intrinsically unstructured Huntingtin (HTT)-interacting protein with chaperone-like activity. It is involved in regulating cell growth, cell cycle, unfolded protein response, and cell death [
,
,
]. Proteins matched by this entry contain an N-terminal ubiquitin-associated (UBA) region that shows high sequence similarity with that of eukaryotic nascent polypeptide-associated complex proteins (NAC), which is one of the cytosolic chaperones that contact the nascent polypeptide chains as they emerge from the ribosome and assist in post-translational processes [,
].
Myotubularin-related protein 4 (MTMR4) is a member of the myotubularin (MTM) family. It is the only family member that possesses a FYVE domain (a zinc finger domain) at its C terminus [
]. MTMR4 has dual-specificity phosphatase activity []; some studies have shown that it can dephosphorylate PI3P or PI(3,5)P2, suggesting that MTMR4 is also a lipid phosphatase []. MTMR4 has a unique distribution to endosomes [] and has been shown to function in early and recycling endosomes [,
]. MTMR4 attenuates TGF-beta signalling by dephosphorylating the intracellular signalling mediators R-Smads []. Similarly, it acts as a negative modulator of bone morphogenetic protein (BMP) signalling homeostasis []. The myotubularin family constitutes a large group of conserved proteins, with 14 members in humans consisting of myotubularin (MTM1) and 13 myotubularin-related proteins (MTMR1-MTMR13). Orthologues have been found throughout eukaryotes, but not in bacteria. MTM1 dephosphorylates phosphatidylinositol 3-monophosphate (PI3P) to phosphatidylinositol and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2] to phosphatidylinositol 5-monophosphate (PI5P) [,
]. The substrate phosphoinositides (PIs) are known to regulate traffic within the endosomal-lysosomal pathway []. MTMR1, MTMR2, MTMR3, MTMR4, and MTMR6 have also been shown to utilise PI3P as a substrate, suggesting that this activity is intrinsic to all active family members. On the other hand, six of the MTM family members are catalytically inactive phosphatases. Inactive myotubularin phosphatases contain substitutions in the Cys and Arg residues of the Cys-X5-Arg motif. MTM pseudophosphatases have been found to interact with MTM catalytic phosphatases []. The myotubularin family includes several members mutated in neuromuscular diseases or associated with metabolic syndrome, obesity, and cancer []. This entry represents the C-terminal FYVE domain of MTMR4.
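As a rough illustration of the Cys-X5-Arg motif mentioned above, the sketch below scans a protein sequence with a regular expression. The function name and both toy sequences are hypothetical and are not taken from any real protein record. In practice, motif annotation relies on curated profiles rather than a bare pattern, so this only shows how a Cys or Arg substitution makes the pattern disappear.

```python
import re

# Cys, then any five residues, then Arg (the Cys-X5-Arg pattern).
# A lookahead wrapper lets overlapping occurrences be reported as well.
CX5R = re.compile(r"(?=(C.{5}R))")


def find_cx5r(sequence):
    """Return a list of (start, matched_text) for each Cys-X5-Arg occurrence.

    `sequence` is a one-letter amino acid string; positions are 0-based.
    """
    return [(m.start(), m.group(1)) for m in CX5R.finditer(sequence.upper())]


# Hypothetical toy sequences, for illustration only.
toy_with_motif = "MKVHCSDGWDRTGA"      # Cys and Arg both present, five residues apart
toy_substituted = "MKVHSSDGWDKTGA"     # Cys -> Ser and Arg -> Lys substitutions

print(find_cx5r(toy_with_motif))
print(find_cx5r(toy_substituted))
```

The first toy sequence yields one hit, and the substituted one yields none, which mirrors the distinction drawn above between active phosphatases and pseudophosphatases at the level of this motif alone.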
This entry represents the N terminus of the acyl carrier protein:aminoglycoside acyltransferase BtrH, also known as butirosin biosynthesis protein H. BtrH transfers the unique (S)-4-amino-2-hydroxybutyrate (AHBA) side chain, which protects the antibiotic butirosin from several common resistance mechanisms. Butirosin, an aminoglycoside antibiotic produced by Bacillus circulans, exhibits improved antibiotic properties over its parent molecule and retains bactericidal activity toward many aminoglycoside-resistant strains. Butirosin is unique in carrying the AHBA side chain. BtrH transfers the AHBA from the acyl carrier protein BtrI to the parent aminoglycoside ribostamycin as a gamma-glutamylated dipeptide [,
].
Mismatch repair contributes to the overall fidelity of DNA replication by targeting mispaired bases that arise through replication errors, during homologous recombination, and as a result of DNA damage. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex [
]. This entry represents a family of evolutionarily related DNA mismatch repair proteins, including MutL, Mlh1, Mlh2 and Mlh3, and Pms1 and Pms2. Bacterial MutL proteins are homodimers, while their eukaryotic homologues form heterodimers consisting of the MutL homologue Mlh1 and either Pms1, Pms2 or Mlh3 [,
]. MutL homologues share a conserved ATP-binding site [].
Cyclin-dependent protein kinase inhibitor SMR11/SMR16
Type:
Family
Description:
The SIAMESE-RELATED (SMR) family of cyclin-dependent kinase inhibitors regulates the transition from the mitotic cell cycle to endoreplication in plants [
]. The specific cyclin-dependent kinase complexes that are inhibited, and the functions of most SMRs, remain unknown. Seventeen putative SMR genes have been identified in the Arabidopsis genome. The most divergent group consists of SMR11 and SMR16, which are represented by this entry.