Nuclear protein AMMECR1, presently a protein of unknown function, is encoded by one of the genes affected by an X-linked deletion that causes the association of Alport syndrome, midface hypoplasia, intellectual disability and elliptocytosis in humans [
]. This entry represents the C-terminal region of AMMECR1 (approximately from residue 122 to 333), which is well conserved. Homologues appear in species ranging from bacteria and archaea to eukaryotes, including Protein PH0010 from Pyrococcus horikoshii [
]. The high level of conservation of the AMMECR1 domain points to a basic cellular function, potentially in either the transcription, replication, repair or translation machinery [,
]. The AMMECR1 domain contains a 6-amino-acid motif (LRGCIG) that might be functionally important, since it is strikingly conserved throughout evolution []. The AMMECR1 domain consists of two distinct subdomains of different sizes. The large subdomain, which contains both the N- and C-terminal regions, consists of five α-helices and five β-strands. These five β-strands form an antiparallel β-sheet. The small subdomain consists of four α-helices and three β-strands, and these β-strands also form an antiparallel β-sheet. The conserved 'LRGCIG' motif is located at β(2) and its N-terminal loop, and most of the side chains of these residues point toward the interface of the two subdomains. The two subdomains are connected by only two loops, and the interaction between the two subdomains is not strong. Thus, these subdomains may move dynamically when the substrate enters the cleft. The size of the cleft suggests that the substrate is large, e.g., the substrate may be a nucleic acid or protein. However, the inner side of the cleft is not filled with positively charged residues, and therefore it is unlikely that negatively charged nucleic acids such as DNA or RNA interact at this site [].
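Because the LRGCIG motif is described above as strikingly conserved, scanning a candidate sequence for it is a simple first check. The following is a minimal Python sketch of such a scan. The function name `find_motif` and the toy sequence are hypothetical illustrations, not taken from any AMMECR1 record, and an exact-match scan like this would miss homologues that carry substitutions within the motif.

```python
import re

# The conserved six-residue motif named in the text.
MOTIF = "LRGCIG"


def find_motif(sequence, motif=MOTIF):
    """Return the 1-based start positions of every exact occurrence of motif."""
    seq = sequence.upper()
    # A lookahead is used so that overlapping occurrences would also be reported.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [m.start() + 1 for m in pattern.finditer(seq)]


# Hypothetical toy sequence (not a real AMMECR1 sequence).
toy = "MKTAYLRGCIGSLPELRGCIGA"
print(find_motif(toy))
```

Positions are reported 1-based to match the residue numbering convention used in the entry text.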
Rho guanosine triphosphatases (GTPases) are critical regulators of cell motility, polarity, adhesion, cytoskeletal organisation, proliferation, gene expression, and apoptosis. Conversion of these biomolecular switches to the activated GTP-bound state is controlled by two families of guanine nucleotide exchange factors (GEFs). DH-PH proteins are a large group of Rho GEFs comprising a catalytic Dbl homology (DH) domain with an adjacent pleckstrin homology (PH) domain within the context of functionally diverse signalling modules. The evolutionarily distinct and smaller family of DOCK (dedicator of cytokinesis) or CDM (CED-5, DOCK180, Myoblast city) proteins activate either Rac or Cdc42 to control cell migration, morphogenesis, and phagocytosis. DOCK proteins share the DOCK-type C2 domain (also termed the DOCK-homology region (DHR)-1 or CDM-zizimin homology 1 (CZH1) domain) and the DHR-2 domain (also termed the CZH2 or DOCKER domain) [
,
,
,
,
,
]. The ~200 residue DOCK-type C2 domain is located toward the N terminus. It adopts a C2-like architecture and interacts with phosphatidylinositol 3,4,5-trisphosphate [] to mediate signalling and membrane localization. The central core of the DOCK-type C2 domain adopts an antiparallel β-sandwich with the "type II" C2 domain fold (a circular permutation of the more common "type I" topology), in which two 4-stranded sheets with strand order 6-5-2-3 and 7-8-1-4 create convex- and concave-exposed faces, respectively [
]. Some DOCK proteins are listed below:
Mammalian dedicator of cytokinesis 180 (DOCK180 or DOCK1), important for cell migration.
Mammalian DOCK2, important for lymphocyte development, homing, activation, adhesion, polarization and migration processes.
Mammalian DOCK3 (also known as MOCA), which is expressed predominantly in neurons and resides in growth cones and membrane ruffles.
Mammalian DOCK4, which possesses tumor suppressor properties.
Mammalian DOCK9 (zizimin1), which plays an important role in dendrite growth in hippocampal neurons through activation of Cdc42.
Drosophila melanogaster Myoblast city.
Caenorhabditis elegans CED-5.
This entry represents the TsaD protein family that is widely distributed. TsaD and its archaeal homologue Kae1 (
) belong to the Kae1/TsaD family (
), a conserved protein family with unknown function.
This entry includes bacterial TsaD and its homologues, such as Qri7 (which localises to the mitochondria) from budding yeast
[]. TsaD (also known as Gcp or YgjD) was originally described as a glycoprotease essential for cell viability [
] and a critical mediator involved in the modification of cell wall peptidoglycan synthesis and/or cell division []. Gcp is a member of the Kae1/TsaD family, required for the formation of a threonylcarbamoyl group on adenosine at position 37 in tRNAs that read codons beginning with adenine []. YgjD has been renamed as TsaD, and it has been shown that YgjD and proteins YrdC (TsaC), YjeE (TsaE), and YeaZ (TsaB) are necessary and sufficient for t6A biosynthesis in vitro, and may constitute a complex []. The first characterised member of the Kae1/TsaD family was annotated as Gcp for O-sialoglycoprotein endopeptidase [
], but this activity could not be confirmed []. Later, its homologue, Kae1 from Pyrococcus abyssi, was shown to have DNA-binding properties and apurinic-endonuclease activity []. Members of this family have since been studied in yeast, archaea and bacteria, resulting in sometimes conflicting data, several proposed functions and annotations but no definitive characterisation. For instance, some members have been linked to DNA maintenance in bacteria and mitochondria [] and transcription regulation and telomere homeostasis in eukaryotes [,
], but their function remained unclear. Recent research indicates that this family is involved in the biosynthesis of N6-threonylcarbamoyl adenosine, a universal modification found at position 37 of tRNAs that read codons beginning with adenine [,
].
There are multiple types of iron-sulphur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S] (see ), and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues (see
). The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues [,
]. The structure of several Rieske domains has been solved [
]. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulphide ions bridge the two iron ions forming a flat, rhombic cluster.
Rieske-type iron-sulphur clusters are common to electron transfer chains of mitochondria and chloroplasts and to non-haem iron oxygenase systems:
The Rieske protein of the Ubiquinol-cytochrome c reductase (
) (also known as the bc1 complex or complex III), a complex of the electron transport chains of mitochondria and of some aerobic prokaryotes; it catalyses the oxidoreduction of ubiquinol and cytochrome c.
The Rieske protein of chloroplastic plastoquinone-plastocyanin reductase (
) (also known as the b6f complex). It is functionally similar to the bc1 complex and catalyses the oxidoreduction of plastoquinol and cytochrome f.
Bacterial naphthalene 1,2-dioxygenase subunit alpha, a component of the naphthalene dioxygenase (NDO) multicomponent enzyme system which catalyses the incorporation of both atoms of molecular oxygen into naphthalene to form cis-naphthalene dihydrodiol.
Bacterial 3-phenylpropionate dioxygenase ferredoxin subunit.
Bacterial toluene monooxygenase.
Bacterial biphenyl dioxygenase.
Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation which is usually zinc, but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role.
Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This domain is found in the FtsH family of proteins, which includes FtsH, a membrane-bound ATP-dependent protease universally conserved in prokaryotes [
]. The FtsH peptidases, which belong to MEROPS peptidase family M41 (clan MA(E)), efficiently degrade proteins that have a low thermodynamic stability, as they lack robust unfoldase activity. This feature may be key, as it implies that low thermodynamic stability could be a criterion for degrading a protein. In Oenococcus oeni (Leuconostoc oenos) FtsH is involved in protection against environmental stress [], and shows increased expression under heat or osmotic stress. These two lines of evidence suggest that it is a fundamental prokaryotic self-protection mechanism that checks if proteins are correctly folded. The precise function of this N-terminal region is unclear.
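The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition given above can be sketched as two regular expressions. This is an illustrative Python sketch under stated assumptions: the residue sets chosen for "uncharged" and "hydrophobic" are one plausible reading (the text does not enumerate them), position 'a' is left unconstrained because the text says only that it is "most often" valine or threonine, proline is excluded from the 'b' positions as an interpretation of the proline remark, and the test sequence is a made-up toy.

```python
import re

# Assumed residue classes; the source text does not list them explicitly.
UNCHARGED = "ACFGILMNQSTVWY"    # assumption: excludes D, E, K, R, H and P
HYDROPHOBIC = "ACFILMVWY"       # assumption: one common hydrophobic set
ANY = "ACDEFGHIKLMNPQRSTVWY"    # the 20 standard amino acids

# Loose motif: H-E-X-X-H.
LOOSE = re.compile(rf"(?=(HE[{ANY}]{{2}}H))")

# Stringent motif: a-b-X-H-E-b-b-H-b-c, with 'a' and 'X' unconstrained.
STRICT = re.compile(
    rf"(?=([{ANY}][{UNCHARGED}][{ANY}]HE[{UNCHARGED}]{{2}}H"
    rf"[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def scan(sequence, pattern):
    """Return (1-based start, matched text) for each hit, overlaps included."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence containing one stringent and one loose-only hit.
toy = "GSVVAHELTHAVTDKHEKRHPP"
print(scan(toy, LOOSE))
print(scan(toy, STRICT))
```

In the toy sequence the second HEXXH occurrence has charged residues at the 'b' positions, so it passes the loose pattern but fails the stringent one, which is the kind of filtering the stricter definition is meant to provide.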
Geminivirus AL1 replication-associated protein, central domain
Type:
Domain
Description:
Geminiviruses are characterised by a genome of circular single-stranded DNA encapsidated in twinned (geminate) quasi-isometric particles, from which the group derives its name [
]. Most geminiviruses can be divided into two subgroups on the basis of host range and/or insect vector: i.e. those that infect dicotyledonous plants and are transmitted by the same whitefly species, and those that infect monocotyledonous plants and are transmitted by different leafhopper vectors. The whitefly-transmitted African cassava mosaic virus, Tomato golden mosaic virus (TGMV) and Bean golden mosaic virus (BGMV) possess bipartite genomes. By contrast, only a single DNA component has been identified for the leafhopper-transmitted Maize streak virus (MSV) and Wheat dwarf virus (WDV) [,
]. Beet curly top virus (BCTV) and Tobacco yellow dwarf virus belong to a third possible subgroup. Like MSV and WDV, BCTV is transmitted by a specific leafhopper species, yet like the whitefly-transmitted geminiviruses it has a host range confined to dicotyledonous plants. Sequence comparison of the whitefly-transmitted Squash leaf curl virus (SqLCV) and Tomato yellow leaf curl virus (TYLCV) with the genomic components of TGMV and BGMV reveals a close evolutionary relationship [
,
,
]. Amino acid sequence alignments of Potato yellow mosaic virus (PYMV) proteins with those encoded by other geminiviruses show that PYMV is closely related to geminiviruses isolated from the New World, especially in the putative coat protein gene regions []. Comparison of MSV DNA-encoded proteins with those of other geminiviruses infecting monocotyledonous plants, including Panicum streak virus [] and Miscanthus streak virus (MiSV) [], reveals high levels of similarity. This is the central region of the geminivirus rep proteins [
]. It is found C-terminal to and is thought to be responsible for oligomerisation.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. The insect octopamine receptor mediates the attenuation of adenylate cyclase activity. Sequence and pharmacological comparisons indicate that the octopamine receptor is unique, but closely related to mammalian adrenergic receptors, perhaps as an evolutionary precursor [,
].
Somatostatin (SST), also known as somatotropin release-inhibiting factor (SRIF), is a hypothalamic hormone, a pancreatic hormone, and a central and peripheral neurotransmitter. Somatostatin has a wide distribution throughout the central nervous system (CNS) as well as in peripheral tissues, for example in the pituitary, pancreas and stomach. The various actions of somatostatin are mediated by a family of rhodopsin-like G protein-coupled receptors, which comprises five distinct subtypes: Somatostatin receptor 1 (SSTR1), Somatostatin receptor 2 (SSTR2), Somatostatin receptor 3 (SSTR3), Somatostatin receptor 4 (SSTR4) and Somatostatin receptor 5 (SSTR5) [
,
,
]. These subtypes are widely expressed in many tissues [,
,
,
,
,
], and frequently multiple subtypes coexist in the same cell []. The somatostatin receptor subtypes also share common signalling pathways, such as the inhibition of adenylyl cyclase [,
], activation of phosphotyrosine phosphatase (PTP), and modulation of mitogen-activated protein kinase (MAPK) through G protein-dependent mechanisms. Some of the subtypes are also coupled to inward rectifying K+ channels (SSTR2, SSTR3, SSTR4, SSTR5) [,
], to voltage-dependent Ca2+ channels (SSTR1, SSTR2) [], to an Na+/H+ exchanger (SSTR1), AMPA/kainate glutamate channels (SSTR1, SSTR2), phospholipase C (SSTR2, SSTR5), and phospholipase A2 (SSTR4) []. Amongst the wide spectrum of somatostatin effects, several biological responses have been identified that display absolute or relative subtype selectivity. These include GH secretion (SSTR2 and 5), insulin secretion (SSTR5), glucagon secretion (SSTR2), and immune responses (SSTR2) [
]. This entry represents SSTR2. In humans, it has been found at high levels in the brain, kidney and pituitary, with lower levels in the jejunum, pancreas, colon and liver. All five human somatostatin receptors expressed in COS-7 cells are coupled to activation of phosphoinositide (PI)-specific PLC-β and Ca2+ mobilisation via pertussis toxin-sensitive G protein(s), with an order of potency of SSTR5 > SSTR2 > SSTR3 > SSTR4 > SSTR1 [
].
The extended plant homeodomain (ePHD) domain contains an N-terminal pre-PHD (C2HC zinc finger), a long linker, and a noncanonical PHD finger (C4HC3 zinc finger). The ePHD domain can bind dsDNA but not histones [
,
,
,
]. The pre-PHD-type C2HC zinc finger and the PHD finger in the ePHD domain are associated with each other via extensive hydrophobic interactions and numerous hydrogen bonding interactions, and fold as an intact structural module. The pre-PHD-type C2HC zinc finger consists of two α-helices separated by an anti-parallel β-sheet. Three cysteine residues and one histidine residue from the N-terminal loop, β2-strand, and α2-helix coordinate one zinc ion (designated Zn1). The C-terminal part of the ePHD domain is a PHD finger, consisting of one short antiparallel β-sheet and one long antiparallel β-sheet that are linked by one α-helix. Like other PHD fingers, the PHD finger of the ePHD domain consists of two interleaved zinc fingers. A pair of bound zinc ions (designated Zn2 and Zn3) specifically stabilizes the characteristic cross-braced folding topology of the PHD finger. Each zinc ion is coordinated by a combination of four cysteine and histidine residues in which the Zn3 ion is coordinated by a C3H motif instead of a C4 motif [
,
]. Some proteins known to contain an ePHD domain are listed below:
Vertebrate plant homeodomain finger 6 (PHF6), a multidomain protein that comprises four nuclear localization signals and two ePHD domains. It is implicated in chromatin regulation and neural development.
Vertebrate MLL1/2/3/4, histone methyltransferases.
Vertebrate JMJD2A/B/C, histone demethylases.
Vertebrate Bromodomain- and PHD finger-containing protein 1, 2, and 3 (BRPF1/2/3), components of the MOZ (monocytic leukemia zinc finger)/MORF (MOZ-related factor) histone acetyltransferase complex.
Vertebrate JADE1/2/3, components of the HBO1 complex, which has a histone H4-specific acetyltransferase activity.
AF10/17, subunits of the multimeric DOT1L complex that mediates H3K79 methylation.
Caenorhabditis elegans protein lin-49, a component of a histone modifying complex.
Yeast NuA3 HAT complex component NTO1.
This entry includes a group of animal proteins that belong to the class V-like SAM-binding methyltransferase superfamily and contain the SET domain usually flanked by other domains forming the so-called pre- and post-SET regions. The enzymes belonging to this class all N-methylate lysine in proteins. Most of them are histone methyltransferases (), including human N-lysine methyltransferase KMT5A (also known as PR-Set7), which is a nucleosomal histone-lysine N-methyltransferase that specifically monomethylates 'Lys-20' of histone H4 (H4K20me1). It plays a central role in the silencing of euchromatic genes [
,
,
,
]. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group most frequently from S-adenosyl L-methionine (SAM or AdoMet) to a nucleophilic acceptor such as nitrogen, oxygen, sulfur or carbon leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule. The substrates that are methylated by these enzymes cover virtually every kind of biomolecules ranging from small molecules, to lipids, proteins and nucleic acids. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [,
,
]. More than 230 different enzymatic reactions of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor [E1]. A review published in 2003 [
] divides all methyltransferases into 5 classes based on the structure of their catalytic domain (fold):
class I: Rossmann-like, α/β
class II: TIM beta/α-barrel, α/β
class III: tetrapyrrole methylase, α/β
class IV: SPOUT, α/β
class V: SET domain, all β
A more recent paper [
] based on a study of the Saccharomyces cerevisiae methyltransferome argues for four more folds:
class VI: transmembrane, all α
class VII: DNA/RNA-binding 3-helical bundle, all α
class VIII: SSo0622-like, α+β
class IX: thymidylate synthetase, α+β
Melatonin is a naturally occurring compound found in animals, plants, and microbes [
,
]. In animals melatonin is secreted by the pineal gland during darkness [,
]. It regulates a variety of neuroendocrine functions and is thought to play an essential role in circadian rhythms []. Drugs that modify the action of melatonin, and hence influence circadian cycles, are of clinical interest, for example in the treatment of jet-lag []. Many of the biological effects of melatonin are produced through the activation of melatonin receptors [
], which are members of the rhodopsin-like G protein-coupled receptor family. There are three melatonin receptor subtypes. Melatonin receptor type 1A and melatonin receptor type 1B are present in humans and other mammals [] while melatonin receptor type 1C has been identified in amphibia and birds []. There is also a closely-related orphan receptor, termed melatonin-related receptor type 1X (also known as GPR50) [], which is yet to achieve receptor status from the International Union of Basic and Clinical Pharmacology (IUPHAR), since a robust response mediated via the protein has not been reported in the literature. Melatonin receptor type 1C receptors are 80% identical and are distinct from 1A and 1B subtypes. Similar ligand binding and functional characteristics are observed in expressed 1A and 1C receptors. The melatonin receptors inhibit adenylyl cyclase via a pertussis-toxin-sensitive G-protein, probably of the Gi/Go class.
This entry represents melatonin receptor 1C, which is found in birds, fish and amphibians, but not in humans [
]. It has been shown that the widespread distribution of 1C in brain provides a molecular substrate for the profound actions of melatonin in birds []. Also, in amphibians, very low concentrations of melatonin activate the 1C receptor subtype, triggering movement of granules toward the cell centre, thus lightening skin colour. 1C receptor activation reduces intracellular cAMP via a pertussis toxin-sensitive inhibitory G-protein (Gi) [].
Signal transduction response regulator, PEP-CTERM system, putative
Type:
Family
Description:
Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions [
]. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems are composed of a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK.
A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [
,
]. This entry represents a protein family that shares full-length homology with (but does not include) the acetoacetate metabolism regulatory protein AtoC (see
). These proteins have a Fis family DNA binding sequence, a response regulator receiver domain, and a sigma-54 interaction domain. They are found strictly within a subset of Gram-negative bacterial species with the proposed PEP-CTERM/exosortase system, analogous to the LPXTG/sortase system [
] common in Gram-positive bacteria, where members of and
also occur.
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes [
]. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base [
]. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [,
]. The Potyviridae are a family of positive-strand RNA viruses, members of which include Zucchini yellow mosaic virus and Turnip mosaic virus (strain Japanese), which cause considerable losses of crops worldwide. This entry represents a C-terminal region from various plant potyvirus P1 proteins (found at the N terminus of the polyprotein). The C terminus of P1 is a serine peptidase belonging to MEROPS peptidase family S30 (clan PA(S)). It is the protease responsible for autocatalytic cleavage between P1 and the helper component protease, which is a cysteine peptidase belonging to MEROPS peptidase family C6
[
,
]. The P1 protein may be involved in virus-host interactions [], and evasion of immune responses [].
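The clan-specific ordering of catalytic residues described above (HDS for clan PA, DHS for clan SB, SDH for clan SC) can be expressed as a small lookup. The following is a minimal Python sketch; the function names and the residue positions used in the example are illustrative assumptions, and a matching linear order is only suggestive of clan membership, not proof of it.

```python
# Linear order of the catalytic triad -> clan, as stated in the text.
CLAN_BY_ORDER = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(positions):
    """Given {one-letter residue: sequence position}, return residues in N-to-C order."""
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def clan_for(positions):
    """Look up the clan whose triad order matches; 'unassigned' if none does."""
    return CLAN_BY_ORDER.get(triad_order(positions), "unassigned")


# Illustrative positions only (hypothetical inputs for the sketch).
print(clan_for({"S": 195, "H": 57, "D": 102}))
print(clan_for({"D": 32, "H": 64, "S": 221}))
```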
Prokaryotic cells have a defence mechanism against a sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones whose transcription is regulated by a heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerizing domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands sided by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains [
]. The structural studies suggest that the inactive form of HrcA is the dimer and this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region [,
]. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different ranges of temperature, shows that the thermostability profiles were consistent with the difference in the growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells []. Any increase in temperature causes the dissociation of the HrcA from the CIRCE complex with the concomitant activation of transcription of the groE and dnaK operons.
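The CIRCE element is described above as a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. A search for sites with that geometry can be sketched in Python as follows. The function names and the toy DNA string are hypothetical, and the scan only tests the repeat geometry: it does not check the arms against any particular CIRCE sequence, so hits in real genomes would need further filtering.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(dna, arm=9, spacer=9):
    """Return (1-based start, left arm, right arm) for each perfect inverted repeat."""
    dna = dna.upper()
    span = 2 * arm + spacer
    hits = []
    for i in range(len(dna) - span + 1):
        left = dna[i:i + arm]
        right = dna[i + arm + spacer:i + span]
        # A perfect inverted repeat: the right arm is the reverse complement of the left.
        if right == revcomp(left):
            hits.append((i + 1, left, right))
    return hits


# Hypothetical toy sequence: 9-bp arm, 9-bp spacer, reverse-complement arm.
toy = "GG" + "TTAGCACTC" + "AAATTTGGG" + "GAGTGCTAA" + "CC"
print(find_inverted_repeats(toy))
```

The `arm` and `spacer` parameters default to the 9/9 geometry from the text but can be changed to explore other repeat layouts.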
Haemagglutinin (HA) is one of two main surface fusion glycoproteins embedded in the envelope of influenza viruses, the other being neuraminidase (NA). There are sixteen known HA subtypes (H1-H16) and nine NA subtypes (N1-N9), which together are used to classify influenza viruses (e.g. H5N1). The antigenic variations in HA and NA enable the virus to evade host antibodies made to previous influenza strains, accounting for recurrent influenza epidemics [
]. The HA glycoprotein is present in the viral membrane as a single polypeptide (HA0), which must be cleaved by the host's trypsin-like proteases to produce two peptides (HA1 and HA2) in order for the virus to be infectious. Once HA0 is cleaved, the newly exposed N terminus of the HA2 peptide then acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the viral negative-stranded RNA to infect the host cell. The type of host protease can influence the infectivity and pathogenicity of the virus. The haemagglutinin glycoprotein is a trimer containing three structurally distinct regions: a globular head consisting of anti-parallel β-sheets that form a β-sandwich with a jelly-roll fold (contains the receptor binding site and the HA1/HA2 cleavage site); a triple-stranded, coiled-coil, α-helical stalk; and a globular foot composed of anti-parallel β-sheets [
,
]. Each monomer consists of an intact HA0 polypeptide with the HA1 and HA2 regions linked by disulphide bonds. The N terminus of HA1 provides the central strand in the 5-stranded globular foot, while the rest of the HA1 chain makes its way to the 8-stranded globular head. HA2 provides two alpha helices, which form part of the triple-stranded coiled-coil that stabilises the trimer, its C terminus providing the remaining strands of the 5-stranded globular foot. This entry represents the entire haemagglutinin protein (HA0) consisting of both the HA1 and HA2 regions, as found in influenza A and B viruses.
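The HxNy naming scheme mentioned above (e.g. H5N1), with the subtype ranges given in the text (H1-H16 and N1-N9), can be captured in a short parser. This is a minimal Python sketch; the function name `parse_subtype` is hypothetical, and the range limits simply mirror the counts stated in this entry rather than any external nomenclature source.

```python
import re

# Subtype ranges as stated in the entry text.
MAX_HA = 16
MAX_NA = 9

SUBTYPE = re.compile(r"^H(\d{1,2})N(\d{1,2})$")


def parse_subtype(name):
    """Parse a label such as 'H5N1' into (HA number, NA number), validating ranges."""
    match = SUBTYPE.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not an HxNy subtype label: {name!r}")
    ha, na = int(match.group(1)), int(match.group(2))
    if not (1 <= ha <= MAX_HA and 1 <= na <= MAX_NA):
        raise ValueError(f"subtype outside H1-H{MAX_HA} / N1-N{MAX_NA}: {name!r}")
    return ha, na


print(parse_subtype("H5N1"))
```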
Viruses in the order Picornavirales infect different vertebrate, invertebrate, and plant hosts and are responsible for a variety of human, animal, and plant diseases. These viruses have a single-stranded, positive sense RNA genome that generally translates a large precursor polyprotein which is proteolytically cleaved after translation to generate mature functional viral proteins. This process is usually mediated by more than one protease, and a 3C (for the family Picornaviridae) or 3C-like (3CL) protease (for other families) plays a central role in the cleavage of the viral precursor polyprotein. In addition to this key role, 3C/3C-like protease is able to cleave a number of host proteins to remodel the cellular environment for virus reproduction [
,
,
,
,
,
]. The Picornavirales 3C/3C-like protease domain forms the MEROPS peptidase family C3 (picornain family) of clan PA. The 3C/3CL protease domain adopts a chymotrypsin-like fold with a cysteine nucleophile in place of a commonly found serine, which suggests that the cysteine and serine perform an analogous catalytic function. The catalytic triad is made of a histidine, an aspartate/glutamate and the conserved cysteine in this sequential order. The 3C/3CL protease domain folds into two antiparallel beta barrels that are linked by a loop with a short α-helix in its middle, and flanked by two other α-helices at the N and C termini. The two barrels are topologically equivalent and are formed by six antiparallel beta strands with the first four organised into a Greek key motif. The active-site residues are located in the cleft between the two barrels with the nucleophilic Cys from the C-terminal barrel and the general acid base His-Glu/Asp from the N-terminal barrel [
,
,
]. This entry includes cysteine peptidases that belong to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies C3A and C3B.
The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl acetate), and hence confer resistance to organophosphate toxicity []. Mammals have 3 distinct paraoxonase types, termed PON1-3 [,
]. In mice and humans, the PON genes are found on the same chromosome in close proximity.
PON activity has been found in a variety of tissues, with the highest levels in liver and serum; the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity.
Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+
dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The
paraoxonase enzymes, PON1 and PON3, are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low
density lipoproteins (LDL) []. Although PON2 has antioxidant properties, the enzyme does not associate with HDL.
Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1,2 or 3) share
79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not
known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo [
]. This family consists of arylesterases (also known as serum paraoxonases). These enzymes hydrolyse organophosphorus esters such as paraoxon and are found in the liver and blood. They confer resistance to organophosphate toxicity [
]. Human arylesterase (PON1) is associated with HDL and may protect against LDL oxidation [
].
Nuclear protein AMMECR1, presently a protein of unknown function, is encoded by one of the genes affected by an X-linked deletion that causes the association of Alport syndrome, midface hypoplasia, intellectual disability and elliptocytosis in humans [
]. This entry represents the C-terminal region of AMMECR1 (approximately from residue 122 to 333), which is well conserved. Homologues appear in species ranging from bacteria and archaea to eukaryotes, including Protein PH0010 from Pyrococcus horikoshii [
]. The high level of conservation of the AMMECR1 domain points to a basic cellular function, potentially in either the transcription, replication, repair or translation machinery [,
]. The AMMECR1 domain contains a 6-amino-acid motif (LRGCIG) that might be functionally important, since it is strikingly conserved throughout evolution []. The AMMECR1 domain consists of two distinct subdomains of different sizes. The large subdomain, which contains both the N- and C-terminal regions, consists of five α-helices and five β-strands. These five β-strands form an antiparallel β-sheet. The small subdomain consists of four α-helices and three β-strands, and these β-strands also form an antiparallel β-sheet. The conserved 'LRGCIG' motif is located at β(2) and its N-terminal loop, and most of the side chains of these residues point toward the interface of the two subdomains. The two subdomains are connected by only two loops, and the interaction between the two subdomains is not strong. Thus, these subdomains may move dynamically when the substrate enters the cleft. The size of the cleft suggests that the substrate is large, e.g., the substrate may be a nucleic acid or protein. However, the inner side of the cleft is not filled with positively charged residues, and therefore it is unlikely that negatively charged nucleic acids such as DNA or RNA interact at this site [].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups []. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Growth hormone secretagogue receptor (GHSR) is a class A GPCR that stimulates food intake by binding to its peptide ligand, ghrelin [
]. Ghrelin also increases growth hormone (GH) release []. The motilin receptor, also known as GPR38, shares significant amino acid sequence identity with GHSR [].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. OGR1 is expressed in ovarian cancer cell lines and also in spleen, testis, small intestine, peripheral blood leukocytes, brain, heart, lung,
placenta and kidney. Expression has not been found in thymus, prostate, ovary, colon, liver, skeletal muscle or pancreas [
].
Somatostatin (SST), also known as somatotropin release-inhibiting factor (SRIF), is a hypothalamic hormone, a pancreatic hormone, and a central and peripheral neurotransmitter. Somatostatin has a wide distribution throughout the central nervous system (CNS) as well as in peripheral tissues, for example in the pituitary, pancreas and stomach. The various actions of somatostatin are mediated by a family of rhodopsin-like G protein-coupled receptors, which comprises five distinct subtypes: Somatostatin receptor 1 (SSTR1), Somatostatin receptor 2 (SSTR2), Somatostatin receptor 3 (SSTR3), Somatostatin receptor 4 (SSTR4) and Somatostatin receptor 5 (SSTR5) [
,
,
]. These subtypes are widely expressed in many tissues [,
,
,
,
,
], and frequently multiple subtypes coexist in the same cell []. The somatostatin receptor subtypes also share common signalling pathways, such as the inhibition of adenylyl cyclase [,
], activation of phosphotyrosine phosphatase (PTP), and modulation of mitogen-activated protein kinase (MAPK) through G protein-dependent mechanisms. Some of the subtypes are also coupled to inward rectifying K+ channels (SSTR2, SSTR3, SSTR4, SSTR5) [,
], to voltage-dependent Ca2+ channels (SSTR1, SSTR2) [], to an Na+/H+ exchanger (SSTR1), AMPA/kainate glutamate channels (SSTR1, SSTR2), phospholipase C (SSTR2, SSTR5), and phospholipase A2 (SSTR4) []. Amongst the wide spectrum of somatostatin effects, several biological responses have been identified that display absolute or relative subtype selectivity. These include GH secretion (SSTR2 and 5), insulin secretion (SSTR5), glucagon secretion (SSTR2), and immune responses (SSTR2) [
]. This entry represents SSTR4. It is present in high levels in the pituitary, but is less abundant in the brain and peripheral tissues [
,
,
]. All five human somatostatin receptors expressed in COS-7 cells are coupled to activation of phosphoinositide (PI)-specific PLC-beta and Ca2+ mobilisation via pertussis toxin-sensitive G protein(s), with an order of potency of SSTR5 > SSTR2 > SSTR3 > SSTR4 > SSTR1 [
].
Signal transduction response regulator, propionate catabolism, transcriptional regulator PrpR
Type:
Family
Description:
Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions [
]. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems comprise a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to a histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [
,
]. This entry represents the signal transduction response regulator PrpR. At least five distinct pathways exist for the catabolism of propionate by way of propionyl-CoA. Members of this family represent the transcriptional regulatory protein PrpR which is, in most cases, divergently transcribed from the operon that encodes the genes involved in the methylcitric acid cycle of propionate catabolism. This protein is required for the expression of the proteins involved in this pathway [
]. 2-methylcitric acid, an intermediate in this pathway, has been proposed to be a co-activator of PrpR [].
This growth factor receptor domain is a cysteine-rich region that is found in a variety of eukaryotic proteins that are involved in the mechanism of signal transduction by receptor tyrosine kinases. Proteins containing the growth factor receptor domain include the insulin-like growth factor-binding proteins (IGFBP) [
], the type-1 insulin-like growth-factor receptor (IGF-1R) [], and members of the epidermal growth factor (EGF) receptor family [], such as the receptor protein-tyrosine kinase Erbb-3 (ErbB3) []. The general structure of the growth factor receptor domain is a disulphide-bound fold containing a β-hairpin with two adjacent disulphides. IGFBPs control the distribution, function and activity of insulin-like growth factors (IGFs) IGF-I and IGF-II, which are key regulators of cell proliferation, differentiation and transformation. All IGFBPs share a common domain organisation, where the highest conservation is found in the N-terminal Cys-rich IGF-binding domain. The N-terminal domain contains 10-12 conserved cysteine residues. IGF-1R is a member of the tyrosine-kinase receptor superfamily that is involved in both normal growth and development and malignant transformation. The Cys-rich domain is flanked by two L-domains, and together they contribute to hormone binding and ligand specificity, even though they do not bind ligand directly. The Cys-rich region is composed of eight disulphide-bonded modules, seven of which form a rod-shaped domain. ErbB3 is a member of the epidermal growth factor receptor (EGFR) family of receptor tyrosine kinases. The extracellular region of ErbB3 is made up of two Cys-rich domains and two L-domains, arranged alternately [
]. The two L-domains and the first Cys-rich domain are structurally homologous to those found in IGF-1R. The two Cys-rich domains are extended repeats of seven small disulphide-containing modules. A β-hairpin loop extends from the first Cys-rich domain to contact the C-terminal portion of the second Cys-rich domain, creating a large pore structure.
Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role.
Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M12, subfamily M12B (adamalysin family, clan MA(M)). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA, and the predicted active site residues for members of this family and thermolysin occur in the motif HEXXH [
]. The M12B proteinases are also referred to as adamalysins or reprolysins [
,
]. The adamalysins are zinc dependent endopeptidases found in snake venom. There are some mammalian proteins such as ,
and fertilin . Fertilin and closely related
proteins appear to lack some active site residues and may not be active enzymes.
CD156 (also called ADAM8 (
) or MS2 human) has been implicated in extravasation of leukocytes.
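The HEXXH motif and its stricter 'abXHEbbHbc' form described above can be sketched as two regular expressions. This is only an illustration: the residue sets used for "uncharged" and "hydrophobic", the restriction of position 'a' to valine or threonine (the text says only "most often"), the exclusion of proline from the 'b' and 'c' positions, and the toy peptide are all assumptions introduced here, not part of the entry.

```python
import re

# Assumed residue classes (not defined in the entry):
# "uncharged" excludes D, E, K, R and H; proline is also left out.
UNCHARGED = "ACFGILMNQSTVWY"
HYDROPHOBIC = "ACFILMVWY"

# Loose motif: H-E-X-X-H. The lookahead lets overlapping hits be reported.
HEXXH = re.compile(r"(?=(HE..H))")

# Stricter motif 'abXHEbbHbc', with 'a' restricted to V or T (an assumption).
STRICT = re.compile(
    rf"(?=([VT][{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def find_motifs(sequence, pattern):
    """Return (0-based start, matched text) for every hit in a protein sequence."""
    return [(m.start(1), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical toy peptide, used only to exercise the patterns.
toy = "MKTVVAHELTHAVGG"
loose_hits = find_motifs(toy, HEXXH)
strict_hits = find_motifs(toy, STRICT)
```

A sequence such as "GGHEPPHGG" would pass the loose pattern but fail the strict one, which is the practical point of the more stringent definition.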
STAT6 mediates signals from the IL-4 receptor. Unlike the other STAT proteins which bind an IFNgamma Activating Sequence (GAS), STAT6 stands out as having a unique binding site preference. This site consists of a palindromic sequence separated by a 3 bp spacer (TTCNNNG-AA)(N3 site). STAT6 is able to bind the GAS site but only at a low affinity upon IL-4-induced activation [
]. There is speculation that the inappropriate activation of STAT6 is involved in uncontrolled cell growth in an oncogenic state []. IL-4 signaling via STAT6 initially occurs unopposed, but is then dampened by a negative feedback mechanism through the IL-4/STAT6-dependent induction of SOCS1 expression. The IL-4 dependent aspect of Th2 differentiation requires the activation of STAT6. IL-4 signaling and STAT6 appear to play an important role in the immune response. It was shown that the large-scale chromatin remodeling of the IL-4 gene that occurs as cells differentiate into Th2 effectors is STAT6 dependent []. This entry represents the SH2 domain of STAT6. STAT proteins have a dual function: signal transduction and activation of transcription. When cytokines are bound to cell surface receptors, the associated Janus kinases (JAKs) are activated, leading to tyrosine phosphorylation of the given STAT proteins [
]. Phosphorylated STATs form dimers, translocate to the nucleus, and bind specific response elements to activate transcription of target genes []. STAT proteins contain an N-terminal domain (NTD), a coiled-coil domain (CCD), a DNA-binding domain (DBD), an α-helical linker domain (LD), an SH2 domain, and a transactivation domain (TAD). The SH2 domain is necessary for receptor association and tyrosine phosphodimer formation. There are seven mammalian STAT family members which have been identified: STAT1, STAT2, STAT3, STAT4, STAT5 (STAT5A and STAT5B), and STAT6 [
].
There are multiple types of iron-sulphur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S] (see ), and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues (see
). The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues [,
]. The structure of several Rieske domains has been solved [
]. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulphide ions bridge the two iron ions forming a flat, rhombic cluster. Rieske-type iron-sulphur clusters are common to electron transfer chains of mitochondria and chloroplast and to non-haem iron oxygenase systems: The Rieske protein of the Ubiquinol-cytochrome c reductase (
) (also known as the bc1 complex or complex III), a complex of the electron transport chains of mitochondria and of some aerobic prokaryotes; it catalyses the oxidoreduction of ubiquinol and cytochrome c.
The Rieske protein of chloroplastic plastoquinone-plastocyanin reductase (
) (also known as the b6f complex). It is functionally similar to the bc1 complex and catalyses the oxidoreduction of plastoquinol and cytochrome f.
Bacterial naphthalene 1,2-dioxygenase subunit alpha, a component of the naphthalene dioxygenase (NDO) multicomponent enzyme system which catalyses the incorporation of both atoms of molecular oxygen into naphthalene to form cis-naphthalene dihydrodiol. Bacterial 3-phenylpropionate dioxygenase ferredoxin subunit. Bacterial toluene monooxygenase. Bacterial biphenyl dioxygenase.
Transcription factors of the T-box family are required both for early cell-fate decisions, such as those necessary for formation of the basic vertebrate body plan, and for differentiation and organogenesis [
]. The T-box is defined as the minimal region within the T-box protein that is both necessary and sufficient for sequence-specific DNA binding; all members of the family so far examined bind to the DNA consensus sequence TCACACCT. The T-box is a relatively large DNA-binding domain, generally comprising about a third of the entire protein (17-26kDa) []. These genes were uncovered on the basis of similarity to the DNA binding domain [
] of the Mus musculus (Mouse) Brachyury (T) gene product; this similarity is the defining feature of the family. The Brachyury gene is named for its phenotype, which was identified 70 years ago as a mutant mouse strain with a short blunted tail. The gene, and its paralogues, have become a well-studied model for the family, and hence much of what is known about the T-box family is derived from the murine Brachyury gene. Consistent with its nuclear location, Brachyury protein has a sequence-specific DNA-binding activity and can act as a transcriptional regulator [
]. Homozygous mutants for the gene undergo extensive developmental anomalies, thus rendering the mutation lethal []. The postulated role of Brachyury is as a transcription factor, regulating the specification and differentiation of posterior mesoderm during gastrulation in a dose-dependent manner []. T-box proteins tend to be expressed in specific organs or cell types, especially during development, and they are generally required for the development of those tissues. For example, Brachyury is expressed in posterior mesoderm and in the developing notochord, and it is required for the formation of these cells in mice [
]. The T-box superfamily is an ancient group that appears to play a critical role in development in all animal species [
].
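As an illustration of what binding to the consensus TCACACCT means at the sequence level, the sketch below scans a DNA string for exact matches on either strand. Only the consensus comes from the entry; the function names, the exact-match criterion (real binding sites tolerate mismatches) and the example sequences are assumptions made for this sketch.

```python
# Consensus taken from the text above; everything else is illustrative.
TBOX_CONSENSUS = "TCACACCT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_tbox_sites(dna, consensus=TBOX_CONSENSUS):
    """Return (0-based start, strand) for each exact consensus match.

    '+' means the consensus occurs as written; '-' means its reverse
    complement occurs, i.e. the site lies on the opposite strand.
    """
    dna = dna.upper()
    rc = reverse_complement(consensus)
    width = len(consensus)
    hits = []
    for start in range(len(dna) - width + 1):
        window = dna[start:start + width]
        if window == consensus:
            hits.append((start, "+"))
        elif window == rc:
            hits.append((start, "-"))
    return hits


# Hypothetical example sequences.
forward_hits = find_tbox_sites("GGTCACACCTAA")
reverse_hits = find_tbox_sites("ccAGGTGTGAcc")
```

Checking both strands matters because a site on the opposite strand appears in the given sequence as the reverse complement, AGGTGTGA.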
Aquaporins are water channels, present in both higher and lower organisms, that belong to the major intrinsic protein family. Most aquaporins are highly selective for water, though some also facilitate the movement of small uncharged molecules such as glycerol [
]. In higher eukaryotes these proteins play diverse roles in the maintenance of water homeostasis, indicating that membrane water permeability can be regulated independently of solute permeability. In microorganisms, however, many of which do not contain aquaporins, they do not appear to play such a broad role. Instead, they assist specific microbial lifestyles within the environment, e.g. they confer protection against freeze-thaw stress and may help maintain water permeability at low temperatures []. The regulation of aquaporins is complex, including transcriptional, post-translational, protein-trafficking and channel-gating mechanisms that are frequently distinct for each family member. Structural studies show that aquaporins are present in the membrane as tetramers, though each monomer contains its own channel [
,
,
]. The monomer has an overall "hourglass" structure made up of three structural elements: an external vestibule, an internal vestibule, and an extended pore which connects the two vestibules. Substrate selectivity is conferred by two mechanisms. Firstly, the diameter of the pore physically limits the size of molecules that can pass through the channel. Secondly, specific amino acids within the molecule regulate the preference for hydrophobic or hydrophilic substrates.
Aquaporins are classified into two subgroups: the aquaporins (also known as orthodox aquaporins), which transport only water, and the aquaglyceroporins, which transport glycerol, urea, and other small solutes in addition to water [
,
]. This entry represents aquaporin Z, a major water channel protein in bacteria. It mediates water influx in response to large changes in cellular osmolarity [
].
This entry consists of a number of carbohydrate sulphotransferases that transfer sulphate to carbohydrate groups in glycoproteins and glycolipids. These include: Carbohydrate sulphotransferases 8 and 9, which transfer sulphate to position 4 of non-reducing N-acetylgalactosamine (GalNAc) residues in both N-glycans and O-glycans [
]. They function in the biosynthesis of glycoprotein hormones lutropin and thyrotropin, by mediating sulphation of their carbohydrate structures. Carbohydrate sulphotransferase 10, which transfers sulphate to position 3 of the terminal glucuronic acid in both protein- and lipid-linked oligosaccharides [
]. It directs the biosynthesis of the HNK-1 carbohydrate structure, a sulphated glucuronyl-lactosaminyl residue carried by many neural recognition molecules, which is involved in cell interactions during ontogenetic development and in synaptic plasticity in the adult. Carbohydrate sulphotransferases 11 - 13, which catalyse the transfer of sulphate to position 4 of the GalNAc residue of chondroitin [
]. The orthologue in Caenorhabditis elegans is known as carbohydrate sulfotransferase chst-1 [
]. Chondroitin sulphate constitutes the predominant proteoglycan present in cartilage and is distributed on the surfaces of many cells and extracellular matrices. Some, though not all, of these enzymes also transfer sulphate to dermatan. Carbohydrate sulphotransferase D4ST1, which transfers sulphate to position 4 of the GalNAc residue of dermatan sulphate [
]. Heparan sulphate 2-O-sulphotransferase (HS2ST). Heparan sulphate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development []. Heparan-sulphate 6-O-sulphotransferase (HS6ST), which catalyses the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate [
]. Chondroitin 6-sulphotransferase catalyses the transfer of sulphate to position 6 of the N-acetylgalactosamine residue of chondroitin [
].
Haemagglutinin-esterase fusion glycoprotein (HEF) is a multi-functional protein embedded in the viral envelope of several viruses, including influenza C virus, coronaviruses (in particular, Betacoronavirus members of subgenus Embecovirus, previously known as group 2a coronaviruses) and toroviruses [
,
,
]. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion, and bears a strong resemblance to the sialic acid-binding haemagglutinin found in influenza A and B viruses, except that it binds 9-O-acetylsialic acid. The esterase region of HEF is responsible for the destruction of the receptor, an action that is carried out by neuraminidase in influenza A and B viruses. The esterase domain is similar in structure to Streptomyces scabies esterase, and to acetylhydrolase, thioesterase I and rhamnogalacturonan acetylesterase. The haemagglutinin-esterase glycoprotein HEF must be cleaved by the host's trypsin-like proteases to produce two peptides (HEF1 and HEF2) in order for the virus to be infectious. Once HEF is cleaved, the newly exposed N terminus of the HEF2 peptide then acts to fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell. The haemagglutinin-esterase glycoprotein is a trimer, where each monomer is composed of three domains: an elongated stem active in membrane fusion, an esterase domain, and a receptor-binding domain, where the stem and receptor-binding domains together resemble influenza A virus haemagglutinin. Two of these domains are composed of non-contiguous sequence: the receptor-binding haemagglutinin domain is inserted into a surface loop of the esterase domain, and the esterase domain is inserted into a surface loop of the haemagglutinin stem.
Carbamoyl-phosphate synthase small subunit, GATase1 domain
Type:
Domain
Description:
Glutamine amidotransferase (GATase) enzymes catalyse the removal of the ammonia group from glutamine and then transfer this group to a substrate to form a new carbon-nitrogen group [
]. The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. Two classes of GATase domains have been identified [,
]: class-I (also known as trpG-type or triad) and class-II (also known as purF-type or Ntn). In class I glutamine amidotransferases, a triad of conserved Cys-His-Glu forms the active site, wherein the catalytic cysteine is essential for the amidotransferase activity [,
]. Different structures show that the active site Cys of type 1 GATase is located at the tip of a nucleophile elbow. The E. coli carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates [
,
]. The small subunit catalyses the hydrolysis of glutamine to ammonia, which is in turn used by the large chain to synthesise carbamoyl phosphate. The C-terminal domain of the small subunit of CPSase has glutamine amidotransferase activity. In animals, the CPSase small subunit is part of a fusion protein, CAD, which combines enzymatic activities of the pyrimidine pathway (glutamine-dependent carbamyl phosphate synthetase (GLN-CPSase), aspartate transcarbamylase (ATCase), and dihydroorotase (DHOase)) [
]. In fungi, the CAD-like protein Ura2 is a fusion protein with CPSase and ATCase activity, but without DHOase activity, which is provided by a separate protein []. This entry represents the class-I GATase domain of the CPSase small subunit.
Transcriptional activation and repression is required for control of cell proliferation and differentiation during embryonic development and homeostasis in the adult organism. Perturbations of these processes can lead to the development of cancer [
]. The Eight-Twenty-One (ETO) gene product is able to form complexes with corepressors and deacetylases, such as nuclear receptor corepressor (N-CoR), which repress transcription when recruited by transcription factors []. The ETO gene derives its name from its association with many cases of acute myelogenous leukaemia (AML), in which a reciprocal translocation, t(8;21), brings together a large portion of the ETO gene from chromosome eight and part of the AML1 gene from chromosome 21. The human ETO gene family currently comprises three major subfamilies: ETO/myeloid transforming gene on chromosome 8 (MTG8); myeloid transforming gene related protein-1 (MTGR1) and myeloid transforming gene on chromosome 16 (MTG16). ETO proteins are composed of four evolutionarily conserved domains termed nervy homology regions (NHR) 1-4. NHR1 is thought to stabilise the formation of high molecular weight complexes, but is not directly responsible for repressor activity. NHR2 and its flanking sequence comprise the core repressor domain, which mediates 50% of the wild type repressor activity. Furthermore, there is evidence that the amphipathic helical structure of NHR2 promotes the formation of ETO/AML1 homodimers []. NHR3 and NHR4 have been shown to act in concert to bind N-CoR. NHR4 contains two zinc finger motifs, which are thought to play a role in protein interactions rather than DNA binding []. Screening of dbEST with the entire ETO cDNA sequence revealed a number of ESTs showing significant similarity to the query sequence. Of those identified, two overlapping clones were sequenced, revealing an ORF coding for a putative 575 amino acid protein. This was subsequently mapped to chromosome 20 and named EHT (ETO Homologous on chromosome Twenty), and later as MTGR1 [].
This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22A, the type example being presenilin 1 from Homo sapiens (Human). Presenilins are polytopic transmembrane (TM) proteins, mutations in which
are associated with the occurrence of early-onset familial Alzheimer's disease, a rare form of the disease that results from a single-gene
mutation [,
]. The physiological functions of presenilins are unknown, but they may be related to developmental signalling, apoptotic signal transduction, or processing of selected proteins, such as the beta-amyloid precursor protein (beta-APP). There are a number of subtypes which belong to this presenilin family. That presenilin homologues have been identified in species that do not have an Alzheimer's disease correlate suggests that they may have functions unrelated to the disease, homologues having been identified in mouse, Drosophila melanogaster, Caenorhabditis elegans
[] and other members of the eukarya including plants. In humans, there are two presenilin genes (PS1 and PS2) that share 67% amino acid identity, the greatest divergence between the two falling in the N terminus and in the large hydrophilic loop towards the C terminus of each molecule. Six to nine TM domains are predicted for each, and biochemical analysis has demonstrated that their C-termini are cytoplasmic, but the orientation of their N-termini and large hydrophilic loops remains to be resolved. They are expressed in almost all tissues, including the brain and, at a cellular level, they have been localised to the nuclear envelope, endoplasmic
reticulum and Golgi apparatus. The signature defines vertebrate presenilin 2 (MEROPS identifier A22.002), which unlike presenilin 1 has been found to have pro-apoptotic actions, which are enhanced by the mutations that have been characterised in this protein; however, when compared to PS1 gene mutations, they are thought to be responsible for only a small percentage of early-onset familial Alzheimer's disease cases (ca. 1%) [
].
Voltage-dependent sodium channels are transmembrane (TM) proteins responsible for the depolarising phase of the action potential in most
electrically excitable cells []. They may exist in 3 states []: the resting state, where the channel is closed; the activated state, where the channel is open; and the inactivated state, where the channel is closed and refractory to opening. Several structurally and functionally distinct isoforms are found in mammals, coded for by a multigene family; these are responsible for the different types of sodium ion currents found in excitable tissues. There are nine pore-forming alpha subunits of voltage-gated sodium channels, each consisting of four membrane-embedded homologous domains (I-IV), each of which comprises six α-helical segments (S1-S6), together with three cytoplasmic loops connecting the domains and a cytoplasmic C-terminal tail. The S6 segments of the four domains form the inner surface of the pore, while the S4 segments bear clusters of basic residues that constitute the channel's voltage sensors [
,
,
]. The SCN8A gene encodes the voltage-gated Na+ channel alpha 8 subunit (sodium channel protein type 8 subunit alpha), which is strongly expressed in Purkinje cells. Sodium currents are known to generate
the rising phase and the prolonged plateau phase of cerebellar Purkinje cell action potentials. Experiments in mice with mutated SCN8A subunits suggest its involvement in the persistent sodium current responsible for the prolonged plateau phase []. SCN8A is abundantly expressed throughout the CNS and in the spinal cord. Mutations in this protein have been related to a number of neurological disorders, including paralysis, ataxia and dystonia [,
,
,
]. This entry represents a conserved region found towards the N terminus of the protein, which defines the alpha 8 subunits and distinguishes them from other members of the voltage-gated Na+ channel superfamily. For entries containing other members of this superfamily see
,
,
.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins that share common structural features and have been shown to be guanine nucleotide exchange factors (GEFs) for Rab GTPases, which are regulators of practically all membrane trafficking events in eukaryotes. The tripartite DENN domain is composed of three distinct modules which are always associated due to functional and/or structural constraints: upstream DENN or uDENN (also known as longin domain), the better conserved central or core or
cDENN, and downstream or dDENN regions. The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear, although it appears to be a good candidate for GTP/GDP exchange activity [
,
,
,
,
]. The DENN domain forms a heart-shaped structure, with the N-terminal residues forming one lobe and the C-terminal residues forming the second. The N-terminal lobe forms the uDENN domain and consists of a central antiparallel β-sheet layered between one helix and two helices. A long random-coil region links the two lobes. The C-terminal lobe is composed of the cDENN and dDENN domains. The cDENN domain is an alpha/beta three-layered sandwich domain with a central sheet of five strands. The dDENN domain is an all-alpha helical domain, whose core contains two alpha-hairpins which diverge rapidly in sequence [,
]. Divergent types of the tripartite DENN domain have also been detected in other protein families [
], such as folliculin (FLCN), a tumour suppressor protein disrupted in various cancers and the Birt-Hogg-Dube syndrome, and Smith-Magenis syndrome chromosomal region candidate eight protein (SMCR8), which has been implicated in autophagy [,
,
].
Prokaryotic cells have a defence mechanism against a sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones whose transcription is regulated by a heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerising domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands sided by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains []. The structural studies suggest that the inactive form of HrcA is the dimer and this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region [,
]. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different ranges of temperature, shows that the thermostability profiles were consistent with the difference in growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells []. Any increase in temperature causes the dissociation of HrcA from the CIRCE complex with the concomitant activation of transcription of the groE and dnaK operons. This superfamily represents the inserted dimerising domain of HrcA.
FGD4 (also known as FRABIN) is a member of the FGD family and is a small RhoGTPase Cdc42-guanine nucleotide exchange factor. It is associated with Charcot-Marie-Tooth neuropathy type 4 (CMT4), a group of progressive motor and sensory axonal and demyelinating neuropathies that are distinguished from other forms of CMT by autosomal recessive inheritance [
]. FGD4 has been shown to regulate Schwann cell endocytosis []. This entry represents the N-terminal PH domain of FGD4. FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain [
]. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner [
]. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Fewer than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity []. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane []. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases and adaptors, as well as in cytoskeletal associated molecules and lipid associated enzymes [].
Proteins containing this domain are proteinase inhibitors belonging to MEROPS inhibitor family I19 (clan IW) and sharing a pacifastin
domain of ~35 residues, which contains a characteristic pattern of six conserved cysteine residues (C-x(9,12)-C-N-x-C-x-C-x(2,3)-G-x(3,6)-C-T-x(3)-C).
The pacifastin domain consists of a twisted β-sheet composed of three antiparallel strands and stabilised by an identical pattern (C1-C4, C2-C6,
C3-C5) of disulfide bridges [,
,
,
,
,
]. Proteins containing this domain were first isolated from Locusta migratoria migratoria (migratory locust). These were HI, LMCI-1 (PMP-D2) and LMCI-2 (PMP-C) [
,
,
]; five additional members SGPI-1 to 5 were identified in Schistocerca gregaria (desert locust) [,
], and a heterodimeric serine protease inhibitor (pacifastin) was isolated from the hemolymph of Pacifastacus leniusculus (Signal crayfish) []. Pacifastin is a 155kDa protein composed of two covalently linked subunits, which are separately encoded. The heavy chain of pacifastin (105kDa) is related to transferrins, containing three transferrin lobes, two of which seem to
be active for iron binding []. A number of the members of the transferrin family are also serine peptidases belonging to MEROPS peptidase family S60 (). The light chain of pacifastin (44kDa) is the proteinase inhibitory subunit, and has nine cysteine-rich inhibitory domains that are homologous to each other. The locust inhibitors share a conserved array of six cysteine residues with the pacifastin light chain. The structure of members of this family reveals that they comprise a triple-stranded antiparallel β-sheet connected by three disulphide bridges [
]. The biological function(s) of the locust inhibitors is (are) not fully understood. LMCI-1 and LMCI-2 were shown to inhibit the endogenous proteolytic activating cascade of prophenoloxidase [
]. Expression analysis shows that the genes encoding the SGPI precursors are differentially expressed in a time-, stage- and hormone-dependent manner.
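The six-cysteine signature quoted above for the pacifastin domain is written in a PROSITE-like notation, and it can be turned into a regular expression for quick sequence scanning. The following is a minimal sketch only: the helper names (`prosite_to_regex`, `find_pacifastin_like`) and the test sequence are hypothetical and synthetic, and the converter handles just the elements that occur in this one pattern (literal residues, `x`, `x(n)` and `x(n,m)`).

```python
import re

# Pattern exactly as quoted in the entry text above.
PACIFASTIN_PATTERN = "C-x(9,12)-C-N-x-C-x-C-x(2,3)-G-x(3,6)-C-T-x(3)-C"


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern into a Python regex string.

    Supports only literal residues, 'x', 'x(n)' and 'x(n,m)'.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            parts.append(element)            # literal residue, e.g. 'C'
        elif m.group(1) is None:
            parts.append(".")                # bare 'x': any one residue
        elif m.group(2) is None:
            parts.append(".{%s}" % m.group(1))
        else:
            parts.append(".{%s,%s}" % (m.group(1), m.group(2)))
    return "".join(parts)


PACIFASTIN_REGEX = re.compile(prosite_to_regex(PACIFASTIN_PATTERN))


def find_pacifastin_like(sequence: str):
    """Return (start, end) spans of non-overlapping pattern matches."""
    return [m.span() for m in PACIFASTIN_REGEX.finditer(sequence)]


# Synthetic sequence built to satisfy the pattern; NOT a real inhibitor.
SYNTHETIC = "MK" + "C" + "A" * 9 + "CNACAC" + "AA" + "G" + "AAA" + "CT" + "AAA" + "C" + "KK"

if __name__ == "__main__":
    print(PACIFASTIN_REGEX.pattern)
    print(find_pacifastin_like(SYNTHETIC))
```

A match only indicates that the cysteine spacing is compatible with the signature; it says nothing about disulfide pairing or inhibitory activity.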
This group of serine protease inhibitors belongs to MEROPS inhibitor family I15, clan IO. They inhibit serine peptidases of the S1 family (
) [
] and are characterised by a well conserved pattern of cysteine residues. This is a family of leech anti-coagulants. Antistasin is a 15kDa protein found in the salivary glands of Haementeria officinalis (Mexican leech); it is an anticoagulant that functions by
inhibiting factor Xa. The protein contains 119 residues, with an unusually high cysteine content (20 residues in all), and exhibits a 2-fold internal repeated structure. Four isoforms of antistasin have been identified in leech salivary gland extracts; partial sequence analysis
indicates that these isoforms differ only by 1 or 2 amino acid residues []. Ghilanten is an anticoagulant-antimetastatic protein of Haementeria ghilianii (Amazon leech). Like antistasin, it contains 119 amino acids,
with 20 cysteines, and a heparin-binding consensus motif at its C terminus. Arginine-34 is the residue involved in the active-site inhibition of trypsin and factor Xa [
]. The 3D structure of antistasin has been determined to 1.9 Å resolution by X-ray crystallography [
]. The structure reveals a novel protein fold comprising two similar domains, which can be divided into two similarly sized subdomains, with different relative orientations. Thus, the domain shapes differ, the N-terminal domain being wedge-shaped and the C-terminal domain flat []. Docking studies suggest that it is differences in domain shape that enable the N-terminal domain to bind and inhibit factor Xa, rather than the C-terminal domain, despite very similar active sites. A putative exosite binding region is evident in the N-terminal domain (residues 15-17), which is likely to interact with a cluster of positively charged residues on the factor Xa surface (Arg222/Lys223/Lys224), explaining the specificity and inhibitory potency of antistasin towards factor Xa.
Prokaryotic cells have a defence mechanism against a sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones whose transcription is regulated by a heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerising domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands sided by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains [
]. The structural studies suggest that the inactive form of HrcA is the dimer and this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region [,
]. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different ranges of temperature, shows that the thermostability profiles were consistent with the difference in growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells []. Any increase in temperature causes the dissociation of HrcA from the CIRCE complex with the concomitant activation of transcription of the groE and dnaK operons. This entry represents the C terminus of HrcA, consisting of the GAF-like domain with the inserted dimerising domain.
Myotubularin-related protein 6 (MTMR6) is a catalytically active member of the myotubularin (MTM) family and possesses 3-phosphatase activity, dephosphorylating phosphatidylinositol-3-phosphate and phosphatidylinositol-3,5-bisphosphate. MTMR6 forms a heteromer with enzymatically inactive MTMR9. MTMR9 increases MTMR6 binding to phospholipids and increases the 3-phosphatase activity of MTMR6 [
]. MTMR6 is reported to be involved in the regulation of the Ca2+-activated K+ channel KCa3.1 [] and apoptosis []. The cellular localisation of MTMR6 is regulated by Rab1B in the early secretory and autophagic pathways []. The myotubularin family constitutes a large group of conserved proteins, with 14 members in humans consisting of myotubularin (MTM1) and 13 myotubularin-related proteins (MTMR1-MTMR13). Orthologues have been found throughout the eukaryotic kingdom, but not in bacteria. MTM1 dephosphorylates phosphatidylinositol 3-monophosphate (PI3P) to phosphatidylinositol and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2] to phosphatidylinositol 5-monophosphate (PI5P) [,
]. The substrate phosphoinositides (PIs) are known to regulate traffic within the endosomal-lysosomal pathway []. MTMR1, MTMR2, MTMR3, MTMR4, and MTMR6 have also been shown to utilise PI(3)P as a substrate, suggesting that this activity is intrinsic to all active family members. On the other hand, six of the MTM family members are catalytically inactive phosphatases. Inactive myotubularin phosphatases contain substitutions in the Cys and Arg residues of the Cys-X5-Arg motif. MTM pseudophosphatases have been found to interact with MTM catalytic phosphatases []. The myotubularin family includes several members mutated in neuromuscular diseases or associated with metabolic syndrome, obesity, and cancer []. MTMR6 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID), an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. This entry represents the PH-GRAM domain of MTMR6.
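The distinction drawn above between active and inactive myotubularins rests on the Cys-X5-Arg motif, which can be expressed as a short regular expression. The sketch below is illustrative only: the function name and both sequence fragments are made up for the example, and a motif hit in a real sequence would not by itself establish catalytic activity.

```python
import re

# Cys, any five residues, Arg (the Cys-X5-Arg motif named in the text).
CX5R = re.compile(r"C.{5}R")


def cx5r_positions(sequence: str):
    """Return 0-based start positions of Cys-X5-Arg matches."""
    return [m.start() for m in CX5R.finditer(sequence)]


# Hypothetical fragments, not taken from any database entry.
ACTIVE_LIKE = "VHCSDGWDRTAQ"      # Cys and Arg both present, five residues apart
INACTIVE_LIKE = "VHSSDGWDGTAQ"    # Cys and Arg both substituted

if __name__ == "__main__":
    print(cx5r_positions(ACTIVE_LIKE))
    print(cx5r_positions(INACTIVE_LIKE))
```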
Wnt proteins constitute a large family of secreted molecules that are involved in intercellular signalling during development. The name derives from the first 2 members of the family to be discovered: int-1 (mouse) and wingless (Drosophila) [
]. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals []. Wnt signalling has been implicated in tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of the brain and kidneys, regulation of mammary gland proliferation and Alzheimer's disease [,
]. Wnt-mediated signalling is believed to proceed initially through binding to cell surface receptors of the frizzled family; the signal is subsequently transduced through several cytoplasmic components to β-catenin, which enters the nucleus and activates the transcription of several genes important in
development []. Several non-canonical Wnt signalling pathways have also been elucidated that act independently of β-catenin. Canonical and non-canonical Wnt signalling branches are highly interconnected, and cross-regulate each other []. Members of the Wnt gene family are defined by their sequence similarity to mouse Wnt-1 and Wingless in Drosophila. They encode proteins of ~350-400 residues in length, with orthologues identified in several, mostly vertebrate, species. Very little is known about the structure of
Wnts as they are notoriously insoluble, but they share the following features characteristic of secretory proteins: a signal peptide, several potential N-glycosylation sites and 22 conserved cysteines [] that are probably involved in disulphide bonds. The Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are therefore likely to signal over only a few cell diameters. Fifteen major Wnt gene families have been identified in vertebrates, with multiple subtypes within some classes. Wnt has a two-domain structure, resembling a "hand" with "thumb" and "index" fingers. This entry represents the C-terminal Wnt domain [
].
This family consists of lytic murein transglycosylases (murein hydrolases) related to MltB (
), which is a 38kDa membrane-bound lipoprotein in Escherichia coli. The N-terminal region of this protein contains a lipoprotein-processing site which is conserved in about half the members of this family. Proteolytic cleavage of MltB produces a fully-active, soluble form of this enzyme known as Slt35 (for soluble lytic transglycosylase). This enzyme catalyzes the cleavage of the glycosidic bonds between N-acetylmuramic acid and N-acetylglucosamine residues in peptidoglycan. Its physiological role is unknown as deletion of the gene shows no obvious phenotype [
], though it has been suggested to play a role in recycling of muropeptides during cell elongation and/or cell division. The Slt35 enzyme is a monomer with an ellipsoid shape and is composed of three distinct domains known as the alpha, beta and core domains [
,
]. The alpha domain contains mainly α-helices, while the beta domain consists of a five-stranded antiparallel β-sheet flanked by a short α-helix. The core domain is sandwiched between the alpha and beta domains and its fold is similar to that of lysozyme, but contains a single metal ion binding site in a helix-loop-helix module that is similar to the eukaryotic EF-hand calcium-binding fold, though in this case the loop is slightly longer than usual. Binding of Ca(2+) to this EF-hand motif has been shown to be important for thermal stability of the protein []. The substrate binding sites are found in the cleft formed by the core domain of the enzyme []. Members of this family do not contain the putative peptidoglycan binding domain described by
, which is associated with several other classes of bacterial cell wall lytic enzymes.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. This entry spans the seven transmembrane regions of rhodopsin-like GPCRs. It also identifies some non-rhodopsin-like GPCRs, including a number of taste receptors and vomeronasal receptors.
This group represents metallopeptidases of the MEROPS peptidase family A31 (HybD endopeptidase family). Peptidase family A31 includes endopeptidases involved in hydrogenase maturation. HycI (hydrogenase 3 maturation protease) is a protease involved in the C-terminal processing of HycE, the large subunit of hydrogenase 3 [
,
,
]. HybD is involved in processing of pre-HybC (the large subunit of hydrogenase 2) []; and HyaD is assumed to be involved in processing of the large subunit of hydrogenase 1. The large subunit of [NiFe]-hydrogenase, as well as other nickel metalloenzymes, is synthesized as a precursor devoid of the metalloenzyme active site. This precursor undergoes a complex post-translational maturation process that requires a number of accessory proteins [,
,
]. At one step of this process, after nickel incorporation, each hydrogenase isoenzyme is processed by proteolytic cleavage at the C-terminal end by the corresponding hydrogenase maturation endopeptidase []. The cleavage site is after a His or an Arg, liberating a short peptide [,
]. This cleavage occurs only in the presence of nickel, and the endopeptidase probably uses the metal in the large subunit of [NiFe]-hydrogenases as a recognition motif [
]. There is no direct evidence for the active site or substrate-binding site, but there are predictions based on an available structure []. Nomenclature note: the following names are used in different organisms for members of this group: HycI, HybD, HyaD, HoxM, HoxW, HupD, HynC, HupM, VhoD, VhtD [
]. Gene/protein names are sometimes used interchangeably to designate various "hydrogenase cluster" proteins unrelated to each other in various organisms. For example, the following names are used for members of this group, but also for unrelated proteins: HupD is used in Azotobacter chroococcum and Anabaena species to designate an unrelated hydrogenase maturation factor; HydD is used to designate hydrogenase structural genes in Thermococcus litoralis, Pyrococcus abyssi, and other species.
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection [
]. This family represents the low molecular weight bitopic transmembrane protein PsbN found in PSII. It localises in stroma lamellae and contains a highly conserved C-terminal region, exposed to the stroma. Although PsbN is not a constituent subunit of PSII, it is required for repair from photoinhibition and efficient assembly of the PSII reaction centers [
].
SRF-like/Type I subfamily of MADS (MCM1, Agamous, Deficiens, and SRF (serum response factor)) box family of eukaryotic transcriptional regulators [
]. Binds DNA and exists as hetero- and homo-dimers [,
]. Differs from the MEF-like/Type II subgroup mainly in position of the alpha 2 helix responsible for the dimerization interface. Important in homeotic regulation in plants and in immediate-early development in animals []. Also found in fungi [,
]. Human serum response factor (SRF) is a ubiquitous nuclear protein important for cell proliferation and differentiation. SRF function is essential for transcriptional regulation of numerous growth-factor-inducible genes, such as the c-fos oncogene and muscle-specific actin genes. A core domain of around 90 amino acids is sufficient for the activities of DNA-binding, dimerisation and interaction with accessory factors. Within the core is a DNA-binding region, designated the MADS box [
], that is highly similar to regions of many eukaryotic regulatory proteins: among these are MCM1, the regulator of cell type-specific genes in budding yeast; DSRF, a Drosophila trachea development factor; the MEF2 family of myocyte-specific enhancer factors; and the Agamous and Deficiens families of plant homeotic proteins. In SRF, the MADS box has been shown to be involved in DNA-binding and dimerisation [
]. Proteins belonging to the MADS family function as dimers, the primary DNA-binding element of which is an anti-parallel coiled coil of two amphipathic α-helices, one from each subunit. The DNA wraps around the coiled coil allowing the basic N-termini of the helices to fit into the DNA major groove. The chain extending from the helix N-termini reaches over the DNA backbone and penetrates into the minor groove. A 4-stranded, anti-parallel β-sheet packs against the coiled-coil face opposite the DNA and is the central element of the dimerisation interface. The MADS-box domain is commonly found associated with the K-box region (see
).
The P-loop guanosine triphosphatases (GTPases) control a multitude of biological processes, ranging from cell division, cell cycling, and signal transduction, to ribosome assembly and protein synthesis. GTPases exert their control by interchanging between an inactive GDP-bound state and an active GTP-bound state, thereby acting as molecular switches. The common denominator of GTPases is the highly conserved guanine nucleotide-binding (G) domain that is responsible for binding and hydrolysis of guanine nucleotides. The FeoB family of GTPases is widespread, although not ubiquitous, in Bacteria and Archaea, but missing from Eukaryota. FeoB is involved in the uptake of ferrous iron (Fe(2+)), an important cofactor in biological electron transfer and catalysis. Most of the FeoB proteins contain an N-terminal G-domain, connected by an entirely α-helical linker peptide to the membrane domain with 8 to 12 predicted membrane-spanning α-helices, while in some organisms the G-domain is expressed separately as a soluble protein. The FeoB-type G domain belongs to the TrmE-Era-EngA-EngB-Septin-like (TEES) superfamily of the TRAFAC class GTPases. The structure of the FeoB-type G domain follows the typical fold of small GTP-binding proteins, consisting of a seven-stranded β-sheet surrounded by five α-helices. The ~170-residue FeoB-type G domain harbours five short amino-acid motifs (G1-G5) that are critical in the binding of both a magnesium (Mg(2+)) ion and the guanine nucleotide. The G1 motif (GxxxxGKS/T) (P-loop) is in position to stabilise the beta- and gamma-phosphates of GTP by hydrogen bonds donated by main-chain amides. The threonine of the G2 motif (P/AGxT) coordinates the Mg(2+). The G3 motif (DxxG) interacts with the Mg(2+) and an oxygen of the gamma-phosphate. The G4 motif (NxxD) is involved in recognition of the guanine nucleotide by forming hydrogen bonds to the guanine base. The G5 motif (S/VSTV]) is, despite low sequence conservation, attributed to critical guanine base coordination [
,
,
,
,
,
].
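The G-domain motifs listed above lend themselves to a simple pattern scan. The sketch below covers only G1 (GxxxxGKS/T), G3 (DxxG) and G4 (NxxD), whose notation is unambiguous in the text; G2 and G5 are left out because their written forms cannot be translated into a pattern with confidence. The function name and the test sequence are hypothetical, and short motifs such as DxxG will match by chance in many unrelated proteins, so hits are at best a first filter.

```python
import re

# Regex translations of the motifs as written in the entry text.
G_MOTIFS = {
    "G1": r"G.{4}GK[ST]",   # GxxxxGKS/T (P-loop)
    "G3": r"D..G",          # DxxG
    "G4": r"N..D",          # NxxD
}


def scan_g_motifs(sequence: str):
    """Map each motif name to the 0-based start positions of its matches."""
    return {
        name: [m.start() for m in re.finditer(pattern, sequence)]
        for name, pattern in G_MOTIFS.items()
    }


# Synthetic sequence with one copy of each motif; not a real FeoB protein.
SYNTHETIC_G_DOMAIN = "MA" + "GAAAAGKS" + "LL" + "DAAG" + "LL" + "NKAD" + "L"

if __name__ == "__main__":
    for name, starts in scan_g_motifs(SYNTHETIC_G_DOMAIN).items():
        print(name, starts)
```

In a genuine G domain the motifs are expected to occur in the order G1 to G5 along the sequence, so a follow-up check on the ordering of the hit positions would be a natural refinement.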
The process of vesicular membrane fusion in eukaryotic cells depends on a conserved fusion machinery called SNARE (soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptors). In the process of vesicle docking, proteins present on the vesicle (v-SNARE) have to bind to their counterpart on the target membrane (t-SNARE) to form a core complex that can then recruit the soluble proteins NSF and SNAP. This so called fusion complex can then disassemble after ATP hydrolysis mediated by the ATPase NSF in a process that leads to membrane fusion and the release of the vesicle contents. v-SNAREs include proteins homologous to synaptobrevin [
,
,
]. Structurally, the SNARE complex is generally a four-helix bundle composed of three coiled-coil-forming domains from t-SNAREs and one from
a v-SNARE. Although sequence similarity between the t- and v-SNARE coiled-coil homology domains is low, there is a striking conservation of the so-called heptad repeat that is of central importance in forming a coiled-coil structure. In a coiled-coil motif, seven residues constitute a canonical
heptad and are designated 'a' through 'g', with 'a' and 'd' being occupied by hydrophobic residues. The association of the four α-helices in the SNARE fusion complex structure produces highly conserved layers of interacting amino acid side chains in the centre of the four-helix bundle. The centre of the bundle is made up of 15 hydrophobic layers from the 'a' and 'd' positions of the heptad repeats of the coiled-coil-forming domains, whereas the central 'ionic' layer is highly conserved and polar in nature, containing a glutamine residue in the three t-SNAREs and an arginine in the v-SNARE, hence the classification of v- and t-SNAREs as R- and Q-SNAREs, respectively. The v-SNARE coiled-coil homology domain is around 60 amino acids in length [,
,
]. This entry represents the entire v-SNARE coiled-coil homology domain.
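The heptad register described for coiled coils ('a' through 'g', with 'a' and 'd' hydrophobic) can be illustrated with a short Python sketch. The function names, the choice of which residues count as hydrophobic, and the toy segment are assumptions made for illustration only; they do not come from this entry.

```python
HEPTAD = "abcdefg"

# Residues treated as hydrophobic here; this set is an illustrative assumption.
HYDROPHOBIC = set("AVLIMFWC")


def label_heptads(segment, start_register="a"):
    """Pair each residue with its heptad position, starting at start_register."""
    offset = HEPTAD.index(start_register)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(segment)]


def core_fraction_hydrophobic(segment, start_register="a"):
    """Fraction of 'a' and 'd' positions occupied by hydrophobic residues."""
    core = [res for res, pos in label_heptads(segment, start_register) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


# Two invented heptads (not a real SNARE sequence): L and V/L sit at 'a' and 'd'.
toy = "LKELEDKVEELLSK"

if __name__ == "__main__":
    print(label_heptads(toy))
    print(core_fraction_hydrophobic(toy))
```

In a real SNARE domain the central 'ionic' layer would appear as a polar residue (glutamine or arginine) at one core position, so a fraction slightly below 1.0 would be expected there.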
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few [
]. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents DNA glycosylase/AP lyase enzymes that are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. These enzymes are primarily from bacteria, and have both DNA glycosylase activity (
) and AP lyase activity (
). Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). These enzymes contain a zinc finger domain that is important for DNA-binding.
Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity;
) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity;
). Fpg has a preference for oxidised purines, excising bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge [
]. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes []. Fpg binds one ion of zinc at the C terminus, which contains four conserved and essential cysteines [,
]. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above (
,
), but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine [
]. Three human homologues of Escherichia coli Nei have been identified, called Nei-like (NEIL) enzymes. NEIL2 (Nei-like-2) shares structural features and reaction mechanism with E. coli Nei (and Fpg), but it contains a CHCC-type zinc finger in place of the C4-type found in Nei and Fpg []. By contrast, the structure of NEIL1 exhibits the same overall fold as E. coli Nei; however, the β-hairpin zinc finger found in other Fpg/Nei family members is replaced by a structural motif composed of two antiparallel β-strands that mimics a zinc finger but lacks the loops that harbour the zinc-binding residues and, therefore, does not coordinate zinc []. This entry identifies the zinc finger in NEIL2, but not the "zincless finger" in NEIL1.
The type I glycoprotein S of Coronavirus, trimers of which constitute the typical viral spikes, is assembled into virions through noncovalent interactions with the M protein. The spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 (
) and S2 [
]. The cleavage of S can occur at two distinct sites: S1/S2 or S2' []. The spike is present in two very different forms: pre-fusion (the form on mature virions) and post-fusion (the form after membrane fusion has been completed). The spike is cleaved sequentially by host proteases at two sites: first at the S1/S2 boundary (i.e. S1/S2 site) and second within S2 (i.e. S2' site). After the cleavages, S1 dissociates from S2, allowing S2 to transition to the post-fusion structure [
]. Both chimeric S proteins appeared to cause cell fusion when expressed individually, suggesting that they were biologically fully active [
]. The spike is a type I membrane glycoprotein that possesses a conserved transmembrane anchor and an unusual cysteine-rich (cys) domain that bridges the putative junction of the anchor and the cytoplasmic tail []. SARS-CoV S is largely uncleaved after biosynthesis. It can later be processed by endosomal cathepsin L, trypsin, thermolysin, and elastase, which have been shown to induce syncytia formation and virus entry. Other proteases of potential biological relevance in potentiating SARS-CoV S include TMPRSS2, TMPRSS11a, and HAT, which are localized on the cell surface and are highly expressed in the human airway [
]. The furin-like S2' cleavage site at KR/SF, with P1 and P2 basic residues and a P2' hydrophobic Phe downstream of the IFP, is identical between SARS-CoV-2 and SARS-CoV. One or more furin-like enzymes would cleave the S2' site at KR/SF [,
]. Deletion of the SARS-CoV-2 furin cleavage site suggests that it may not be required for viral entry but may affect replication kinetics, and altered sites have still been seen to be proteolytically cleaved. Several substitutions within the S2' cleavage domain of SARS-CoV-2 have been reported, including P812L/S/T, S813I/G, F817L and I818S/V, but further experimental study of their consequences and of the replication properties of the altered viruses is required to understand the role of furin cleavage in SARS-CoV-2 infection and virulence []. The S2 subunit normally contains multiple key components, including one or more fusion peptides (FP), a second proteolytic site (S2') and two conserved heptad repeats (HRs), driving membrane penetration and virus-cell fusion. The HRs can trimerize into a coiled-coil structure built of three HR1-HR2 helical hairpins presenting as a canonical six-helix bundle and drag the virus envelope and the host cell bilayer into close proximity, preparing for fusion to occur [
]. The fusion core is composed of HR1 and HR2 and at least three membranotropic regions that are denoted as the fusion peptide (FP), internal fusion peptide (IFP), and pretransmembrane domain (PTM). The HR regions are further flanked by the three membranotropic components. Both FP and IFP are located upstream of HR1, while PTM is distally downstream of HR2 and directly precedes the transmembrane domain of SARS-CoV S. All three of these components are able to partition into the phospholipid bilayer to disturb membrane integrity []. During the pandemic, many conservative amino acid changes in the FP segment of SARS-CoV-2 have been reported (i.e., L821I, L822F, K825R, V826L, T827I, L828P, A829T, D830G/A, A831V/S/T, G832C/S, F833S, I834T), although their impact is not known as the active conformation and mode of insertion of the SARS-CoV-2 fusion peptide have not been experimentally characterised. Differences in HR1 sequences between SARS-CoV and SARS-CoV-2 suggest that SARS-CoV-2 HR2 makes stronger interactions with HR1. However, the substitutions observed in the solvent-accessible surface of the HR1 domain (e.g., D936Y, S943P, S939F) of SARS-CoV-2 do not seem to be involved in stabilizing interactions with HR2. Substitutions in HR2 (e.g., K1073N, V1176F) or the TM or cytoplasmic tail domains have also been observed, but further experimental work is required to determine the effects of these changes [].
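The substitutions listed above use a compact notation in which a reference residue and position are followed by one or more alternative residues separated by slashes (e.g. D830G/A). A minimal Python sketch for unpacking that notation is given below; the function names `parse_substitution` and `expand` are hypothetical helpers, and the pattern assumes single-letter amino acid codes only.

```python
import re

# Reference residue, position, then one or more slash-separated alternatives.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_substitution(token):
    """Split e.g. 'D830G/A' into (reference residue, position, [alternatives])."""
    match = SUBSTITUTION.match(token.strip())
    if match is None:
        raise ValueError(f"not a substitution token: {token!r}")
    ref, pos, alts = match.groups()
    return ref, int(pos), alts.split("/")


def expand(token):
    """Expand a multi-alternative token into single substitutions."""
    ref, pos, alts = parse_substitution(token)
    return [f"{ref}{pos}{alt}" for alt in alts]


if __name__ == "__main__":
    for tok in ["P812L/S/T", "S813I/G", "F817L", "I818S/V"]:
        print(tok, "->", expand(tok))
```

Expanding the tokens this way makes it straightforward to group reported changes by position or by region (FP, HR1, HR2) once region boundaries are supplied from another source.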
Potassium channels are the most diverse group of the ion channel family [
,
]. They are important in shaping the action potential, and in neuronal excitability and plasticity []. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups []: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group.
These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers [
]. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes [
]. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis []. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) [
]. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+-selective leak channels.
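The K+ selectivity sequence quoted above, T/SxxTxGxG, can be expressed as a pattern and used to count candidate P-domains in a subunit. The Python sketch below assumes 'x' means any residue; the function name `p_domain_sites` and both toy sequences are invented for illustration, and a match on its own is not evidence of a functional pore loop.

```python
import re

# T/SxxTxGxG from the text, with 'x' translated to '.' (any residue).
SELECTIVITY = re.compile(r"[TS].{2}T.G.G")


def p_domain_sites(sequence):
    """Return 0-based start positions of non-overlapping selectivity-sequence matches."""
    return [m.start() for m in SELECTIVITY.finditer(sequence)]


# Invented toy sequences, not real channel sequences.
one_p = "AAVVTMTTVGYGDMAA"
two_p = one_p + "KKKK" + "SLTTIGYGAA"

if __name__ == "__main__":
    print(p_domain_sites(one_p))  # one candidate site
    print(p_domain_sites(two_p))  # two candidate sites
```

Counting matches in this way mirrors the distinction drawn in the text between subunits with one P-domain and those with two.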
Two types of beta subunit (KCNE and KCNAB) are presently known to associate with voltage-gated alpha subunits (Kv, KCNQ and eag-like). However, not all combinations of alpha and beta subunits are possible. The KCNE family of K+ channel subunits are membrane glycoproteins that possess a single transmembrane (TM) domain. They share no structural relationship with the alpha subunit proteins, which possess pore-forming domains. The subunits appear to have a regulatory function, modulating the kinetics and voltage dependence of the alpha subunits of voltage-dependent K+ channels. KCNE subunits are formed from short polypeptides of ~130 amino acids, and are divided into five subfamilies: KCNE1 (MinK/IsK), KCNE2 (MiRP1), KCNE3 (MiRP2), KCNE4 (MiRP3) and KCNE1L (AMMECR2). KCNE2 subunits associate with the eag-like HERG alpha subunits, which are the pore-forming subunits of cardiac IKr channels. Channels formed solely
from HERG subunits display similar properties to native IKr channels; however, they differ in their gating and single channel conductance.
Channels formed from both KCNE2 and HERG exhibit properties that are identical to those seen in native IKr channels. Three mutations in the KCNE2
gene are associated with long QT syndrome and ventricular fibrillation. These mutations result in channels that open more slowly and close more rapidly,
the net effect being a reduced K+ current [].
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [
]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [
,
,
]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [
,
,
,
,
,
]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [
]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [,
]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [,
,
]. In yeast, the PDR and CDR ABC transporters display extensive sequence homology, and confer resistance to several anti-fungal compounds by actively transporting their substrates out of the cell. These transporters have two homologous halves, each with an N-terminal intracellular hydrophilic region that contains an ATP-binding site, followed by a C-terminal membrane-associated region containing six transmembrane segments [
]. This entry represents a domain of the PDR/CDR ABC transporter comprising extracellular loop 3, transmembrane segment 6 and a linker region.
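The domain arithmetic described above (two ABC modules plus two TMDs, encoded on one, two or four polypeptides) can be made concrete with a small Python sketch. The representation of each polypeptide as a tuple of domain labels, and the names `domain_counts` and `is_minimal_transporter`, are assumptions chosen for illustration rather than any established data model.

```python
from collections import Counter


def domain_counts(polypeptides):
    """Count domain types across chains; each chain is a tuple such as ("ABC", "TMD")."""
    return Counter(domain for chain in polypeptides for domain in chain)


def is_minimal_transporter(polypeptides):
    """True if the chains together supply exactly two ABC modules and two TMDs."""
    counts = domain_counts(polypeptides)
    return counts["ABC"] == 2 and counts["TMD"] == 2


# Illustrative architectures following the descriptions in the text.
importer = [("ABC",), ("ABC",), ("TMD",), ("TMD",)]   # four independent polypeptides
half_dimer = [("TMD", "ABC"), ("TMD", "ABC")]         # two fused half-transporters
pdr_like = [("ABC", "TMD", "ABC", "TMD")]             # both halves on a single chain
lone_abc = [("ABC",)]                                 # incomplete: one cassette only

if __name__ == "__main__":
    for name, chains in [("importer", importer), ("half_dimer", half_dimer),
                         ("pdr_like", pdr_like), ("lone_abc", lone_abc)]:
        print(name, is_minimal_transporter(chains))
```

The `pdr_like` ordering follows the description of PDR/CDR transporters as two homologous halves, each with an N-terminal ATP-binding region followed by a membrane-associated region.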
2-aminoethylphosphonate ABC transport system, ATP-binding component PhnT2
Type:
Family
Description:
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [
]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [,
,
]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [
,
,
,
,
,
]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [
]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [,
]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [,
,
]. The enzyme phosphonatase catalyses the degradation of 2-aminoethylphosphonate (AEP) in bacteria. This allows them to metabolise a range of organophosphonate compounds, including 2-aminoethylphosphonate, as a sole source of carbon, energy and phosphorus for growth [
]. The C-P bond in phosphonoacetaldehyde (Pald) is hydrolysed and a bi-covalent Lys53ethylenamine/Asp12 aspartylphosphate intermediate is formed []. This step can also be catalysed by C-P lyase [], with some bacteria having the genes for both pathways and some only for one of them. The 2-aminoethylphosphonate ABC transport system functions in the transport of 2-aminoethylphosphonate across the membrane for utilisation in the bacterial cell []. This entry represents the ATP-binding component PhnT2 of the 2-aminoethylphosphonate ABC transport system.
ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [
]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [
,
,
]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [
,
,
,
,
,
]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [
]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [,
]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [,
,
]. This family consists of MsbA, a single-polypeptide-chain transporter in the ATP-binding cassette (ABC) transporter family that exports lipid A [
,
,
]. It may also act in multidrug resistance, being linked to the efflux of amphipathic drugs []. Lipid A, a part of lipopolysaccharide, is found in the outer leaflet of the outer membrane of most Gram-negative bacteria. Members of this family are restricted to the Proteobacteria (although lipid A is more broadly distributed) and are often clustered with lipid A biosynthesis genes [].
Potassium channels are the most diverse group of the ion channel family [
,
]. They are important in shaping the action potential, and in neuronal excitability and plasticity []. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups []: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers []. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes [
]. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis []. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK) [
]. The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+-selective leak channels.
Inwardly-rectifying potassium channels (Kir) are the principal class of two-TM domain potassium channels. They are characterised by the property of inward-rectification, which is described as the ability to allow large inward currents and smaller outward currents. Inwardly rectifying potassium channels (Kir) are responsible for regulating diverse processes including: cellular excitability, vascular tone, heart rate, renal salt flow, and insulin release [
]. To date, around twenty members of this superfamily have been cloned, which can be grouped into six families by sequence similarity, and these are designated Kir1.x-6.x [,
]. Cloned Kir channel cDNAs encode proteins of ~370-500 residues; both N- and C-termini are thought to be cytoplasmic, and the N terminus lacks a signal sequence. Kir channel alpha subunits possess only two TM domains linked by a P-domain. Thus, Kir channels share similarity with the fifth and sixth TM domains and the P-domain of the other families. It is thought that four Kir subunits assemble to form a tetrameric channel complex, which may be hetero- or homomeric [
]. Kir1.1 channels (also known as ATP-sensitive inward rectifier potassium channel 1, Kcnj1 and ROMK1-6) are thought to underlie K+ secretion in the kidney. Their activity is modulated by intracellular pH, with acidosis inhibiting the channel. Both N- and C-termini are thought to be involved in this modulation. Mutations in Kir1.1 lead to Bartter's syndrome type II, an inherited kidney disorder, which leads to salt-wasting, hypokalaemia and metabolic alkalosis [
,
,
]. This protein plays a role in cell proliferation, invasion, and apoptosis [].
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This family represents cytochrome b559, which forms part of the reaction centre core of PSII as a heterodimer composed of one alpha subunit (PsbE), one beta subunit (PsbF), and a haem cofactor. Two histidine residues, one from each subunit, coordinate the haem. Although cytochrome b559 is a redox-active protein, it is unlikely to be involved in the primary electron transport in PSII due to its very slow photo-oxidation and photo-reduction kinetics. Instead, cytochrome b559 could participate in a secondary electron transport pathway that helps protect PSII from photo-damage. Cytochrome b559 is essential for PSII assembly [
].
RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor [
]. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released []. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota [
,
]; most are predicted to be nuclear or nucleolar proteins []. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB) FMU (FirMicUtes) represents the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA [,
]. A classification of RCMTs has been proposed on the basis of sequence similarity [
]. According to this classification, RCMTs are divided into 8 distinct subfamilies []. Recently, a new RCMT subfamily, termed RCMT9, was identified []. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM []. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence []. The prototypical member of the Nucleolar Protein 2 RCMT subfamily, the S. cerevisiae NOP2, is an essential nucleolar protein required for pre-rRNA processing and 60S ribosomal subunit assembly [
] that acts as a ribosomal RNA methyltransferase [,
]. Its human homologue, the proliferation-associated nucleolar antigen P120, is a promising tumour marker []. P120 has been implicated in rRNA biogenesis [,
], and is also proposed to act as an rRNA methyltransferase [].
Chorismate mutase (CM) is a regulatory enzyme required for biosynthesis of the aromatic amino acids phenylalanine and tyrosine. CM catalyzes the Claisen rearrangement of chorismate to prephenate, which can subsequently be converted to precursors of either L-Phe or L-Tyr. In bifunctional enzymes the CM domain can be fused to a prephenate dehydratase (P-protein, for Phe biosynthesis), to a prephenate dehydrogenase (T-protein, for Tyr biosynthesis), or to 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase. Besides these prokaryotic bifunctional enzymes, monofunctional CMs occur in prokaryotes as well as in fungi, plants and nematode worms [
]. The type I or AroH class of CM is represented by Bacillus subtilis aroH, a monofunctional, nonallosteric, homotrimeric enzyme characterized by its pseudo-α/β-barrel 3D structure. Each monomer folds into a 5-stranded mixed β-sheet packed against an α-helix and a 3-10 helix. The core is formed by a closed barrel of mixed β-sheets surrounded by helices. The interfaces between adjacent subunits form three equivalent clefts that harbor the active sites [
]. The type II or AroQ class of CM has a completely different all-helical 3D structure, represented by the CM domain of the bifunctional Escherichia coli P-protein. This type is named after the Enterobacter agglomerans monofunctional CM encoded by the aroQ gene [
]. All CM domains from bifunctional enzymes as well as most monofunctional CMs belong to this class, including archaeal CM. Eukaryotic CMs from plants and fungi form a separate subclass of AroQ, represented by the Baker's yeast allosteric CM [
]. These enzymes show only partial sequence similarity to the prokaryotic CMs due to insertions of regulatory domains, but the helix-bundle topology and catalytic residues are conserved and the 3D structure of the E. coli CM dimer resembles a yeast CM monomer [,
,
]. The E. coli P-protein CM domain consists of 3 helices and lacks allosteric regulation. The yeast CM has evolved by gene duplication and dimerization and each monomer has 12 helices. Yeast CM is allosterically activated by Trp and inhibited by Tyr [
]. This entry represents chorismate mutase from eukaryotes.
Most prokaryotic signal-transduction systems and a few eukaryotic pathways use phosphotransfer schemes involving two conserved components, a histidine protein kinase (HK) and a response regulator protein (RR). The HK, which is regulated by environmental stimuli, autophosphorylates at a histidine residue, creating a high-energy phosphoryl group that is subsequently transferred to an aspartate residue in the RR domain. Phosphorylation induces a conformational change in RR that results in activation of an associated domain that effects the response. Both prokaryotic and eukaryotic HKs contain the same basic signaling components, namely a diverse sensing domain and a highly conserved kinase core that has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. The overall activity of the kinase is modulated by input signals to the sensing domain. HKs undergo an ATP-dependent autophosphorylation at a conserved His residue in the kinase core. Autophosphorylation is a bimolecular reaction between homodimers, in which one HK monomer catalyzes the phosphorylation of the conserved His residue in the second monomer. The sensing domains are variable in sequence, reflective of the many different environmental signals to which HKs are responsive, whereas the kinase core of about 250 residues is more conserved. The kinase core is composed of a dimerization domain and an ATP/ADP-binding phosphotransfer or catalytic domain and can be identified by five conserved primary sequence motifs present in both eukaryotic and prokaryotic HKs. These motifs have been termed the H, N, G1, F and G2 boxes. The conserved His substrate is the central feature in the H box, whereas the N, G1, F and G2 boxes define the nucleotide binding cleft. In most HKs, the H box is part of the dimerization domain. However, for some proteins, like CheA, the conserved His is located at the far N terminus of the protein in a separate HPt domain.
The N, G1, F and G2 boxes are usually contiguous, but the spacing between these motifs is somewhat varied. The catalytic core forms an α-β sandwich consisting of five antiparallel beta strands and three alpha helices [,
,
]. This entry represents the histidine kinase core.
This entry represents the MRG protein family, whose members include MORF4L1/2 (MRG15/MRGX) and MSL3L1/2 from humans, ESA1-associated factor 3 (Eaf3) from yeasts and male-specific lethal 3 (MSL3) from flies. They contain an N-terminal chromodomain that binds H3K36me3, a histone mark associated with transcription elongation [
]. Saccharomyces cerevisiae Eaf3 is a component of both NuA4 histone acetyltransferase and Rpd3S histone deacetylase complexes [
,
]. It was found that Eaf3 mediates preferential deacetylation of coding regions through an interaction between the Eaf3 chromodomain and methylated H3-K36 that presumably results in preferential association of the Rpd3 complex []. The Drosophila MSL proteins (MSL1, MSL2, MSL3, MLE, and MOF) are essential for elevating transcription of the single X chromosome in the male (X chromosome dosage compensation) [
]. Together with two partly redundant non-coding RNAs, roX1 and roX2, they form the MSL complex, also known as dosage compensation complex or DCC. The MSL complex upregulates transcription by spreading the histone H4 Lys16 (H4K16) acetyl mark [
] and allows compensation for the loss of one X-chromosomal allele by increasing the transcription from the retained allele []. The MSL3 chromodomain has been shown to bind DNA and methylated H4K20 in vitro []. Human MORF4L1, also known as MRG15, is a component of the NuA4 histone acetyltransferase complex that transcriptionally activates genes by acetylation of nucleosomal histones H4 and H2A. This modification may both alter nucleosome-DNA interactions and promote interaction of the modified histones with other proteins which positively regulate transcription. The NuA4 complex may also play a direct role in DNA repair when directly recruited to sites of DNA damage. MRG15 is also a component of the mSin3A/Pf1/HDAC complex which acts to repress transcription by deacetylation of nucleosomal histones. MRG15 was found to interact with PALB2, a tumour suppressor protein that plays a crucial role in DNA damage repair by homologous recombination [
]. Furthermore, MRG15 plays a role in the response to double-strand breaks (DSBs) by recruiting the BRCA complex (BRCA1, PALB2, BRCA2 and RAD51) to sites of damaged DNA [,
].
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multi-subunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This family represents the low molecular weight transmembrane protein Psb28 (PsbW) found in PSII, where it is a subunit of the oxygen-evolving complex. Psb28 appears to have several roles, including guiding PSII biogenesis and assembly, stabilising dimeric PSII [
], and facilitating PSII repair after photo-inhibition []. There appear to be two classes of Psb28, class 1 being found predominantly in algae and cyanobacteria, and class 2 being found predominantly in plants. This entry represents class 1 Psb28.
Signal recognition particle receptor, alpha subunit, N-terminal
Type:
Domain
Description:
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes [
,
]. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor []. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300-nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor []. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54 [
]. In prokaryotes, the SR receptor is a monomer consisting of the loosely membrane-associated SR-alpha homologue FtsY, while the eukaryotic SR receptor is a heterodimer of SR-alpha (70 kDa) and SR-beta (25 kDa), both of which contain a GTP-binding domain [
]. SR-alpha regulates the targeting of SRP-ribosome-nascent polypeptide complexes to the translocon []. SR-alpha binds to the SRP54 subunit of the SRP complex. The SR-beta subunit is a transmembrane GTPase that anchors the SR-alpha subunit (a peripheral membrane GTPase) to the ER membrane []. SR-beta interacts with the N-terminal SRX-domain of SR-alpha, which is not present in the bacterial FtsY homologue. SR-beta also functions in recruiting the SRP-nascent polypeptide to the protein-conducting channel. This entry represents the alpha subunit of the SR receptor.
Mediator of RNA polymerase II transcription subunit 22
Type:
Family
Description:
The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.
The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. This entry represents subunit Med22 of the Mediator complex. It contains several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in human, mouse and chicken Surfeit loci [
].
Signal recognition particle receptor, beta subunit
Type:
Family
Description:
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes [
,
]. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor []. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300-nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor []. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54 []. In prokaryotes, the SR receptor is a monomer consisting of the loosely membrane-associated SR-alpha homologue FtsY, while the eukaryotic SR receptor is a heterodimer of SR-alpha (70 kDa) and SR-beta (25 kDa), both of which contain a GTP-binding domain [
]. SR-alpha regulates the targeting of SRP-ribosome-nascent polypeptide complexes to the translocon []. SR-alpha binds to the SRP54 subunit of the SRP complex. The SR-beta subunit is a transmembrane GTPase that anchors the SR-alpha subunit (a peripheral membrane GTPase) to the ER membrane []. SR-beta interacts with the N-terminal SRX-domain of SR-alpha, which is not present in the bacterial FtsY homologue. SR-beta also functions in recruiting the SRP-nascent polypeptide to the protein-conducting channel. The beta subunit of the signal recognition particle receptor (SR) is a transmembrane GTPase, which anchors the alpha subunit to the endoplasmic reticulum membrane [
].
RNA helicases from the DEAD-box family are found in almost all organisms and have important roles in RNA metabolism such as splicing, RNA transport, ribosome biogenesis, translation and RNA decay. They are enzymes that unwind double-stranded RNA molecules in an energy-dependent fashion through the hydrolysis of NTP. DEAD-box RNA helicases belong to superfamily 2 (SF2) of helicases. Like other SF1 and SF2 members, they contain seven conserved motifs which are characteristic of these two superfamilies [
]. DEAD-box is named after the amino acids of motif II or Walker B (Mg2+-binding aspartic acid). Besides these seven motifs, DEAD-box RNA helicases contain a conserved cluster of nine amino acids (the Q motif) with an invariant glutamine located N-terminally of motif I. An additional highly conserved but isolated aromatic residue is also found upstream of these nine residues [
]. The Q motif is characteristic of and unique to the DEAD-box family of helicases. It is thought to control ATP binding and hydrolysis, and therefore it represents a potential mechanism for regulating helicase activity. Several structural analyses of DEAD-box RNA helicases have been reported [
,
]. The Q motif is located in close proximity to motif I. The conserved glutamine and aromatic residues interact with the ADP molecule.
Some proteins known to contain a Q motif:
Eukaryotic initiation factor 4A (eIF4A). An ATP-dependent RNA helicase which is a subunit of the eIF4F complex involved in cap recognition and required for mRNA binding to ribosome.
Various eukaryotic helicases involved in ribosome biogenesis (DBP3, DRS1, SPB4, MAK5, DBP6, DBP7, DBP9, DBP10).
Eukaryotic DEAD-box proteins involved in pre-mRNA splicing (Prp5p, Prp28p and Sub2p).
DEAD-box proteins required for mitochondrial genome expression (MSS116 and MRH4).
Fungal ATP-dependent RNA helicase DHH1. It is required for decapping and turnover of mRNA.
Fungal ATP-dependent RNA helicase DBP5. It is involved in nucleo-cytoplasmic transport of poly(A) RNA.
Bacterial ATP-dependent RNA helicase rhlB. It is involved in the RNA degradosome, a multi-enzyme complex important in RNA processing and messenger RNA degradation.
Bacterial cold-shock DEAD box protein A.
This entry represents a region stretching from the conserved aromatic residue to one amino acid after the glutamine of the Q motif.
Actin [
,
] is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins [
,
]. In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc. Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi yeast ACT5, Neurospora crassa ro-4 and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule based vesicle motility (this subfamily is known as ARP1); ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.
Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation which is usually zinc, but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role.
Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide [
]. The type example is aminopeptidase I from Saccharomyces cerevisiae (Baker's yeast), the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids []. A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic α-helix []. The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria. It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways []. This entry represents the beta roll structural domain found in M18 family and aspartyl aminopeptidases.
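As a rough illustration of the HEXXH and 'abXHEbbHbc' motif definitions given for metalloproteases above, the following Python sketch scans a protein sequence for both patterns. The residue classes are assumptions, since the text does not enumerate them: 'uncharged' is taken here as every residue other than Asp, Glu, Lys and Arg, 'hydrophobic' as A, F, I, L, M, V, W and Y, and proline is excluded from the whole ten-residue window. Position 'a' is left unconstrained apart from proline, because the text only says it is "most often" valine or threonine. The function names and the test fragment are hypothetical.

```python
import re

# Assumed residue classes (not specified in the text above).
UNCHARGED = "ACFGHILMNQSTVWY"   # all residues except D, E, K, R; proline left out
HYDROPHOBIC = "AFILMVWY"        # a common, but not universal, choice

# Loose motif: His-Glu-any-any-His. The lookahead allows overlapping hits.
LOOSE = re.compile(r"(?=(HE..H))")

# Stringent motif 'abXHEbbHbc':
#   a = any residue except Pro (Val/Thr preference is not enforced)
#   b = uncharged, X = any residue except Pro, c = hydrophobic
_B = f"[{UNCHARGED}]"
STRINGENT = re.compile(rf"(?=([^P]{_B}[^P]HE{_B}{_B}H{_B}[{HYDROPHOBIC}]))")


def find_hexxh(seq):
    """Return (0-based start, matched text) for every loose HEXXH hit."""
    return [(m.start(), m.group(1)) for m in LOOSE.finditer(seq.upper())]


def find_stringent(seq):
    """Return (0-based start, matched text) for every 'abXHEbbHbc' hit."""
    return [(m.start(), m.group(1)) for m in STRINGENT.finditer(seq.upper())]


if __name__ == "__main__":
    toy = "GGVVAHELTHAVTD"   # made-up fragment, for illustration only
    print(find_hexxh(toy))
    print(find_stringent(toy))
```

A hit with the loose pattern alone is weak evidence, since HEXXH is relatively common; the stringent pattern narrows the candidates but still depends on the assumed residue classes.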
ATP-dependent protease complexes are present in all three kingdoms of life, where they rid the cell of misfolded or damaged proteins and control the level of certain regulatory proteins. They include the proteasome in Eukaryotes, Archaea, and Actinomycetales and the HslVU (ClpQY, clpXP) complex in other eubacteria. Genes homologous to eubacterial HslU (ClpY, clpX) have also been shown to be present in the genome of trypanosomatid protozoa [
]. The proteasome (or macropain) [
,
,
,
,
] is a multicatalytic proteinase complex in eukaryotes and archaea, and in some bacteria, that is involved in an ATP/ubiquitin-dependent non-lysosomal proteolytic pathway. In eukaryotes the 20S proteasome is composed of 28 subunits which form a highly ordered ring-shaped structure (20S ring) of about 700 kDa. Proteasome subunits can be classified on the basis of sequence similarities into two groups, alpha (A) and beta (B). The proteasome consists of four stacked rings composed of alpha/beta/beta/alpha subunits. There are seven different alpha subunits and seven different beta subunits []. Three of the seven beta subunits are peptidases, each with a different specificity. Subunit beta1c (MEROPS identifier T01.010) has a preference for cleaving glutamyl bonds ("peptidyl-glutamyl-like" or "caspase-like"), subunit beta2c (MEROPS identifier T01.011) has a preference for cleaving arginyl and lysyl bonds ("trypsin-like"), and subunit beta5c (MEROPS identifier T01.012) cleaves after hydrophobic amino acids ("chymotrypsin-like") [
]. The proteasome subunits are related to N-terminal nucleophile hydrolases, and the catalytic subunits have an N-terminal threonine nucleophile. The prokaryotic ATP-dependent proteasome is coded for by the heat-shock locus VU (HslVU). It consists of HslV, a peptidase, and HslU, the ATPase and chaperone belonging to the AAA/Clp/Hsp100 family. The crystal structure of Thermotoga maritima HslV has been determined to 2.1 Å resolution. The structure of the dodecameric enzyme is well conserved compared to those from Escherichia coli and Haemophilus influenzae [
,
]. This family consists of the beta (or B type) subunits of the eukaryotic proteasome as well as the archaeal and bacterial proteasomes. These proteins belong to family T1 in the classification of peptidases.
The SET domain is a 130 to 140 amino acid, evolutionarily well-conserved sequence motif that was initially characterised in the Drosophila proteins Su(var)3-9, Enhancer-of-zeste and Trithorax. In addition to these chromosomal proteins modulating gene activities and/or chromatin structure, the SET domain is found in proteins of diverse functions in organisms ranging from yeast to mammals, and also in some bacteria and viruses [,
]. The SET domains of mammalian SUV39H1 and 2 and fission yeast clr4 have been shown to be necessary for the methylation of lysine-9 in the histone H3 N terminus []. However, this histone methyltransferase (HMTase) activity is probably restricted to a subset of SET domain proteins as it requires the combination of the SET domain with the adjacent cysteine-rich regions, one located N-terminally (pre-SET) and the other C-terminally (post-SET) to the SET domain. The pre-SET and post-SET regions thus seem to play a crucial role in substrate recognition and enzymatic activity [,
]. The structures of the SET domain and the two adjacent regions, pre-SET and post-SET, have been solved [,
,
]. The SET structure is all beta, but consists only of sets of a few short strands composing no more than a couple of small sheets. Consequently the SET structure is mostly defined by turns and loops. An unusual feature is that the SET core is made up of two discontinuous segments of the primary sequence forming an approximate L shape [,
,
]. Two of the most conserved motifs in the SET domain are (1) a stretch at the C terminus containing a strictly conserved tyrosine residue and (2) a preceding loop through which the C-terminal segment passes, forming a knot-like structure that is not quite a true knot. These two regions have been proven to be essential for SAM binding and catalysis, particularly the invariant tyrosine where in all likelihood catalysis takes place [,
].
The STAS (Sulphate Transporter and AntiSigma factor antagonist) domain is found in the bacterial anti-sigma factor antagonists (ASA) and the C-terminal region of SLC26 (SulP) anion transporters. The activity of bacterial sigma transcription factors is controlled by a regulatory cascade involving an antisigma-factor, the antisigma-factor antagonist (ASA) and a phosphatase. The antisigma-factor binds to sigma and holds it in an inactive complex. The ASA can also interact with the anti-sigma-factor, allowing the release of the active sigma factor. As the antisigma-factor is a protein kinase, it can phosphorylate the antisigma antagonist on a conserved serine residue of the STAS domain. This phosphorylation inactivates the ASA that can be reactivated through dephosphorylation by a phosphatase [
,
]. The STAS domain of the ASA SpoIIAA binds GTP and ATP and possesses a weak NTPase activity. Strong sequence conservation suggests that the STAS domain could possess general NTP-binding activity, and it has been proposed that the NTPs are likely to elicit specific conformational changes in the STAS domain through binding and/or hydrolysis []. Resolution of the solution structure of the ASA SpoIIAA from Bacillus subtilis has shown that the STAS domain consists of a four-stranded β-sheet and four α-helices. The STAS domain forms a characteristic α-helical handle-like structure [,
]. The STAS domain of E. coli YchM protein, a SLC26 (SulP) family member, has been shown to interact with acyl carrier protein (ACP), which is an activated thiol ester carrier of acyl intermediates during fatty acid biosynthesis (FAB) and other acylation reactions [
]. Malfunctions in members of the SLC26A family of anion transporters are involved in three human diseases: diastrophic dysplasia/achondrogenesis type 1B (DTDST), Pendred's syndrome (PDS) and congenital chloride diarrhea (CLD). These proteins contain 12 transmembrane helices followed by a cytoplasmic STAS domain at the C terminus. The importance of the STAS domain in these transporters is illustrated by the fact that a number of mutations in PDS and DTDST map to it [
,
].
This entry represents phosphoinositol-specific phospholipase C (PLC) from eukaryotes. Proteins in this entry include PLC-beta, gamma, delta, epsilon, eta, zeta and inactive phospholipase C-like protein 2 (PLC-L2). Phosphoinositol-specific phospholipase C (PLC) plays an important role in signal transduction processes [
], mediating the cellular actions of a variety of hormones, neurotransmitters and growth factors. Upon agonist-dependent activation, PLC catalyses the hydrolysis of membrane phosphatidylinositol 4,5-bisphosphate (PIP2), generating the second messengers inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). IP3 binds specific intracellular receptors to trigger Ca2+ mobilisation, while DAG mediates activation of a family of protein kinase C isozymes. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins [
,
,
]. Based on molecular size, immunoreactivity and amino acid sequence, several subtypes have been classified. Overall, sequence identity between sub-types is low, yet all isoforms share a split TIM barrel containing two conserved domains, designated X and Y []. The core eukaryotic PLC enzyme is composed of a pleckstrin homology (PH) domain, four tandem EF hand domains, a split TIM barrel, and a C2 domain [
]. The presence of an insert in the TIM barrel led to the naming of the N- and C-terminal halves of the TIM barrel as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues; for example, in PLC-beta subtypes, the X and Y domains are separated by a stretch of 70-120 amino acids rich in Ser, Thr and acidic residues (their C terminus is rich in basic residues). However, in PLC-gammas, there is an insert of more than 400 residues containing a PH domain, two SH2 domains, and one SH3 domain. The two conserved X and Y domains have been shown to be important for catalytic activity. C-terminal to the Y-box, there is a C2 domain, possibly involved in Ca-dependent membrane attachment.
The N-acetyltransferases (NAT) (EC 2.3.1.-) are enzymes that use acetyl coenzyme A (CoA) to transfer an acetyl group to a substrate, a reaction implicated in various functions from bacterial antibiotic resistance to mammalian circadian rhythm and chromatin remodelling. The Gcn5-related N-acetyltransferases (GNAT) catalyse the transfer of the acetyl from the CoA donor to a primary amine of the acceptor. The GNAT proteins share a domain composed of four conserved sequence motifs A-D [,
]. This GNAT domain is named after yeast GCN5 (from General Control Nonrepressed) and related histone acetyltransferases (HATs) like Hat1 and PCAF. HATs acetylate lysine residues of N-terminal histone tails, resulting in transcription activation. Another category of GNAT, the aminoglycoside N-acetyltransferases, confer antibiotic resistance by catalysing the acetylation of amino groups in aminoglycoside antibiotics []. GNAT proteins can also have anabolic and catabolic functions in both prokaryotes and eukaryotes [,
,
,
,
]. The acetyltransferase/GNAT domain forms a structurally conserved fold of 6 to 7 β-strands (B) and 4 helices (H) in the topology B1-H1-H2-B2-B3-B4-H3-B5-H4-B6, followed by a C-terminal strand which may be from the same monomer or contributed by another [
,
]. Motifs D (B2-B3), A (B4-H3) and B (B5-H4) are collectively called the HAT core [,
,
], while the N-terminal motif C (B1-H1) is less conserved.
Some proteins known to contain a GNAT domain:
Actinobacterial mycothiol acetyltransferase (MshD), which catalyses the transfer of acetyl from acetyl-CoA to desacetylmycothiol to form mycothiol.
Yeast GCN5 and Hat1, which are histone acetyltransferases (EC 2.3.1.48).
Human PCAF, a histone acetyltransferase.
Mammalian serotonin N-acetyltransferase (SNAT) or arylalkylamine NAT (AANAT), which acetylates serotonin into a circadian neurohormone that may participate in light-dark rhythms, and human mood and behaviour.
Mammalian glucosamine 6-phosphate N-acetyltransferase (GNA1) (EC 2.3.1.4).
Escherichia coli RimI and RimJ, which acetylate the N-terminal alanine of ribosomal proteins S18 and S5, respectively (EC 2.3.1.128).
Mycobacterium tuberculosis aminoglycoside 2'-N-acetyltransferase (Aac), which acetylates the 2' hydroxyl or amino group of a broad spectrum of aminoglycoside antibiotics.
Bacillus subtilis BltD and PaiA, which acetylate spermine and spermidine.
This entry represents the entire GNAT domain.
Xeroderma pigmentosum (XP) [
] is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair [,
]. XP-G can be corrected by a 133 kDa nuclear protein, XPGC []. XPGC is an acidic protein that confers normal UV resistance in expressing cells []. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms [,
]. XPGC cleaves one strand of the duplex at the border with the single-stranded region []. XPG (ERCC-5) belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases [
,
,
]; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. This entry represents the N-terminal region of XPG.
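The conserved pentapeptide E-A-[DE]-A-[QS] can be searched for directly in a protein sequence. The following is a minimal sketch, assuming the pattern is read in the usual PROSITE-like way (square brackets list the residues allowed at one position); the function name and the toy sequence are hypothetical and are not taken from any real XPG protein.

```python
import re

# E-A-[DE]-A-[QS] written as a regular expression over one-letter codes.
PENTAPEPTIDE = re.compile(r"EA[DE]A[QS]")


def find_pentapeptide(sequence):
    """Return (1-based start, matched residues) for each occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in PENTAPEPTIDE.finditer(seq)]


# Hypothetical toy sequence used only to exercise the pattern.
toy_sequence = "MKLGEAEAQCAYLVEADAS"
print(find_pentapeptide(toy_sequence))
```

A hit only shows that the five-residue pattern is present; it says nothing about whether the surrounding 27-residue core or the wider I-region is conserved.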
Site-specific recombination plays an important role in DNA rearrangement in prokaryotic organisms. Two types of site-specific recombination are known to occur:
Recombination between inverted repeats resulting in the reversal of a DNA segment.
Recombination between repeat sequences on two DNA molecules resulting in their cointegration, or between repeats on one DNA molecule resulting in the excision of a DNA fragment.
Site-specific recombination is characterised by a strand exchange mechanism that requires no DNA synthesis or high energy cofactor; the phosphodiester bond energy is conserved in a phospho-protein linkage during strand cleavage and re-ligation. Two unrelated families of recombinases are currently known [
]. The first, called the 'phage integrase' family, groups a number of bacterial, phage and yeast plasmid enzymes. The second [], called the 'resolvase' family, groups enzymes which share the following structural characteristics: an N-terminal catalytic and dimerization domain that contains a conserved serine residue involved in the transient covalent attachment to DNA, and a C-terminal helix-turn-helix DNA-binding domain.
The resolvase family is currently known to include the following proteins:
DNA invertase from Salmonella typhimurium (gene hin). Hin can invert a 900 bp DNA fragment adjacent to a gene for one of the flagellar antigens.
DNA invertase from Escherichia coli (gene pin).
DNA invertase from Bacteriophage Mu (gene gin), P1 and P7 (gene cin).
Resolvases from transposons Tn3, Tn21, Tn501, Tn552, Tn917, Tn1546, Tn1721, Tn2501 and Tn1000 (known as gamma-delta resolvase).
Resolvase from Clostridium perfringens plasmid pIP404.
Resolvase from E. coli plasmid R46.
Resolvase from E. coli plasmid RP4 (gene parA).
A putative recombinase from Bacillus subtilis (gene cisA) [
] which plays an important role in sporulation by catalyzing the recombination of genes spoIIIC and spoIVCB to form polymerase sigma-K factor.
Uvp1, a protein from E. coli plasmid pR which cooperates with the mucAB genes in the DNA repair process and could be a resolvase [
].
Generally, proteins from the resolvase family have 180 to 200 amino-acid residues, excepting cisA which is much larger (500 residues).
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This entry represents the low molecular weight transmembrane protein PsbL found in PSII. PsbL is located in a gene cluster with PsbE, PsbF and PsbJ (PsbEFJL). Both PsbL and PsbJ (
) are essential for proper assembly of the OEC. Mutations in PsbL prevent the formation of both PSII core dimers and PSII-light harvesting complex [
]. In addition, both PsbL and PsbJ are involved in the unidirectional flow of electrons, where PsbJ regulates the forward electron flow from D2 (Qa) to the plastoquinone pool, and PsbL prevents the reduction of PSII by back electron flow from plastoquinol protecting PSII from photo-inactivation [].
Transcription factors of the T-box family are required both for early cell-fate decisions, such as those necessary for formation of the basic vertebrate body plan, and for differentiation and organogenesis [
] and have also been associated with multiple aspects of development and with adult terminal cell-type differentiation in different animal lineages []. The T-box is defined as the minimal region within the T-box protein that is both necessary and sufficient for sequence-specific DNA binding. All members of the family so far examined bind to the DNA consensus sequence TCACACCT and function as transcriptional repressors and/or activators []. The T-box is a relatively large DNA-binding domain, generally comprising about a third of the entire protein (17-26 kDa) []. These genes were uncovered on the basis of similarity to the DNA binding domain [
] of the Mus musculus (Mouse) Brachyury (T) gene product; this similarity is the defining feature of the family. The Brachyury gene is named for its phenotype, which was identified 70 years ago as a mutant mouse strain with a short blunted tail. The gene, and its paralogues, have become a well-studied model for the family, and hence much of what is known about the T-box family is derived from the murine Brachyury gene. Consistent with its nuclear location, Brachyury protein has a sequence-specific DNA-binding activity and can act as a transcriptional regulator [
]. Homozygous mutants for the gene undergo extensive developmental anomalies, thus rendering the mutation lethal []. The postulated role of Brachyury is as a transcription factor, regulating the specification and differentiation of posterior mesoderm during gastrulation in a dose-dependent manner []. T-box proteins tend to be expressed in specific organs or cell types, especially during development, and they are generally required for the development of those tissues. For example, Brachyury is expressed in posterior mesoderm and in the developing notochord, and it is required for the formation of these cells in mice [
]. The T-box family is an ancient group that appears to play a critical role in development in all animal species [
].
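The consensus binding sequence TCACACCT quoted above lends itself to a simple scan of a DNA sequence. The sketch below is illustrative only: it assumes exact matching on both strands, whereas real binding sites generally tolerate mismatches, and the function names and the toy sequence are hypothetical.

```python
CONSENSUS = "TCACACCT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_consensus_sites(dna, motif=CONSENSUS):
    """Return sorted (1-based start, strand) pairs for exact motif matches.

    Strand '+' means the motif itself occurs in the given sequence;
    strand '-' means its reverse complement occurs, i.e. the motif lies
    on the opposite strand.
    """
    dna = dna.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = dna.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence with one site on each strand.
toy_dna = "GG" + "TCACACCT" + "AAA" + "AGGTGTGA" + "CC"
print(find_consensus_sites(toy_dna))
```

Scanning both strands matters because a transcription factor can bind the site in either orientation relative to the sequence as written.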
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner [
]. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long [
]. Each individual annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition.
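The 'type 2' calcium-binding motif 'GxGT-[38 residues]-D/E' can be expressed as a pattern and searched for in a sequence. This is a rough sketch, assuming that 'x' stands for any residue and that the spacer is exactly 38 residues long, as written; the function name and the toy sequence are hypothetical.

```python
import re

# G, any residue, G, T, then exactly 38 residues, then an acidic D or E.
# The lookahead lets overlapping candidate sites be reported as well.
TYPE2_SITE = re.compile(r"(?=(G.GT.{38}[DE]))")


def find_type2_sites(sequence):
    """Return (1-based start, 1-based end) for each candidate site."""
    seq = sequence.upper()
    return [
        (m.start() + 1, m.start() + len(m.group(1)))
        for m in TYPE2_SITE.finditer(seq)
    ]


# Hypothetical toy sequence: a GxGT box, a 38-residue spacer, then Asp.
toy_repeat = "AA" + "GAGT" + "L" * 38 + "D" + "KK"
print(find_type2_sites(toy_repeat))
```

The end coordinate is the position of the acidic residue, so each hit spans 43 residues, which fits comfortably within a single repeat of approximately 70 amino acids.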
Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication [
]. This entry represents Annexin A1, which inhibits phospholipase A2, either in response to inflammation, or following dephosphorylation by protein kinases involved in the signal transduction pathway. The protein may also associate with the cell cytoskeleton by binding to actin fibres.
Melatonin is a naturally occurring compound found in animals, plants, and microbes [
,
]. In animals melatonin is secreted by the pineal gland during darkness [,
]. It regulates a variety of neuroendocrine functions and is thought to play an essential role in circadian rhythms []. Drugs that modify the action of melatonin, and hence influence circadian cycles, are of clinical interest, for example in the treatment of jet-lag []. Many of the biological effects of melatonin are produced through the activation of melatonin receptors [
], which are members of the rhodopsin-like G protein-coupled receptor family. There are three melatonin receptor subtypes. Melatonin receptor type 1A and melatonin receptor type 1B are present in humans and other mammals [], while melatonin receptor type 1C has been identified in amphibia and birds []. There is also a closely-related orphan receptor, termed melatonin-related receptor type 1X (also known as GPR50) [], which is yet to achieve receptor status from the International Union of Basic and Clinical Pharmacology (IUPHAR), since a robust response mediated via the protein has not been reported in the literature. Melatonin receptor type 1C receptors are 80% identical and are distinct from 1A and 1B subtypes. Similar ligand binding and functional characteristics are observed in expressed 1A and 1C receptors. The melatonin receptors inhibit adenylyl cyclase via a pertussis-toxin-sensitive G-protein, probably of the Gi/Go class.
This entry represents melatonin-related receptor 1X from human pituitary, also known as G protein-coupled receptor 50 (GPR50). It is the mammalian orthologue of melatonin receptor 1C described in non-mammalian vertebrates [
]. It is closely related to the other melatonin subtypes as it is 45% identical to human 1A and 1B receptors []. However, it lacks N-linked glycosylation sites and bears a >300 residue C-terminal tail. The 1X receptor is expressed in hypothalamus and pituitary, suggesting that the receptor and its natural ligand are involved in neuroendocrine function [
].
Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen [
]. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction. Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds [
]. The N-terminals of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C-terminal of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule. During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from different fibrin molecules. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre [
,
]. Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage. This superfamily represents a subdomain of the C-terminal globular D domain found in fibrinogen alpha, beta and gamma chains, and related domains found in proteins involved in protein or cell binding.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. The term opioid refers to a class of substance that produces its effects via the major classes of opioid receptor, termed mu, delta and kappa. The delta opioid receptor has a more discrete distribution in the CNS relative to the mu and kappa opioid receptors: it is found in the cerebral cortex, amygdala, nucleus accumbens, olfactory tubercle and pontine nucleus. It is also found in certain smooth muscles, e.g. hamster vas deferens, and in cell lines. Delta-receptors mediate analgesia.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Several 7TM receptors have been cloned but their endogenous ligands are unknown; these have been termed orphan receptors. G10d was isolated from a rat genomic library and a liver cDNA library [
]. It is widely distributed, being found in high levels in the lung, liver and adrenal gland, and also in the kidney, aorta, heart, spinal cord, gut and testis [].
Aconitase (aconitate hydratase;
) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop [
,
]. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3) []. This entry represents the N-terminal HEAT-like domain superfamily, which is present in bacterial aconitase (AcnB), but not in AcnA or eukaryotic cAcn/IRP2 or mAcn. This domain consists of 10 alpha helices, forming two curved layers in a right-handed α-α superhelix. The first and last alpha helices interact with another domain within aconitase B, while the middle 8 form a structure made up of four repeating units. This HEAT-like domain is also referred to as domain 5. The helices from domain 5 pack against domain 4 to form a funnel-like structure towards the active site, implicating domain 5 in protein-protein interactions. The HEAT-like domain and the 'swivel' domain that follows it were shown to be sufficient for dimerisation and for AcnB binding to mRNA. An iron-mediated dimerisation mechanism may be responsible for switching AcnB between its catalytic and regulatory roles, as dimerisation requires iron while mRNA binding is inhibited by iron.
The STAS (Sulphate Transporter and AntiSigma factor antagonist) domain is found in the bacterial anti-sigma factor antagonists (ASA) and the C-terminal region of SLC26 (SulP) anion transporters. The activity of bacterial sigma transcription factors is controlled by a regulatory cascade involving an antisigma-factor, the antisigma-factor antagonist (ASA) and a phosphatase. The antisigma-factor binds to sigma and holds it in an inactive complex. The ASA can also interact with the anti-sigma-factor, allowing the release of the active sigma factor. As the antisigma-factor is a protein kinase, it can phosphorylate the antisigma antagonist on a conserved serine residue of the STAS domain. This phosphorylation inactivates the ASA that can be reactivated through dephosphorylation by a phosphatase [
,
]. The STAS domain of the ASA SpoIIAA binds GTP and ATP and possesses a weak NTPase activity. Strong sequence conservation suggests that the STAS domain could possess general NTP-binding activity, and it has been proposed that the NTPs are likely to elicit specific conformational changes in the STAS domain through binding and/or hydrolysis []. Resolution of the solution structure of the ASA SpoIIAA from Bacillus subtilis has shown that the STAS domain consists of a four-stranded β-sheet and four α-helices. The STAS domain forms a characteristic α-helical handle-like structure [,
]. The STAS domain of E. coli YchM protein, a SLC26 (SulP) family member, has been shown to interact with acyl carrier protein (ACP), which is an activated thiol ester carrier of acyl intermediates during fatty acid biosynthesis (FAB) and other acylation reactions [
]. Malfunctions in members of the SLC26A family of anion transporters are involved in three human diseases: diastrophic dysplasia/achondrogenesis type 1B (DTDST), Pendred's syndrome (PDS) and congenital chloride diarrhea (CLD). These proteins contain 12 transmembrane helices followed by a cytoplasmic STAS domain at the C terminus. The importance of the STAS domain in these transporters is illustrated by the fact that a number of mutations in PDS and DTDST map to it [
,
].
Somatostatin (SST), also known as somatotropin release-inhibiting factor (SRIF), is a hypothalamic hormone, a pancreatic hormone, and a central and peripheral neurotransmitter. Somatostatin has a wide distribution throughout the central nervous system (CNS) as well as in peripheral tissues, for example in the pituitary, pancreas and stomach. The various actions of somatostatin are mediated by a family of rhodopsin-like G protein-coupled receptors, which comprise five distinct subtypes: Somatostatin receptor 1 (SSTR1), Somatostatin receptor 2 (SSTR2), Somatostatin receptor 3 (SSTR3), Somatostatin receptor 4 (SSTR4) and Somatostatin receptor 5 (SSTR5) [
,
,
]. These subtypes are widely expressed in many tissues [,
,
,
,
,
], and frequently multiple subtypes coexist in the same cell []. The somatostatin receptor subtypes also share common signalling pathways, such as the inhibition of adenylyl cyclase [,
], activation of phosphotyrosine phosphatase (PTP), and modulation of mitogen-activated protein kinase (MAPK) through G protein-dependent mechanisms. Some of the subtypes are also coupled to inward rectifying K+ channels (SSTR2, SSTR3, SSTR4, SSTR5) [,
], to voltage-dependent Ca2+ channels (SSTR1, SSTR2) [], to an Na+/H+ exchanger (SSTR1), AMPA/kainate glutamate channels (SSTR1, SSTR2), phospholipase C (SSTR2, SSTR5), and phospholipase A2 (SSTR4) []. Amongst the wide spectrum of somatostatin effects, several biological responses have been identified that display absolute or relative subtype selectivity. These include GH secretion (SSTR2 and 5), insulin secretion (SSTR5), glucagon secretion (SSTR2), and immune responses (SSTR2) [
]. This entry represents SSTR5. It is expressed in a range of tissues including the small intestine, heart, adrenal, cerebellum, pituitary, placenta and skeletal muscle. It is also expressed in pancreatic islets [
], where somatostatin is a known regulator of insulin and glucagon secretion. All five human somatostatin receptors expressed in COS-7 cells have been shown to couple to activation of phosphoinositide (PI)-specific PLC-beta and Ca2+ mobilisation via pertussis toxin-sensitive G protein(s) with an order of potency of SSTR5 > SSTR2 > SSTR3 > SSTR4 > SSTR1 [
].
Somatostatin (SST), also known as somatotropin release-inhibiting factor (SRIF), is a hypothalamic hormone, a pancreatic hormone, and a central and peripheral neurotransmitter. Somatostatin has a wide distribution throughout the central nervous system (CNS) as well as in peripheral tissues, for example in the pituitary, pancreas and stomach. The various actions of somatostatin are mediated by a family of rhodopsin-like G protein-coupled receptors, which comprise five distinct subtypes: Somatostatin receptor 1 (SSTR1), Somatostatin receptor 2 (SSTR2), Somatostatin receptor 3 (SSTR3), Somatostatin receptor 4 (SSTR4) and Somatostatin receptor 5 (SSTR5) [
,
,
]. These subtypes are widely expressed in many tissues [,
,
,
,
,
], and frequently multiple subtypes coexist in the same cell []. The somatostatin receptor subtypes also share common signalling pathways, such as the inhibition of adenylyl cyclase [,
], activation of phosphotyrosine phosphatase (PTP), and modulation of mitogen-activated protein kinase (MAPK) through G protein-dependent mechanisms. Some of the subtypes are also coupled to inward rectifying K+ channels (SSTR2, SSTR3, SSTR4, SSTR5) [,
], to voltage-dependent Ca2+ channels (SSTR1, SSTR2) [], to an Na+/H+ exchanger (SSTR1), AMPA/kainate glutamate channels (SSTR1, SSTR2), phospholipase C (SSTR2, SSTR5), and phospholipase A2 (SSTR4) []. Amongst the wide spectrum of somatostatin effects, several biological responses have been identified that display absolute or relative subtype selectivity. These include GH secretion (SSTR2 and 5), insulin secretion (SSTR5), glucagon secretion (SSTR2), and immune responses (SSTR2) [
]. This entry represents SSTR3. It is widely distributed in mouse brain, with high levels in the forebrain, hippocampus and amygdala; moderate levels are also present in the substantia nigra. All five human somatostatin receptors expressed in COS-7 cells are coupled to activation of phosphoinositide (PI)-specific PLC-beta and Ca2+ mobilisation via pertussis toxin-sensitive G protein(s) with an order of potency of SSTR5 > SSTR2 > SSTR3 > SSTR4 > SSTR1 [
]. Inhibition of angiogenesis has been shown to be via the SSTR3, and involves the inhibition of MAPK and endothelial nitric oxide synthase (eNOS) activity [].
Somatostatin (SST), also known as somatotropin release-inhibiting factor (SRIF), is a hypothalamic hormone, a pancreatic hormone, and a central and peripheral neurotransmitter. Somatostatin has a wide distribution throughout the central nervous system (CNS) as well as in peripheral tissues, for example in the pituitary, pancreas and stomach. The various actions of somatostatin are mediated by a family of rhodopsin-like G protein-coupled receptors, which comprise five distinct subtypes: Somatostatin receptor 1 (SSTR1), Somatostatin receptor 2 (SSTR2), Somatostatin receptor 3 (SSTR3), Somatostatin receptor 4 (SSTR4) and Somatostatin receptor 5 (SSTR5) [
,
,
]. These subtypes are widely expressed in many tissues [,
,
,
,
,
], and frequently multiple subtypes coexist in the same cell []. The somatostatin receptor subtypes also share common signalling pathways, such as the inhibition of adenylyl cyclase [,
], activation of phosphotyrosine phosphatase (PTP), and modulation of mitogen-activated protein kinase (MAPK) through G protein-dependent mechanisms. Some of the subtypes are also coupled to inward rectifying K+ channels (SSTR2, SSTR3, SSTR4, SSTR5) [,
], to voltage-dependent Ca2+ channels (SSTR1, SSTR2) [], to an Na+/H+ exchanger (SSTR1), AMPA/kainate glutamate channels (SSTR1, SSTR2), phospholipase C (SSTR2, SSTR5), and phospholipase A2 (SSTR4) []. Amongst the wide spectrum of somatostatin effects, several biological responses have been identified that display absolute or relative subtype selectivity. These include GH secretion (SSTR2 and 5), insulin secretion (SSTR5), glucagon secretion (SSTR2), and immune responses (SSTR2) [
]. This entry represents SSTR1 [
]. In humans, it is expressed at high levels in the jejunum and stomach, with lower levels in the pancreas, colon and kidney, but it is absent in the brain. Conversely, in rodents, high levels are found in the brain, but it is absent from peripheral tissues []. All five human somatostatin receptors expressed in COS-7 cells are coupled to activation of phosphoinositide (PI)-specific PLC-beta and Ca2+ mobilisation via pertussis toxin-sensitive G protein(s) with an order of potency of SSTR5 > SSTR2 > SSTR3 > SSTR4 > SSTR1 [
].
Aquaporins are water channels, present in both higher and lower organisms, that belong to the major intrinsic protein family. Most aquaporins are highly selective for water, though some also facilitate the movement of small uncharged molecules such as glycerol [
]. In higher eukaryotes these proteins play diverse roles in the maintenance of water homeostasis, indicating that membrane water permeability can be regulated independently of solute permeability. In microorganisms, however, many of which do not contain aquaporins, they do not appear to play such a broad role. Instead, they assist specific microbial lifestyles within the environment, e.g. they confer protection against freeze-thaw stress and may help maintain water permeability at low temperatures []. The regulation of aquaporins is complex, including transcriptional, post-translational, protein-trafficking and channel-gating mechanisms that are frequently distinct for each family member. Structural studies show that aquaporins are present in the membrane as tetramers, though each monomer contains its own channel [
,
,
]. The monomer has an overall "hourglass" structure made up of three structural elements: an external vestibule, an internal vestibule, and an extended pore which connects the two vestibules. Substrate selectivity is conferred by two mechanisms. Firstly, the diameter of the pore physically limits the size of molecules that can pass through the channel. Secondly, specific amino acids within the molecule regulate the preference for hydrophobic or hydrophilic substrates.
Aquaporins are classified into two subgroups: the aquaporins (also known as orthodox aquaporins), which transport only water, and the aquaglyceroporins, which transport glycerol, urea, and other small solutes in addition to water [
,
]. Aquaporin-9 was identified from human leukocytes by homology cloning [
]. AQP9 has unusually broad solute permeability. It is expressed in hepatocyte plasma membranes and also in lung, small intestine and spleen cells []. Expression of AQP9 in liver was induced up to 20-fold in rats fasted for 24 to 96 hours, and the AQP9 level gradually declined after re-feeding []. AQP9 shares greater sequence identity with AQP3 and AQP7 than with other members of the family, suggesting that these 3 proteins belong to a subfamily.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Melanocyte-stimulating hormones (MSH), adrenocorticotrophin (ACTH) and beta-endorphin are peptide products of pituitary pro-opiomelanocortin. MSH has a trophic action on melanocytes, and regulates pigment production in fish and amphibia. The MSH receptor is expressed at high levels in melanocytes, melanomas and their derived cell lines. Receptors are found at low levels in the CNS. MSH regulates temperature control in the septal region of the brain and releases prolactin from the pituitary.
Tumor necrosis factor receptor superfamily member 16 (TNFRSF16), also known as nerve growth factor receptor (NGFR) or p75 neurotrophin receptor (p75NTR or p75), CD271, or Gp80-LNGFR, is a common receptor for both neurotrophins and proneurotrophins, and plays a diverse role in many tissues, including the nervous system. It has been shown to be expressed in various types of stem cells and has been used to prospectively isolate stem cells with different degrees of potency [
]. p75NTR owes its signaling to the recruitment of intracellular binding proteins, leading to the activation of different signaling pathways []. It binds nerve growth factor (NGF) and the complex can initiate a signaling cascade which has been associated with both neuronal apoptosis and neuronal survival of discrete populations of neurons, depending on the presence or absence of intracellular signaling molecules downstream of p75NTR (e.g. NF-kB, JNK, or p75NTR intracellular death domain). p75NTR can also bind NGF in concert with the neurotrophic tyrosine kinase receptor type 1 (TrkA) protein, where it is thought to modulate the formation of the high-affinity neurotrophin binding complex []. In melanoma cells, p75NTR is an immunosuppressive factor, induced by interferon (IFN)-gamma, and mediates down-regulation of melanoma antigens [
]. It can interact with the aggregated form of amyloid beta (Abeta) peptides, and plays an important role in the etiopathogenesis of Alzheimer's disease by influencing protein tau hyper-phosphorylation []. p75NTR is involved in the formation and progression of retinal diseases; its expression is induced in retinal pigment epithelium (RPE) cells and its knockdown rescues RPE cell proliferation activity and inhibits RPE apoptosis induced by hypoxia []. It can therefore be a potential therapeutic target for RPE hypoxia or oxidative stress diseases. This entry represents the N-terminal domain of TNFRSF16. TNF-receptors are modular proteins. The N-terminal extracellular part contains a cysteine-rich region responsible for ligand-binding. This region is composed of small modules of about 40 residues containing 6 conserved cysteines; the number and type of modules can vary in different members of the family [
,
,
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice [
]. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli [
]. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf' []. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr [,
,
]. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents the chemoreceptor Srd [
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligands they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs [
]. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli [
]. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf' []. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr [,
,
]. Many of these proteins have homologues in Caenorhabditis briggsae. Srh is part of the Str superfamily of chemoreceptors [
].
[NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of two subunits, hydrogenase large and hydrogenase small. The large subunit contains the binuclear [NiFe] active site, while the small subunit binds at least one [4Fe-4S] cluster [
]. Energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type) form a distinct group within the [NiFe] hydrogenase family [,
,
]. Members of this subgroup include: hydrogenase 3 and 4 (Hyc and Hyf) from Escherichia coli; CO-induced hydrogenase (Coo) from Rhodospirillum rubrum; Mbh hydrogenase from Pyrococcus furiosus; Eha and Ehb hydrogenases from Methanothermobacter species; and Ech hydrogenase from Methanosarcina barkeri. Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. However, they show considerable sequence similarity to the six-subunit, energy-conserving NADH:quinone oxidoreductases (complex I), which are present in cytoplasmic membranes of many bacteria and in inner mitochondrial membranes. Nevertheless, the reactions they catalyse differ significantly from those of complex I. Energy-converting [NiFe] hydrogenases function as ion pumps. Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK and EhaL) and four hydrophilic subunits (EhaM, EhaR, EhaS, EhaT) [
,
]. The ten predicted integral membrane proteins are absent from Ech, Coo, Hyc and Hyf complexes, which may have simpler membrane components than Eha. Eha and Ehb catalyse the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases. This entry represents the energy-converting hydrogenase subunit EhaL from Methanobacteria and Methanococci, including Methanocaldococcus jannaschii.
The ubiquitous bacterial second messenger cyclic-di-GMP (c-di-GMP) is associated with the regulation of biofilm formation, the control of exopolysaccharide synthesis, flagellar- and pili-based motility, gene expression, interactions of bacteria with eukaryotic hosts and multicellular behaviour in diverse bacteria. This second messenger binds to PilZ domains from cytoplasmic receptors through the domains' RXXXR and [D/N]hSXXG motifs []. However, some PilZ-related domains lack these motifs and do not bind c-di-GMP. The crystal structure, at 1.7 Å, of a PilZ domain::c-di-GMP complex from Vibrio cholerae shows c-di-GMP contacting seven of nine strongly conserved residues. Binding of c-di-GMP causes a conformational switch whereby the C- and N-terminal domains are brought into close opposition, forming a new allosteric interaction surface that spans these domains and the c-di-GMP at their interface []. Structural and sequence analysis of PilZ-related domains allows the description of three types of domain: the canonical PilZ domain (represented in this entry), whose structure includes a six-stranded β-barrel and a C-terminal α-helix; an atypical PilZ domain that contains two additional α-helices and forms tetramers; and divergent PilZ-related domains, which include the PilZ protein and the YcgR N-terminal domains (PilZN and PilZNR). PilZN-type domains are evolutionarily related to PilZ domains and are found fused to the canonical PilZ domains in specific taxa, such as spirochetes, actinobacteria, aquificae, cellulose-degrading clostridia, and deltaproteobacteria []. Some examples of proteins containing this domain are BcsA subunits of bacterial cellulose synthases [
] and flagellar brake protein YcgR [
]. c-di-GMP binding to PilZ brings about conformational changes in the protein that stabilise the bound ligand and probably initiate the downstream signal transduction cascade. In the case of YcgR, c-di-GMP binding regulates flagellum-based motility [
]. The association of the PilZ domain with a variety of other domains, including likely components of the bacterial multidrug secretion system, could provide clues to the multiple functions of c-di-GMP in bacterial pathogenesis and cell development.
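The two c-di-GMP-binding motifs named in this entry, RXXXR and [D/N]hSXXG, can be searched for in a protein sequence with ordinary regular expressions. The following is a minimal sketch only: it assumes that "X" stands for any residue and "h" for a hydrophobic residue, the hydrophobic residue set is an illustrative choice, and the function names and the toy sequence are hypothetical. A motif hit does not by itself show that a domain binds c-di-GMP.

```python
import re

# Assumption: "X" = any residue, "h" = a hydrophobic residue.
# The hydrophobic set below is an illustrative choice, not taken from the entry.
HYDROPHOBIC = "AVILMFWYC"

RXXXR = re.compile(r"R.{3}R")
DN_H_SXXG = re.compile(rf"[DN][{HYDROPHOBIC}]S.{{2}}G")


def find_pilz_motifs(sequence: str) -> dict:
    """Return 0-based start positions of non-overlapping matches for each motif."""
    seq = sequence.upper()
    return {
        "RXXXR": [m.start() for m in RXXXR.finditer(seq)],
        "[D/N]hSXXG": [m.start() for m in DN_H_SXXG.finditer(seq)],
    }


def has_both_motifs(sequence: str) -> bool:
    """True if at least one match of each motif is present."""
    return all(find_pilz_motifs(sequence).values())


if __name__ == "__main__":
    toy = "MKRAAARLLDVSGGGKK"  # hypothetical sequence, for illustration only
    print(find_pilz_motifs(toy))
    print(has_both_motifs(toy))
```

A sequence lacking either motif would be a candidate for the PilZ-related domains described above that do not bind c-di-GMP, although structural or biochemical evidence would be needed to confirm this.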
The ubiquitous bacterial second messenger cyclic-di-GMP (c-di-GMP) is associated with the regulation of biofilm formation, the control of exopolysaccharide synthesis, flagellar- and pili-based motility, gene expression, interactions of bacteria with eukaryotic hosts and multicellular behaviour in diverse bacteria. This second messenger binds to PilZ domains from cytoplasmic receptors through the domains' RXXXR and [D/N]hSXXG motifs []. However, some PilZ-related domains lack these motifs and do not bind c-di-GMP. The crystal structure, at 1.7 Å, of a PilZ domain::c-di-GMP complex from Vibrio cholerae shows c-di-GMP contacting seven of nine strongly conserved residues. Binding of c-di-GMP causes a conformational switch whereby the C- and N-terminal domains are brought into close opposition, forming a new allosteric interaction surface that spans these domains and the c-di-GMP at their interface []. Structural and sequence analysis of PilZ-related domains allows the description of three types of domain: the canonical PilZ domain (represented in this entry), whose structure includes a six-stranded β-barrel and a C-terminal α-helix; an atypical PilZ domain that contains two additional α-helices and forms tetramers; and divergent PilZ-related domains, which include the PilZ protein and the YcgR N-terminal domains (PilZN and PilZNR). PilZN-type domains are evolutionarily related to PilZ domains and are found fused to the canonical PilZ domains in specific taxa, such as spirochetes, actinobacteria, aquificae, cellulose-degrading clostridia, and deltaproteobacteria []. Some examples of proteins containing this domain are BcsA subunits of bacterial cellulose synthases [
] and flagellar brake protein YcgR [
]. c-di-GMP binding to PilZ brings about conformational changes in the protein that stabilise the bound ligand and probably initiate the downstream signal transduction cascade. In the case of YcgR, c-di-GMP binding regulates flagellum-based motility [
]. The association of the PilZ domain with a variety of other domains, including likely components of the bacterial multidrug secretion system, could provide clues to the multiple functions of c-di-GMP in bacterial pathogenesis and cell development. This entry represents a PilZ domain found mainly in Myxococcales.
Chorismate mutase, AroQ class superfamily, eukaryotic
Type:
Homologous_superfamily
Description:
Chorismate mutase (CM) is a regulatory enzyme required for biosynthesis of the aromatic amino acids phenylalanine and tyrosine. CM catalyzes the Claisen rearrangement of chorismate to prephenate, which can subsequently be converted to precursors of either L-Phe or L-Tyr. In bifunctional enzymes the CM domain can be fused to a prephenate dehydratase (P-protein, for Phe biosynthesis), to a prephenate dehydrogenase (T-protein, for Tyr biosynthesis), or to 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase. Besides these prokaryotic bifunctional enzymes, monofunctional CMs occur in prokaryotes as well as in fungi, plants and nematode worms [
]. The type I or AroH class of CM is represented by Bacillus subtilis aroH, a monofunctional, nonallosteric, homotrimeric enzyme characterized by its pseudo-α/β-barrel 3D structure. Each monomer folds into a 5-stranded mixed β-sheet packed against an α-helix and a 3-10 helix. The core is formed by a closed barrel of mixed β-sheets surrounded by helices. The interfaces between adjacent subunits form three equivalent clefts that harbor the active sites [
]. The type II or AroQ class of CM has a completely different all-helical 3D structure, represented by the CM domain of the bifunctional Escherichia coli P-protein. This type is named after the Enterobacter agglomerans monofunctional CM encoded by the aroQ gene [
]. All CM domains from bifunctional enzymes as well as most monofunctional CMs belong to this class, including archaeal CM. Eukaryotic CMs from plants and fungi form a separate subclass of AroQ, represented by the allosteric CM of baker's yeast [
]. These enzymes show only partial sequence similarity to the prokaryotic CMs due to insertions of regulatory domains, but the helix-bundle topology and catalytic residues are conserved and the 3D structure of the E. coli CM dimer resembles a yeast CM monomer [,
,
]. The E. coli P-protein CM domain consists of 3 helices and lacks allosteric regulation. The yeast CM has evolved by gene duplication and dimerization and each monomer has 12 helices. Yeast CM is allosterically activated by Trp and inhibited by Tyr [].
Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions [
]. The development of genetic competence in Bacillus subtilis is a highly regulated adaptive response to stationary-phase stress. For competence to develop, the transcriptional regulator, ComK, must be activated. ComK is required for the expression of genes encoding proteins that function in DNA uptake. In log-phase cultures, ComK is inactive in a complex with MecA and ClpC. The comS gene is induced in response to high culture cell density and nutritional stress and its product functions to release active ComK from the complex. ComK then stimulates the transcription initiation of its own gene as well as that of the late competence operons [
]. This entry represents proteins encoded in the comE operon for "late competence" as characterised in B. subtilis [
]. It is under competence control and is required for both DNA binding to the competent cell surface, and for uptake. The presence of a cytidine/deoxycytidine deaminase domain in these proteins suggests that they may perform this activity. comE contains three open reading frames (ORF1-3) read in the forward direction, preceded by a long untranslated leader sequence and an E sigma A promoter. The comE transcript is present at a very low level during growth and at an elevated level in stationary-phase cells. Conversely, the reverse transcript is present during exponential growth and disappears during the stationary phase. ORF1 and ORF3 are predicted to be integral membrane proteins. The latter is specifically required for DNA uptake but not for binding [].
Signal transduction histidine kinase, osmosensitive K+ channel sensor, N-terminal
Type:
Domain
Description:
Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms [
,
]. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water []. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases [
,
]. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK []. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents the N-terminal domain found in KdpD sensor kinase proteins, which regulate the kdpFABC operon responsible for potassium transport [
]. The N-terminal domain forms part of the cytoplasmic region of the protein, which functions as the sensor domain responsible for sensing turgor pressure [
]. It recognises K+ and c-di-AMP (Matilla et al., FEMS Microbiology Reviews, 2021, 45(1), fuab043. https://doi.org/10.1093/femsre/fuab043).
Transcriptional activation and repression are required for control of cell proliferation and differentiation during embryonic development and homeostasis in the adult organism. Perturbations of these processes can lead to the development of cancer [
]. The Eight-Twenty-One (ETO) gene product is able to form complexes with corepressors and deacetylases, such as nuclear receptor corepressor (N-CoR), which repress transcription when recruited by transcription factors []. The ETO gene derives its name from its association with many cases of acute myelogenous leukaemia (AML), in which a reciprocal translocation, t(8;21), brings together a large portion of the ETO gene from chromosome eight and part of the AML1 gene from chromosome 21. The human ETO gene family currently comprises three major subfamilies: ETO/myeloid transforming gene on chromosome 8 (MTG8); myeloid transforming gene related protein-1 (MTGR1) and myeloid transforming gene on chromosome 16 (MTG16). ETO proteins are composed of four evolutionarily conserved domains termed nervy homology regions (NHR) 1-4. NHR1 is thought to stabilise the formation of high molecular weight complexes, but is not directly responsible for repressor activity. NHR2 and its flanking sequence comprise the core repressor domain, which mediates 50% of the wild type repressor activity. Furthermore, there is evidence that the amphipathic helical structure of NHR2 promotes the formation of ETO/AML1 homodimers []. NHR3 and NHR4 have been shown to act in concert to bind N-CoR. NHR4 contains two zinc finger motifs, which are thought to play a role in protein interactions rather than DNA binding []. The ultrabithorax (Ubx) gene is a homeotic gene in Drosophila that determines the morphological characteristics of each segment. Ubiquitous expression of Ubx gave rise to increased expression of several mRNAs; one of these transcripts was localised to the nervous system precursor cells of the head, thoracic and abdominal segments, and was termed nervy. Analysis of nervy cDNA revealed that it shared significant sequence similarity with the human ETO gene; it was also found to contain a region of similarity to the TATA binding protein-associated factor TAF110 [].
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This entry represents the low molecular weight transmembrane protein PsbL found in PSII. PsbL is located in a gene cluster with PsbE, PsbF and PsbJ (PsbEFJL). Both PsbL and PsbJ are essential for proper assembly of the OEC. Mutations in PsbL prevent the formation of both PSII core dimers and the PSII-light harvesting complex [
]. In addition, both PsbL and PsbJ are involved in the unidirectional flow of electrons, where PsbJ regulates the forward electron flow from D2 (Qa) to the plastoquinone pool, and PsbL prevents the reduction of PSII by back electron flow from plastoquinol, protecting PSII from photo-inactivation [].