Transposons are mobile genetic elements that move from one DNA site to another
within their host's genome, often with profound biological consequences. The
Mu genome is the largest and most efficient transposon known. The Mu
transposase (MuA) is a multidomain protein responsible for translocation of
the Mu genome. It can be divided into three structurally distinct domains,
each with specific functions. The amino-terminal domain (30 kDa) is responsible
for sequence-specific DNA binding and can be further subdivided into two
subdomains, which bind an internal activation sequence (IAS) and the ends of
the phage genome, respectively. A highly homologous IAS-binding domain is also
present in the Mu repressor protein (MuR), but its binding promotes lysogeny
of the phage by repressing the expression of genes required for lytic growth
and by directly blocking MuA access to the IAS. The IAS-binding domains of the
MuR and MuA proteins are DNA-binding, winged helix-turn-helix (wHTH) domains
of about 75 residues (Mu-type HTH). The Mu-type HTH domain consists of a
three-membered α-helical bundle buttressed by a three-stranded antiparallel
β-sheet with an overall B1-H1-T-H2-B2-W-B3-H3 topology (where B, H, T and W
stand for β-strand, α-helix, turn and wing, respectively). Helices H1 and H2
and the seven-residue turn connecting them comprise a helix-turn-helix (HTH)
motif. While the general appearance of the Mu-type DNA-binding domain is
similar to that of other winged HTH proteins, the connectivity of the secondary
structure elements is permuted. Hence this fold represents a novel class of
winged HTH DNA-binding domain.
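The topology notation above can be read mechanically. The following is a minimal sketch, assuming only the letter codes given in the text (B, H, T, W); the function and variable names are illustrative and not part of the entry.

```python
# Map the one-letter codes used in the topology string to element names.
ELEMENT_NAMES = {"B": "beta-strand", "H": "alpha-helix", "T": "turn", "W": "wing"}


def parse_topology(topology: str) -> list[tuple[str, str]]:
    """Split a topology string into (label, element type) pairs, in order."""
    elements = []
    for label in topology.split("-"):
        kind = ELEMENT_NAMES[label[0]]  # the first character encodes the type
        elements.append((label, kind))
    return elements


MU_TYPE_HTH = "B1-H1-T-H2-B2-W-B3-H3"
elements = parse_topology(MU_TYPE_HTH)

helices = [label for label, kind in elements if kind == "alpha-helix"]
strands = [label for label, kind in elements if kind == "beta-strand"]

# The HTH motif is H1, the turn, and H2, which are adjacent in the string.
labels = [label for label, _ in elements]
start = labels.index("H1")
hth_motif = labels[start:start + 3]

print(helices, strands, hth_motif)
```

Listing the elements this way makes the three-helix bundle, the three-stranded sheet and the position of the HTH motif within the permuted connectivity easy to check.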
The synapsins are a family of neuron-specific phosphoproteins that coat
synaptic vesicles and are involved in the binding between these vesicles
and the cytoskeleton (including actin filaments). The family comprises 5
homologous proteins: Ia, Ib, IIa, IIb and III. Synapsins I, II, and III are
encoded by 3 different genes. The a and b isoforms of synapsin I and II are
splice variants of the primary transcripts. Synapsin I is mainly associated
with regulation of neurotransmitter release from presynaptic neuron terminals.
Synapsin II, as well as being involved in neurotransmitter release, has a role
in the synaptogenesis and synaptic plasticity responsible for long-term
potentiation. Recent studies implicate synapsin III in a developmental role in
neurite elongation and synapse formation that is distinct from the functions
of synapsins I and II. Structurally, synapsins are multidomain proteins, of
which 3 domains are common to all the mammalian forms. The N-terminal 'A'
domain is ~30 residues long and contains a serine residue that serves as an
acceptor site for protein kinase-mediated phosphorylation. This is followed by
the 'B' linker domain, which is ~80 residues long and is relatively poorly
conserved. Domain 'C' is the longest, spanning approximately 300 residues. This
domain is highly conserved across all the synapsins (including those from
Drosophila) and is possessed by all splice variants. The remaining six
domains, D-I, are not shared by all the synapsins and differ both between
the primary transcripts and the splice variants.
Ferredoxins are a group of iron-sulphur proteins which mediate electron transfer in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s). One of these subgroups is the 4Fe-4S ferredoxins, which are found in bacteria and which are thus often referred to as 'bacterial-type' ferredoxins. The structure of these proteins consists of the duplication of a domain of twenty-six amino acid residues; each of these domains contains four cysteine residues that bind to a 4Fe-4S centre. Several structures of the 4Fe-4S ferredoxin domain have been determined. The clusters consist of two interleaved 4Fe- and 4S-tetrahedra forming a cubane-like structure, in such a way that the four iron and four sulphur atoms occupy the eight corners of a distorted cube. Each 4Fe-4S cluster is attached to the polypeptide chain by four covalent Fe-S bonds involving cysteine residues.
A number of proteins have been found that include one or more 4Fe-4S binding domains similar to those of bacterial-type ferredoxins. The pattern of cysteine residues in the iron-sulphur region is sufficient to detect this class of 4Fe-4S binding proteins. This entry represents the whole domain. Note: In some bacterial ferredoxins, one of the two duplicated domains has lost one or more of the four conserved cysteines. The consequence of such variations is that these domains have either lost their iron-sulphur binding property or bind to a 3Fe-3S centre instead of a 4Fe-4S centre.
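The idea that a cysteine pattern can flag candidate 4Fe-4S binding domains can be sketched as follows. The entry does not state the pattern, so the spacing used here (C-x(2)-C-x(2)-C-x(3)-C) is an assumption, a deliberately simplified form of the classic ferredoxin cysteine spacing; the curated signature is more restrictive. The test sequence is a hypothetical toy string, not a database record.

```python
import re

# Simplified cysteine spacing (assumption, not taken from this entry):
# C-x(2)-C-x(2)-C-x(3)-C, i.e. four cysteines that could ligate one cluster.
FOUR_CYS = re.compile(r"C.{2}C.{2}C.{3}C")


def find_cluster_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in FOUR_CYS.finditer(sequence)]


# Hypothetical toy sequence with two motif copies, mimicking the duplicated domain.
toy = "AYVINDSCIACGACKPECPVNIQQGSIYAIDADSCIDCGSCASVCPVGAPNPED"
hits = find_cluster_motifs(toy)

for start, text in hits:
    print(start, text)
```

A domain that has lost one of its conserved cysteines, as described in the note above, would simply fail to match this pattern.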
The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments. Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein that binds the nuclear localisation signal (NLS) on the cargo in the classical NLS import pathway. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N terminus and a minor one close to the C terminus. Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released. This entry represents the N-terminal IBB domain of importin-alpha that contains the auto-regulatory region.
This entry represents a domain located centrally in ERCC1, at the C terminus of Rad10 and at the N terminus of Swi10. In ERCC1, this domain interacts tightly with XPF and may be involved in binding to single-stranded DNA. This group of proteins includes Rad10 from budding yeasts, Swi10 from fission yeasts and ERCC1 from animals and plants. All proteins in this family for which functions are known are components of a multiprotein endonuclease complex (usually made up of Rad1 and Rad10 homologues). This complex is used primarily for nucleotide excision repair but also for some aspects of recombination repair. In budding yeast, Rad10 works as a heterodimer with Rad1, and is involved in nucleotide excision repair of DNA damaged by UV light, bulky adducts or cross-linking agents. The complex forms an endonuclease which specifically degrades single-stranded DNA. ERCC1 and XPF (xeroderma pigmentosum group F-complementing protein) are two structure-specific endonucleases of a class of seven containing an ERCC4 domain. Together they form an obligate complex that functions primarily in nucleotide excision repair (NER), a versatile pathway able to detect and remove a variety of DNA lesions induced by UV light and environmental carcinogens, and secondarily in DNA inter-strand cross-link repair and telomere maintenance. This domain in fact binds simultaneously to both XPF and single-stranded DNA; this ternary complex explains the important role of ERCC1 in targeting its catalytic XPF partner to the NER pre-incision complex.
Cytokines can be grouped into a family on the basis of sequence, functional and structural similarities. Tumor necrosis factor (TNF) (also known as TNF-alpha or cachectin) is a monocyte-derived cytotoxin that has been implicated in tumour regression, septic shock and cachexia. The protein is synthesised as a prohormone with an unusually long and atypical signal sequence, which is absent from the mature secreted cytokine. A short hydrophobic stretch of amino acids serves to anchor the prohormone in lipid bilayers. Both the mature protein and a partially-processed form of the hormone are secreted after cleavage of the propeptide. There are a number of different families of TNF, but all these cytokines seem to form homotrimeric (or heterotrimeric in the case of LT-alpha/beta) complexes that are recognised by their specific receptors. TNF exerts its function mainly through two TNF receptors, TNF-1 and TNF-2, which are expressed on nearly all cells of the body. This entry represents TNF-2. TNFs and their receptors can select and kill virus-infected cells. Poxviruses are large DNA viruses that encode many proteins capable of interfering with host immune functions, including soluble versions of cytokine receptors such as vTNFalpha-2 (viral TNFalpha receptor 2). These soluble cytokine receptors effectively block cytokine activity and modulate viral virulence. The C22L type receptor in vaccinia virus is equivalent to the CrmB (cytokine response modifier B) protein in cowpox virus.
EF2 (or EFG) participates in the elongation phase of protein synthesis by promoting the GTP-dependent translocation of the peptidyl tRNA of the nascent protein chain from the A-site (acceptor site) to the P-site (peptidyl tRNA site) of the ribosome. EF2 also has a role after the termination phase of translation, where, together with the ribosomal recycling factor, it facilitates the release of tRNA and mRNA from the ribosome, and the splitting of the ribosome into two subunits. EF2 is folded into five domains, with domains I and II forming the N-terminal block, domains IV and V forming the C-terminal block, and domain III providing the covalently-linked flexible connection between the two. Domains III and V have the same fold (although they are not completely superimposable and domain III lacks some of the superfamily characteristics), consisting of an alpha/beta sandwich with an antiparallel β-sheet in a (beta/alpha/beta)x2 topology. Elongation factor 4 (EF4/LepA) is a highly conserved guanosine triphosphatase translation factor. EF4 has six domains, of which four (I, II, III, and V) are homologous to corresponding domains in EF-G. It differs from EF-G by having a short domain IV and possessing a conserved C-terminal domain. This superfamily represents a domain found in EF2 and EF4, as well as in some tetracycline resistance proteins, peptide chain release factors and the C-terminal region of the bacterial hypothetical protein YigZ.
The genus Yersinia includes three species pathogenic to humans: Yersinia enterocolitica, Yersinia pestis, and Yersinia pseudotuberculosis. Although the three use different routes to infect their host, each targets the lymphoid tissue for invasion, and all have developed specific systems to evade host immune cells. pYV, a major virulence plasmid common to all three, harbours the genes necessary for type III secretion in the host and the exotoxins translocated by this system. One of the proteins encoded within this plasmid is the Yersinia YadA non-fimbrial adhesin, a moiety that facilitates cell interaction between the host and pathogen. Mutational studies indicate that this protein allows intimate attachment and subsequent uptake by host macrophages of the bacterial cell. Synergistic mechanisms by two other pYV-encoded proteins, YopH and YopE, inhibit the action of YadA. Electron microscopy of the mature YadA adhesins suggests that they form distinct "lollipop"-shaped structures on the cell surface. This is a trait shared by the adhesins of another pathogen, namely Moraxella catarrhalis UspA1 and UspA2. The YadA protein itself exists as a homotrimer of 45 kDa subunits, anchored in the outer bacterial membrane by its C terminus. The lollipop's globular head is formed by the N terminus in the extracellular space. A Yersinia bacterial cell thus coated with YadA can bind a number of host cell macromolecules, including collagen, laminin, mucus and fibronectin, enhancing its capacity for infection.
This protein family includes nucleotidyltransferases that use nuclear RNA as substrate, such as Trf4 (also known as Poly(A) RNA polymerase protein 2, PAP2) from S. cerevisiae and its homologues Terminal nucleotidyltransferase 4A/B from humans (TENT4A and TENT4B, also known as PAPD7/TRF4-1 and PAPD5/TRF4-2, respectively), Poly(A) RNA polymerase protein 1 from Saccharomyces cerevisiae (TRF5), and Poly(A) RNA polymerase cid12 from Schizosaccharomyces pombe (Cid12 or PAP). These proteins function as subunits of the TRAMP-like complexes, which have a poly(A) RNA polymerase activity and are involved in a post-transcriptional quality control mechanism limiting inappropriate expression of genetic information. TRF4 from humans preferentially catalyses the transfer of ATP and GTP onto the RNA 3' poly(A) tail, creating a heterogeneous 3' poly(A) tail that stabilises mRNAs by protecting them from active deadenylation. In S. cerevisiae, TRF5 polyadenylates RNA processing and degradation intermediates of snRNAs, snoRNAs and mRNAs that accumulate in strains lacking a functional exosome. TRF5 is also required for proper nuclear division in mitosis and sister chromatid cohesion. It is involved in the regulation of histone mRNA levels. Cid12 from S. pombe has a role in the RNA interference (RNAi) pathway, which is important for heterochromatin formation and accurate chromosome segregation. It is a member of the RNA-directed RNA polymerase complex (RDRC), which is involved in the generation of small interfering RNAs (siRNAs) and mediates their association with the RNA-induced transcriptional silencing (RITS) complex.
This entry represents the chymotrypsin-like fold found in proteins from MEROPS peptidase family S1 (clan PA). The PA clan contains both cysteine and serine proteases that can be found in plants, animals, fungi, eubacteria, archaea and viruses. The severe acute respiratory syndrome (SARS) 3C-like protease (3CL) consists of two distinct folds, namely the N-terminal chymotrypsin fold containing domains I and II, which hosts the complete catalytic machinery, and the C-terminal extra helical domain III, which is unique to the coronavirus 3CL proteases. The structure of a 3CL, CoV M-pro, has been solved. It is a dimer in which each subunit is composed of three domains, I, II and III. Domains I and II consist of six-stranded antiparallel beta barrels and together resemble the architecture of chymotrypsin and of picornavirus 3C proteinases. The substrate-binding site is located in a cleft between these two domains. The catalytic site is situated at the centre of the cleft. A long loop connects domain II to the C-terminal domain (domain III). This latter domain, a globular cluster of five helices, has been implicated in the proteolytic activity of M-pro. In the active site of M-pro, Cys and His form a catalytic dyad. In contrast to serine proteinases and other cysteine proteinases, which have a catalytic triad, there is no third catalytic residue present. Many drugs have been developed to inhibit CoV M-pro.
Positive-stranded RNA (+RNA) viruses that belong to the order Nidovirales infect a wide range of vertebrates (families Arteriviridae and Coronaviridae) or invertebrates (Mesoniviridae and Roniviridae). Examples of nidoviruses with high economic and societal impact are the arterivirus porcine reproductive and respiratory syndrome virus (PRRSV) and the zoonotic coronaviruses (CoVs) causing severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS) and Covid-19 (SARS-CoV-2) in humans. The replicase gene encodes two polyproteins, pp1a and pp1ab, which are proteolytically processed to nonstructural proteins (NSPs). Among the NSPs found in Nidovirales, nonstructural protein 15 (NSP15) from coronaviruses (CoV) and NSP11 from arteriviruses (AV) participate in the viral replication process and in the evasion of the host immune system. They contain in their C-terminal region a conserved endoribonuclease domain called nidoviral uridylate-specific endoribonuclease (NendoU) with cleavage specificity for single- and double-stranded RNA 5' of uridine nucleotides to produce a 2'-3'-cyclic phosphate end product. Arterivirus NSP11 contains two conserved compact domains: the N-terminal domain (NTD) and C-terminal domain (NendoU), whereas CoV NSP15 folds into three domains: N-terminal, middle domain, and C-terminal catalytic NendoU domain. No counterpart corresponding to the NTD of CoV NSP15 exists in AV NSP11. The NTD of AV NSP11 is small and related to the NSP15 middle domain, which may serve as an interaction hub with other proteins and RNA. This domain contains a central β-sheet flanked by two small α-helices on either side.
During the development of the vertebrate nervous system, many neurons
become redundant (because they have died, failed to connect to target
cells, etc.) and are eliminated. At the same time, developing neurons send
out axon outgrowths that contact their target cells. Such cells control
their degree of innervation (the number of axon connections) by the
secretion of various specific neurotrophic factors that are essential for
neuron survival. One of these is nerve growth factor (NGF or beta-NGF), a
vertebrate protein that stimulates division and differentiation of
sympathetic and embryonic sensory neurons. NGF is mostly found outside the
central nervous system (CNS), but slight traces have been detected in adult
CNS tissues, although a physiological role for this is unknown; it has also
been found in several snake venoms. NGF is a protein of about 120 residues
that is cleaved from a larger precursor molecule. It contains six cysteines,
all involved in intrachain disulphide bonds. A schematic representation of
the structure of NGF is shown below:
                            +------------------------+
                            |                        |
                            |                        |
xxxxxxCxxxxxxxxxxxxxxxxxxxxxCxxxxCxxxxxCxxxxxxxxxxxxxCxCxxxx
      |                          |     |               |
      +--------------------------|-----+               |
                                 +---------------------+
'C': conserved cysteine involved in a disulphide bond.
This entry also contains NGF-related proteins such as neurotrophin 3, which promotes the survival of visceral and proprioceptive sensory neurons, and brain-derived neurotrophin, which promotes the survival of neuronal populations that are located either in the central nervous system or directly connected to it. This entry covers the central region of the proteins and includes two of the six cysteines involved in disulphide bonds.
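The disulphide connectivity drawn in the schematic can be written down explicitly. This is a minimal sketch, assuming the pairing read off the diagram (first with fourth, second with fifth, third with sixth cysteine); the positions are indices into the schematic string, not residue numbers of any real NGF sequence, and the function names are illustrative.

```python
SCHEMATIC = "xxxxxxCxxxxxxxxxxxxxxxxxxxxxCxxxxCxxxxxCxxxxxxxxxxxxxCxCxxxx"

# Pairing read off the schematic, using 1-based cysteine ordinals.
PAIRING = [(1, 4), (2, 5), (3, 6)]


def cysteine_positions(seq: str) -> list[int]:
    """Return the 0-based index of every 'C' in the schematic string."""
    return [i for i, ch in enumerate(seq) if ch == "C"]


def disulphide_bonds(seq: str, pairing: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Translate cysteine ordinals into pairs of string positions."""
    pos = cysteine_positions(seq)
    return [(pos[a - 1], pos[b - 1]) for a, b in pairing]


positions = cysteine_positions(SCHEMATIC)
bonds = disulphide_bonds(SCHEMATIC, PAIRING)

for left, right in bonds:
    print(f"C at index {left} bonded to C at index {right}")
```

Each bond starts before the previous one ends, so the three bridges interlock rather than nest, which is what the crossing lines in the schematic depict.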
CynR is a LysR-like transcriptional regulator of the cyn operon, which encodes genes that allow cyanate to be used as a sole source of nitrogen. The operon includes three genes in the following order: cynT (cyanate permease), cynS (cyanase), and cynX (a protein of unknown function). CynR negatively regulates its own expression independently of cyanate. CynR binds to DNA and induces bending of DNA in the presence or absence of cyanate, but the amount of bending is decreased by cyanate. CynR, like other LysR-type transcriptional regulators, is composed of two functional domains joined by a linker helix involved in oligomerization: an N-terminal HTH (helix-turn-helix) domain, which is responsible for the DNA-binding specificity, and a C-terminal substrate-binding domain, which is structurally homologous to the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between their two globular subdomains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex composed of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.
This entry represents the C-terminal substrate-binding domain of the LysR-type transcriptional regulator BlaA, which is involved in control of the expression of the beta-lactamase genes blaA and blaB. Beta-lactamases are responsible for bacterial resistance to beta-lactam antibiotics such as penicillins. The blaA gene is located just upstream of blaB in the opposite direction and regulates the expression of blaB. BlaA also negatively auto-regulates the expression of its own gene, blaA. BlaA (a constitutive class A penicillinase) belongs to the LysR family of transcriptional regulators, whereas BlaB (an inducible class C cephalosporinase or AmpC) can be referred to as a penicillin-binding protein but does not act as a beta-lactamase. The topology of this substrate-binding domain is most similar to that of the type 2 periplasmic binding proteins (PBP2). The PBP2 are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between their two globular subdomains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex composed of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the substrate-binding domains from ionotropic glutamate receptors, LysR-like transcriptional regulators, and unorthodox sensor proteins involved in signal transduction.
This entry consists of the voltage-dependent potassium channel beta subunit KCNAB and related proteins. The bacterial proteins in this entry lack apparent alpha subunit partners and are predicted to function as soluble aldo/keto reductase enzymes. Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned to one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
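The K+ selectivity sequence quoted above, (T/SxxTxGxG), translates directly into a regular expression. The following is a minimal sketch, assuming "x" means any residue; the two test sequences are hypothetical fragments made up for illustration and the function name is not from any library.

```python
import re

# (T/S)xxTxGxG as a regular expression; "." stands for any residue ("x").
SELECTIVITY = re.compile(r"[TS]..T.G.G")


def count_p_domains(sequence: str) -> int:
    """Count non-overlapping matches of the selectivity sequence."""
    return len(SELECTIVITY.findall(sequence))


# Hypothetical fragments: one with a single match, one with two.
one_p = "MLAVSTATTVGYGDLYPV"
two_p = one_p + "AAAA" + "LFSLTTIGYGNIAP"

print(count_p_domains(one_p))
print(count_p_domains(two_p))
```

Counting matches per subunit separates, in principle, the one-P-domain subunits from the two-P-domain subunits described in the text, although a real classification would rely on curated signatures rather than this bare motif.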
The KCNAB family (also known as the Kvbeta family) of voltage-dependent potassium channel beta subunits forms complexes with the alpha subunits, which can modify the properties of the channel. Four of these soluble beta subunits form a complex with four alpha subunit cytoplasmic (T1) regions. These subunits belong to the family of NADPH-dependent aldo-keto reductases, and bind NADPH cofactors in their active sites. Changes in the oxidoreductase activity appear to markedly influence the gating mode of Kv channels, since mutations to the catalytic residues in the active site lessen the inactivating activity of KCNAB. The KCNAB family is further divided into 3 subfamilies: KCNAB1 (Kvbeta1), KCNAB2 (Kvbeta2) and KCNAB3 (Kvbeta3).
Signal transduction histidine kinase, TMAO sensor TorS
Type:
Family
Description:
Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more. Two-component systems consist of a sensor histidine kinase (HK) and its cognate response regulator (RR). The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on the RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HKs are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to a histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation, and CheA, which plays a central role in the chemotaxis system. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This entry represents TorS proteins, which are part of a regulatory system for the torCAD operon that encodes the pterin molybdenum cofactor-containing enzyme trimethylamine-N-oxide (TMAO) reductase (TorA), a cognate chaperone (TorD), and a penta-haem cytochrome (TorC). TorS works together with the inducer-binding protein TorT and the response regulator TorR. TorS contains histidine kinase ATPase, HAMP, phosphoacceptor, and phosphotransfer domains and a response regulator receiver domain.
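The multi-step phospho-relay described above can be pictured as a phosphoryl group handed along a chain of sites. The following is a toy model only, a minimal sketch under stated assumptions: the class, function and site names are illustrative, and the aspartate on the internal receiver domain is assumed by analogy with the response regulator aspartate mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """One phosphorylatable site in the relay (names are illustrative)."""
    name: str
    residue: str  # "His" or "Asp"
    phosphorylated: bool = False


def transfer(donor: Site, acceptor: Site) -> None:
    """Move the phosphoryl group from donor to acceptor."""
    if not donor.phosphorylated:
        raise ValueError(f"{donor.name} carries no phosphoryl group")
    donor.phosphorylated = False
    acceptor.phosphorylated = True


def run_relay(sites: list[Site]) -> list[str]:
    """Autophosphorylate the first site, then pass the group down the chain."""
    sites[0].phosphorylated = True  # stands in for ATP-dependent autophosphorylation
    path = [sites[0].name]
    for donor, acceptor in zip(sites, sites[1:]):
        transfer(donor, acceptor)
        path.append(acceptor.name)
    return path


relay = [
    Site("HK kinase core", "His"),
    Site("internal receiver domain", "Asp"),  # Asp assumed by analogy with the RR
    Site("HPt domain", "His"),
    Site("terminal RR receiver", "Asp"),
]
path = run_relay(relay)
final_carrier = [s.name for s in relay if s.phosphorylated]

print(" -> ".join(path))
```

In this picture an orthodox two-component system is the two-site special case (kinase core, then RR receiver), while the hybrid kinase adds the two intermediate sites.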
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Adrenocorticotrophin (ACTH), melanocyte-stimulating hormones (MSH) and
beta-endorphin are peptide products of pituitary pro-opiomelanocortin.
ACTH regulates synthesis and release of glucocorticoids and aldosterone
in the adrenal cortex; it also has a trophic action on these cells.
ACTH and beta-endorphin are synthesised and released in response to
corticotrophin-releasing factor at times of stress (heat, cold, infections,
etc.); their release leads to increased metabolism and analgesia.
MSH has a trophic action on melanocytes, and regulates pigment production
in fish and amphibia. The ACTH receptor is found in high levels in
the adrenal cortex; binding sites are present in lower levels in the
CNS. The MSH receptor is expressed in high levels in melanocytes,
melanomas and their derived cell lines. Receptors are found in low
levels in the CNS. MSH regulates temperature control in the septal region
of the brain and releases prolactin from the pituitary.
A further gene, which encodes a melanocortin receptor that is functionally
distinct from the ACTH and MSH receptors, has also been characterised.
The protein contains ~300 amino acids, with a calculated molecular mass of
~36 kDa, and potential N-linked glycosylation and phosphorylation sites.
The melanocortin 4 receptor (MC4-R) is regulated by opiate
administration. Rat MC4-R is 95% identical to human MC4-R, and the
potency of melanocortin peptides to stimulate cAMP production is similar in
these two species homologues. Expression of MC4-R mRNA was found to be
enriched in the striatum, nucleus accumbens, and periaqueductal gray, all
of which are regions implicated in the behavioral effects of opiates
(and are regions in which MC1-, MC3- and MC5-R are expressed at low or
undetectable levels). MC4-R mRNA has been found in multiple sites in
virtually every brain region, including the cortex, thalamus, hypothalamus,
brainstem, and spinal cord. Unlike the MC3-R, MC4-R mRNA is found in
both parvicellular and magnocellular neurons of the paraventricular nucleus
of the hypothalamus, suggesting a role in the central control of pituitary
function.
Potassium voltage-gated channel subfamily E member 1
Type:
Family
Description:
KCNE1 (Potassium voltage-gated channel subfamily E member 1, also known as MinK) subunits associate with KCNQ1 alpha subunits to form channels that are responsible for the IKs currents that determine the duration of the action potential in cardiac muscle. Mutations in both of the genes encoding these subunits cause an inherited disorder that increases the risk of death from cardiac arrhythmia (long QT syndrome type 1) and Jervell and Lange-Nielsen syndrome, associated with congenital deafness.
Two types of beta subunit (KCNE and KCNAB) are presently known to associate with voltage-gated alpha subunits (Kv, KCNQ and eag-like). However, not all combinations of alpha and beta subunits are possible. The KCNE family of K+ channel subunits are membrane glycoproteins that possess a single transmembrane (TM) domain. They share no structural relationship with the alpha subunit proteins, which possess pore forming domains. The subunits appear to have a regulatory function, modulating the kinetics and voltage dependence of the alpha subunits of voltage-dependent K+ channels. KCNE subunits are formed from short polypeptides of ~130 amino acids, and are divided into five subfamilies: KCNE1 (MinK/IsK), KCNE2 (MiRP1), KCNE3 (MiRP2), KCNE4 (MiRP3) and KCNE1L (AMMECR2).
Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group.
These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis.
All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
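The K+ selectivity sequence quoted above, (T/SxxTxGxG), can be read as a simple pattern: T or S, two arbitrary residues, T, one arbitrary residue, G, one arbitrary residue, G. The following is a minimal sketch of how such a pattern might be searched for in a one-letter amino acid string. The function name `find_selectivity_motifs` and the toy fragment are hypothetical and serve only as illustration; a hit from such a short, degenerate pattern is a candidate at best, not evidence of a functional pore loop.

```python
import re

# Regex form of the (T/SxxTxGxG) pattern from the text; "x" becomes "." (any residue).
# The lookahead makes overlapping occurrences visible as separate hits.
SELECTIVITY_PATTERN = re.compile(r"(?=([TS]..T.G.G))")


def find_selectivity_motifs(sequence):
    """Return (1-based start position, matched residues) for each pattern hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SELECTIVITY_PATTERN.finditer(seq)]


# Toy fragment for illustration only, not a database record.
toy_fragment = "AAAATMTTVGYGDAAAA"
print(find_selectivity_motifs(toy_fragment))
```

Positions are reported 1-based, as is conventional for protein sequences.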
Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group.
These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis.
All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels.
Two types of beta subunit (KCNE and KCNAB) are presently known to associate with voltage-gated alpha subunits (Kv, KCNQ and eag-like). However, not all combinations of alpha and beta subunits are possible. The KCNE family of K+ channel subunits are membrane glycoproteins that possess a single transmembrane (TM) domain. They share no structural relationship with the alpha subunit proteins, which possess pore forming domains. The subunits appear to have a regulatory function, modulating the kinetics and voltage dependence of the alpha subunits of voltage-dependent K+ channels. KCNE subunits are formed from short polypeptides of ~130 amino acids, and are divided into five subfamilies: KCNE1 (MinK/IsK), KCNE2 (MiRP1), KCNE3 (MiRP2), KCNE4 (MiRP3) and KCNE1L (AMMECR2).
KCNE3 is known to associate with the pore forming subunits KCNQ1, KCNQ4, HERG and Kv3.4. KCNE3 forms complexes with Kv3.4 in skeletal muscle; KCNE3 mutations have been identified in families with skeletal muscle disorders. In the intestine, KCNE3 associates with KCNQ1 to form channels that are stimulated by cAMP and are thought to be involved in secretory diarrhoea and cystic fibrosis.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the receptors have very similar amino acid sequences and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Adrenocorticotrophin (ACTH), melanocyte-stimulating hormones (MSH) and
beta-endorphin are peptide products of pituitary pro-opiomelanocortin. ACTH regulates synthesis and release of glucocorticoids and aldosterone in the adrenal cortex; it also has a trophic action on these cells. ACTH and beta-endorphin are synthesised and released in response to corticotrophin-releasing factor at times of stress (heat, cold, infections, etc.); their release leads to increased metabolism and analgesia. MSH has a trophic action on melanocytes, and regulates pigment production in fish and amphibia. The ACTH receptor is found in high levels in the adrenal cortex; binding sites are present in lower levels in the CNS. The MSH receptor is expressed in high levels in melanocytes, melanomas and their derived cell lines. Receptors are found in low levels in the CNS. MSH regulates temperature control in the septal region of the brain and releases prolactin from the pituitary.
A further gene, which encodes a melanocortin receptor that is functionally distinct from the ACTH and MSH receptors, has also been characterised. The protein contains ~300 amino acids, with calculated molecular mass of ~36 kDa, and potential N-linked glycosylation and phosphorylation sites. The melanocortin 5 receptor (MC5-R) mediates an increase in cAMP accumulation with a characteristic pharmacology. Very low expression levels have been detected in brain, while high levels are found in adrenals, stomach, lung and spleen. In situ hybridisation studies have also shown the MC5 receptor to be expressed in the three layers of the adrenal cortex, predominantly in the aldosterone-producing zona glomerulosa cells. Structure-activity studies have indicated that N- and C-terminal portions of alpha-MSH appear to be key determinants in the activation of mouse MC5R, while the melanocortin core heptapeptide sequence is devoid of pharmacological activity.
This domain includes the C-terminal domain from the fungal alpha aminoadipate reductase enzyme (also known as aminoadipate semialdehyde dehydrogenase), which is involved in the biosynthesis of lysine, as well as the reductase-containing component of the myxochelin biosynthetic gene cluster, MxcG. The mechanism of reduction involves activation of the substrate by adenylation and transfer to a covalently-linked pantetheine cofactor as a thioester. This thioester is then reduced to give an aldehyde (thus releasing the product) and a regenerated pantetheine thiol; in myxochelin biosynthesis this aldehyde is further reduced to an alcohol or converted to an amine by an aminotransferase. This is a fundamentally different reaction from that of the beta-ketoreductase domains of polyketide synthases, which act at a carbonyl two carbons removed from the thioester and form an alcohol as a product. The majority of bacterial sequences containing this domain are non-ribosomal peptide synthetases in which this domain is similarly located proximal to a thiolation domain. In some cases this domain is found at the end of a polyketide synthetase enzyme, but is unlike ketoreductase domains which are found before the thiolase domains. Exceptions to this observed relationship with the thiolase domain include three proteins which consist of stand-alone reductase domains (from Mycobacterium leprae, Anabaena and Streptomyces coelicolor) and one protein (from Nostoc) which contains N-terminal homology with a small group of hypothetical proteins but no evidence of a thiolation domain next to the putative reductase domain.
This family consists of a short-chain dehydrogenase/reductase (SDR) module of multidomain proteins identified as putative polyketide synthases, fatty acid synthases (FAS), and nonribosomal peptide synthases, among others. However, unlike the usual ketoreductase modules of FAS and polyketide synthase, these domains are related to the extended SDRs, and have canonical NAD(P)-binding motifs and an active site tetrad. Extended short-chain dehydrogenases/reductases (SDRs) are distinct from classical SDRs. In addition to the Rossmann fold (alpha/beta folding pattern with a central β-sheet) core region typical of all SDRs, extended SDRs have a less conserved C-terminal extension of approximately 100 amino acids. Extended SDRs are a diverse collection of proteins, and include isomerases, epimerases, oxidoreductases, and lyases; they typically have a TGXXGXXG cofactor binding motif.
SDRs are a functionally diverse family of oxidoreductases that have a single domain with a structurally conserved Rossmann fold, an NAD(P)(H)-binding region, and a structurally diverse C-terminal region. Sequence identity between different SDR enzymes is typically in the 15-30% range; they catalyze a wide range of activities including the metabolism of steroids, cofactors, carbohydrates, lipids, aromatic compounds, and amino acids, and act in redox sensing. Classical SDRs have a TGXXX[AG]XG cofactor binding motif and a YXXXK active site motif, with the Tyr residue of the active site motif serving as a critical catalytic residue (Tyr-151, human 15-hydroxyprostaglandin dehydrogenase numbering). In addition to the Tyr and Lys, there is often an upstream Ser and/or an Asn contributing to the active site, while substrate binding is in the C-terminal region, which determines specificity. The standard reaction mechanism is a 4-pro-S hydride transfer and proton relay involving the conserved Tyr and Lys, a water molecule stabilized by Asn, and nicotinamide. Atypical SDRs generally lack the catalytic residues characteristic of the SDRs, and their glycine-rich NAD(P)-binding motif is often different from the forms normally seen in classical or extended SDRs. Complex (multidomain) SDRs such as ketoreductase domains of fatty acid synthase have a GGXGXXG NAD(P)-binding motif and an altered active site motif (YXXXN). Fungal type ketoacyl reductases have a TGXXXGX(1-2)G NAD(P)-binding motif.
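The SDR motifs named in this description can be collected into a small lookup table, which makes the differences between the classes easier to compare. The sketch below is illustrative only: it rewrites each motif as a regular expression (X becomes any residue, X(1-2) becomes one or two residues) and reports where each occurs in a sequence. The names `SDR_MOTIFS` and `scan_sdr_motifs` and the toy sequence are hypothetical. Because the motifs are short and degenerate, and the classical and fungal patterns can match the same stretch, a hit should be treated as a hint rather than a classification.

```python
import re

# Motifs as quoted in the text, with X rewritten as "." (any residue).
SDR_MOTIFS = {
    "extended cofactor-binding (TGXXGXXG)": r"TG..G..G",
    "classical cofactor-binding (TGXXX[AG]XG)": r"TG...[AG].G",
    "complex cofactor-binding (GGXGXXG)": r"GG.G..G",
    "fungal ketoacyl reductase (TGXXXGX(1-2)G)": r"TG...G.{1,2}G",
    "classical active site (YXXXK)": r"Y...K",
    "complex active site (YXXXN)": r"Y...N",
}


def scan_sdr_motifs(sequence):
    """Map each motif name to the 1-based start positions where it occurs."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in SDR_MOTIFS.items():
        # Wrap in a lookahead so overlapping occurrences are all reported.
        positions = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
        if positions:
            hits[name] = positions
    return hits


# Toy sequence for illustration only, not a real SDR.
toy_sequence = "MKVLVTGAAGFIGSAAAAYAASKAA"
for motif_name, starts in scan_sdr_motifs(toy_sequence).items():
    print(motif_name, starts)
```

Motifs with no occurrence are omitted from the result, so an empty dictionary means none of the listed patterns was found.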
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The secretin-like GPCRs include the receptors for secretin, calcitonin, parathyroid hormone/parathyroid hormone-related peptides and vasoactive intestinal peptide, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins. However, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique '7TM' signature. Their N-terminus is probably located on the extracellular side of the membrane and potentially glycosylated. This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products.
The major physiological role of calcitonin is to inhibit bone resorption, thereby leading to a reduction in plasma Ca2+. Further, it enhances excretion of ions in the kidney, prevents absorption of ions in the intestine, and inhibits secretion in endocrine cells (e.g. pancreas and pituitary). In the CNS, calcitonin has been reported to be analgesic and to suppress feeding and gastric acid secretion. It is used to treat Paget's disease of the bone. Calcitonin receptors are found predominantly on osteoclasts or on immortal cell lines derived from these cells. They are found in lower amounts in the brain (e.g. in hypothalamus and pituitary tissues) and in peripheral tissues (e.g. testes, kidney, liver and lymphocytes). They have also been described in lung and breast cancer cell lines. The predominant signalling pathway is activation of adenylyl cyclase through G proteins, but calcitonin has also been described to have both stimulatory and inhibitory actions on the phosphoinositide pathway.
Calcitonin gene-related peptide (CGRP) is a neuropeptide with diverse biological effects including potent vasodilator activity. Messenger RNA for this receptor is predominantly expressed in the lung and heart, with specific localisation to lung alveolar cells and cardiac myocytes. In the rat lung, it is associated with blood vessels; the gene may therefore play an important role in the maintenance of vascular tone. mRNA is also found in the cerebellum. The ligand for this receptor-like protein remains to be discovered.
This entry consists of various predicted ABC transporter class ATPases. ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.
ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.
The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis.
The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The secretin-like GPCRs include the receptors for secretin, calcitonin, parathyroid hormone/parathyroid hormone-related peptides and vasoactive intestinal peptide, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins. However, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique '7TM' signature. Their N-terminus is probably located on the extracellular side of the membrane and potentially glycosylated. This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products.
Latrophilins are a family of secretin-like GPCRs that can be subdivided into 3 subtypes: LPH1, LPH2 and LPH3. LPH1 is a brain-specific calcium-independent receptor of alpha-latrotoxin (LTX), a neurotoxin. It is the affinity of this form of the receptor for LTX that gives the family its name. LPH2 and LPH3, whilst sharing extensive sequence similarity to LPH1, do not bind LTX. LPH2 is distributed throughout most tissues, whereas LPH3 is also brain-specific. The endogenous ligand(s) for these receptors are at present unknown. Binding of LTX to LPH1 stimulates exocytosis and the subsequent release of large amounts of neurotransmitters from neuronal and endocrine cells. The latrophilins possess up to 7 sites of alternative splicing; the resulting number of possible splice variants leads to a highly variable family of proteins.
Structurally, these proteins have a seven-transmembrane region and a large extracellular N-terminal region which consists of several domains: a rhamnose binding lectin (RBL) domain; an olfactomedin-like (OLF) domain followed by a Serine/Threonine rich domain that is O-linked glycosylated; a hormone binding (HR) domain; and a GPCR Autoproteolysis INducing (GAIN) domain.
This entry represents the C-terminal region of latrophilin.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the receptors have very similar amino acid sequences and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Neurotensin is a 13-residue peptide transmitter, sharing significant
similarity in its 6 C-terminal amino acids with several other neuropeptides, including neuromedin N. This region is responsible for the biological activity, the N-terminal portion having a modulatory role. Neurotensin is distributed throughout the central nervous system, with highest levels in the hypothalamus, amygdala and nucleus accumbens. It induces a variety of effects, including analgesia, hypothermia and increased locomotor activity. It is also involved in regulation of dopamine pathways. In the periphery, neurotensin is found in endocrine cells of the small intestine, where it leads to secretion and smooth muscle contraction.
The existence of 2 neurotensin receptor subtypes, with differing affinities for neurotensin and differing sensitivities to the antihistamine levocabastine, was originally demonstrated by binding studies in rodent brain. Two neurotensin receptors (NT1 and NT2) with such properties have since been cloned and have been found to be G-protein-coupled receptor family members.
The NT1 receptor was cloned in 1990 from rat brain and found to act as a high-affinity, levocabastine-insensitive receptor for neurotensin. The affinity of neurotensin for the receptor could be decreased by both sodium ions and guanosine triphosphate (GTP). The NT1 receptor is expressed predominantly in the brain and intestine. In the brain, expression has been found in the diagonal band of Broca, medial septal nucleus, nucleus basalis magnocellularis, suprachiasmatic nucleus, supramammillary area, substantia nigra and ventral tegmental area. The receptor is also expressed in the dorsal root ganglion neurones of the spinal cord. The predominant response upon activation of the receptor by neurotensin is activation of phospholipase C, causing an increase in intracellular calcium levels. The receptor can also stimulate cAMP formation, MAP kinase activation and the induction of growth related genes, such as krox-24.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Neurotensin is a 13-residue peptide transmitter, sharing significant
similarity in its 6 C-terminal amino acids with several other neuropeptides, including neuromedin N. This region is responsible for the biological activity, the N-terminal portion having a modulatory role. Neurotensin is distributed throughout the central nervous system, with highest levels in the hypothalamus, amygdala and nucleus accumbens. It induces a variety of effects, including analgesia, hypothermia and increased locomotor activity. It is also involved in regulation of dopamine pathways. In the periphery, neurotensin is found in endocrine cells of the small intestine, where it leads to secretion and smooth muscle contraction. The existence of 2 neurotensin receptor subtypes, with differing affinities
for neurotensin and differing sensitivities to the antihistamine levocabastine, was originally demonstrated by binding studies in rodent brain. Two neurotensin receptors (NT1 and NT2) with such properties have since been cloned and have been found to be G-protein-coupled receptor family members []. The NT2 receptor was cloned from rat, mouse and human brains based on its
similarity to the NT1 receptor. The receptor was found to be a low-affinity, levocabastine-sensitive receptor for neurotensin. Unlike the high-affinity
NT1 receptor, NT2 is insensitive to guanosine triphosphate and has low sensitivity to sodium ions []. Highest levels of expression of the receptor are found in the brain, in regions including the olfactory system, cerebral and cerebellar cortices, hippocampus and hypothalamic nuclei. The distribution is distinct from that of the NT1 receptor, with only a few areas (diagonal band of Broca, medial septal nucleus and suprachiasmatic nuclei) expressing both receptor subtypes. The receptor has also been found at lower levels in the kidney, uterus, heart and lung [
]. Activation of the NT2 receptor by non-peptide agonists suggests that the receptor can
couple to phospholipase C, phospholipase A2 and MAP kinase. A functional response to neurotensin, however, is weak [
] or absent, and neurotensin appears to act as an antagonist of the receptor []. It has been suggested that a substance other than neurotensin may act as the natural ligand for this receptor.
This entry represents the active-site-containing domain found in the trypsin family members. The catalytic activity of the serine proteases from the trypsin family is provided by a charge relay system involving an aspartic acid residue hydrogen-bonded to a histidine, which itself is hydrogen-bonded to a serine. The sequences in the vicinity of the active site serine and histidine residues are well conserved in this family of proteases [
]. A partial list of proteases known to belong to the trypsin family is shown below.
Acrosin.
Blood coagulation factors VII, IX, X, XI and XII, thrombin, plasminogen, and protein C.
Cathepsin G.
Chymotrypsins.
Complement components C1r, C1s, C2, and complement factors B, D and I.
Complement-activating component of RA-reactive factor.
Cytotoxic cell proteases (granzymes A to H).
Duodenase I.
Elastases 1, 2, 3A, 3B (protease E), leukocyte (medullasin).
Enterokinase (EC 3.4.21.9) (enteropeptidase).
Hepatocyte growth factor activator.
Hepsin.
Glandular (tissue) kallikreins (including EGF-binding protein types A, B, and C, NGF-gamma chain, gamma-renin, prostate specific antigen (PSA) and tonin).
Plasma kallikrein.
Mast cell proteases (MCP) 1 (chymase) to 8.
Myeloblastin (proteinase 3) (Wegener's autoantigen).
Plasminogen activators (urokinase-type, and tissue-type).
Trypsins I, II, III, and IV.
Tryptases.
All the above proteins belong to family S1 in the classification of peptidases [
] and originate from eukaryotic species. It should be noted that bacterial proteases that belong to family S2A are similar enough in the regions of the active site residues that they can be picked up by the same patterns. These proteases are listed below.
Achromobacter lyticus protease I.
Lysobacter alpha-lytic protease.
Streptogrisin A and B (Streptomyces proteases A and B).
Streptomyces griseus glutamyl endopeptidase II.
Streptomyces fradiae proteases 1 and 2.
Fibronectin is a dimeric glycoprotein composed of disulfide-linked subunits
with a molecular weight of 220-250kDa each. It is involved in cell adhesion,
cell morphology, thrombosis, cell migration, and embryonic differentiation. Fibronectin is a modular protein composed of homologous repeats of three prototypical types of domains known as types I, II, and III [
]. Fibronectin type-III (FN3) repeats are both the largest and the most common of the fibronectin subdomains. Domains homologous to FN3 repeats have been found
in various animal protein families including other extracellular-matrix molecules, cell-surface receptors, enzymes, and muscle proteins [
]. Structures of individual FN3 domains have revealed a conserved β-sandwich fold with one β-sheet containing four strands and the other sheet containing three strands [
]. This fold is topologically very similar to that of Ig-like domains, with a notable difference being the lack of a conserved disulfide bond in FN3 domains. Distinctive hydrophobic core packing and the lack of detectable sequence homology between immunoglobulin and FN3 domains suggest, however,
that these domains are not evolutionarily related []. FN3 exhibits functional as well as structural modularity. Sites of interaction with other molecules have been mapped to short stretches of amino acids such as the Arg-Gly-Asp (RGD) sequence found in various FN3 domains. The RGD sequence is involved in interactions with integrin. Small peptides containing the RGD sequence can modulate a variety of cell adhesion events associated with thrombosis, inflammation, and tumour metastasis. These properties have led to the investigation of RGD peptides and RGD peptide analogues as potential therapeutic agents [
].
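The mapping of interaction sites to short motifs such as RGD can be illustrated with a simple scan. The following is a minimal sketch, assuming a one-letter amino acid string as input; the function name `find_motif` and the example peptide are hypothetical and are not taken from any real FN3 domain.

```python
def find_motif(sequence: str, motif: str = "RGD") -> list[int]:
    """Return the 1-based start position of every occurrence of `motif`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based residue number
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical peptide used only for illustration.
example = "AVTGRGDSPASSKRGDT"
print(find_motif(example))  # prints [5, 14]
```

A plain substring scan only reports where the tripeptide occurs; whether a given RGD is exposed and able to bind integrin depends on the structure and cannot be inferred from the sequence alone.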
Protein protease inhibitors constitute a very important mechanism for regulating proteolytic activity. Serpins (SERine Proteinase INhibitors) belong to MEROPS inhibitor family I4, clan ID. Most serpin family members are indeed serine protease inhibitors, but several have additional cross-class inhibition functions and inhibit cysteine protease family members such as the caspases and cathepsins [
,
]. Others, such as ovalbumin, are incapable of protease inhibition and serve other functions []. The serpins are a functionally diverse family of proteins with a highly conserved structure. Members of the serpin family have been identified in a variety of organisms including animals, viruses, plants [
,
], archaea and bacteria [,
]. Vertebrate serpins are involved in fundamental biological processes such as blood coagulation, complement activation, fibrinolysis, angiogenesis, inflammation and tumor suppression []. A fungal serpin (celpin) has also been characterised and it is thought to protect the cellulose-degrading apparatus (cellulosome) against proteolytic degradation []. Serpins share a highly conserved core structure that is critical for their functioning as serine protease inhibitors [
,
]. Inhibitory serpins comprise several α-helices and β-strands together with an external reactive centre loop (RCL) containing the active site recognised by the target enzyme. The conserved native fold consists of three β-sheets (A, B and C) surrounded by α-helices (up to nine, A-I) and the RCL. Serpins form covalent complexes with target proteases. Their mechanism of protease inhibition is known as irreversible "trapping", in which a rapid conformational change traps the cognate protease in a covalent complex, resulting in permanent inactivation of both the serpin and its cognate protease [].
Deubiquitinating enzymes (DUBs) form a large family of cysteine proteases that can deconjugate ubiquitin or ubiquitin-like proteins (see
) from ubiquitin-conjugated proteins. All DUBs contain a catalytic domain surrounded by one or more subdomains, some of which contribute to target recognition. The ~120-residue DUSP (domain present in ubiquitin-specific proteases) domain is one of these specific subdomains. Single or tandem DUSP domains are located both N- and C-terminal to the ubiquitin carboxyl-terminal hydrolase catalytic core domain (see
) [
]. The DUSP domain displays a tripod-like AB3 fold with a three-helix bundle and a three-stranded anti-parallel β-sheet resembling the legs and seat of the tripod. Conserved residues are predominantly involved in hydrophobic packing interactions within the three α-helices. The most conserved DUSP residues, forming the PGPI motif, are flanked by two long loops that vary both in length and sequence. The PGPI motif packs against the three-helix bundle and is highly ordered [
]. The function of the DUSP domain is unknown but it may play a role in protein/protein interaction or substrate recognition. This domain is associated with ubiquitin carboxyl-terminal hydrolase family 2 (
, MEROPS peptidase family C19). These are a family of 100 to 200 kDa peptides that includes the Ubp1 ubiquitin peptidase from yeast; others include:
Mammalian ubiquitin carboxyl-terminal hydrolase 4 (USP4)
Mammalian ubiquitin carboxyl-terminal hydrolase 11 (USP11)
Mammalian ubiquitin carboxyl-terminal hydrolase 15 (USP15)
Mammalian ubiquitin carboxyl-terminal hydrolase 20 (USP20)
Mammalian ubiquitin carboxyl-terminal hydrolase 32 (USP32)
Vertebrate ubiquitin carboxyl-terminal hydrolase 33 (USP33)
Vertebrate ubiquitin carboxyl-terminal hydrolase 48 (USP48)
Ferredoxin reductase is a member of the flavoprotein pyridine nucleotide cytochrome reductases [
] (FPNCRs) that catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. Ferredoxin reductase catalyses the final step of electron transfer to make NADPH and ATP in plant chloroplasts during photosynthesis. Other family members include:
plant and fungal NAD(P)H:nitrate reductases [
,
]
NADH:cytochrome b5 reductases [
]
NADPH:P450 reductases [
]
NADPH:sulphite reductases [
]
nitric oxide synthases [
]
phthalate dioxygenase reductase [
]
various other flavoproteins
Despite functional similarities, FPNCRs show no sequence similarity to NADPH:adrenodoxin reductases [
], nor to bacterial ferredoxin:NAD reductases and their homologues []. To date, structures for a number of family members have been solved:
Spinacia oleracea (Spinach) ferredoxin:NADP reductase [
]
Burkholderia cepacia (Pseudomonas cepacia) phthalate dioxygenase reductase [
]
Zea mays (Maize) nitrate reductase flavoprotein domain [
]
Sus scrofa (Pig) NADH:cytochrome b5 reductase [
]. In all of them, the FAD-binding domain (N-terminal) has the topology of an anti-parallel β-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel β-sheet with 2 helices on each side) []. Proteins in this family also include benzoyl-CoA oxygenase component A (BoxA), which forms a complex with BoxB that catalyses the aerobic reduction/oxygenation of the aromatic ring of benzoyl-CoA to form 2,3-dihydro-2,3-dihydroxybenzoyl-CoA. BoxA also acts as a reductase that uses NADPH to reduce the oxygenase component BoxB. BoxAB does not act on NADH or benzoate [
].
DNA replication in eukaryotes results from a highly coordinated interaction between proteins, often as part of protein complexes, and the DNA template. One of the key early steps leading to DNA replication is formation of the pre-replication complex, or pre-RC. The pre-RC is formed by the sequential binding of the origin recognition complex (ORC), Cdc6 and Cdt1 proteins, and the MCM complex. Activation of the pre-RC into the initiation complex (IC) is achieved via the action of S-phase kinases, eventually leading to the loading of the replication machinery. Recently, a novel replication complex, GINS (for Go, Ichi, Nii, and San; five, one, two, and three in Japanese), has been identified [
,
]. The precise function of GINS is not known. However, genetic and two-hybrid interactions indicate that it mediates the loading of the enzymatic replication machinery at a step after the action of the S-phase kinases [
]. Furthermore, GINS may be a part of the replication machinery itself, since it is found associated with replicating DNA [,
]. Electron microscopy of GINS shows that it forms a ring-like structure [], reminiscent of the structure of PCNA [], the DNA polymerase delta replication clamp. This observation, coupled with the observed interactions for GINS, indicates that the complex may represent the replication clamp for DNA polymerase epsilon []. The GINS complex is essential for initiation of DNA replication in Xenopus egg extracts [
]. This 100kDa stable complex includes Sld5, Psf1, Psf2, and Psf3. Homologues of these components are found also in other eukaryotes. This family of proteins represents the Psf3 component.
The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence-based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 10 (
) comprises enzymes with two known activities; galactoside 3(4)-L-fucosyltransferase (
) and galactoside 3-fucosyltransferase (
).
The galactoside 3-fucosyltransferases display similarities with the alpha-2 and alpha-6-fucosyltransferases [
]. The biosynthesis of the carbohydrate antigen sialyl Lewis X (sLe(x)) is dependent on the activity of a galactoside 3-fucosyltransferase. This enzyme catalyses the transfer of fucose from GDP-beta-fucose to the 3-OH of N-acetylglucosamine present in lactosamine acceptors []. Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 3(4)-L-fucosyltransferase belongs to the Lewis blood group system and is associated with Le(a/b) antigen.
Ubiquitin-conjugating enzymes (
, UBC or E2 enzymes) catalyse the covalent attachment of ubiquitin to target proteins. Ubiquitin is conjugated to the target protein through the coordinated action of three enzyme activities designated E1, E2, and E3. The E1 or ubiquitin-activating enzyme forms, in an ATP-dependent manner, a thioester linkage between its active site cysteine and the carboxy terminus of ubiquitin. The activated ubiquitin moiety is then transferred from E1 to the active site cysteine in E2 through a trans-thiol esterification reaction. The UBC enzyme later ligates ubiquitin directly to substrate proteins with or without the assistance of 'N-end' recognizing proteins (E3) [
,
,
]. In most species there are many forms of UBC (at least 9 in yeast) which are implicated in diverse cellular functions. A cysteine residue is required for ubiquitin-thiolester formation. There is a single conserved cysteine in UBCs and the region around that residue is conserved in the sequence of known UBC isozymes. The UBC core is an alpha/beta domain containing one four-stranded antiparallel β-sheet and four α-helices (
). Three of these helices flank two opposite edges of the sheet, and one helix lies diagonally across one broad face of the sheet. The other face of the sheet is exposed to solvent. One turn of a 3(10)-helix is located between the fourth strand of the sheet and the second α-helix. The active site cysteine is situated in a segment between the fourth strand of the sheet and the 3(10)-helix [
]. The signature pattern of this entry contains the active-site cysteine and spans the complete catalytic domain.
The PapD-like superfamily of periplasmic chaperones directs the assembly of over 30 diverse adhesive surface organelles that mediate the attachment of many different pathogenic bacteria to host tissues, a critical early step in the development of disease. PapD, the prototypical chaperone, is necessary for the assembly of P pili. P pili contain the adhesin PapG, which mediates the attachment of uropathogenic Escherichia coli to Gal(alpha) Gal receptors present on kidney cells and are critical for the initiation of pyelonephritis. The PapD-like chaperones consist of two Ig-like domains oriented toward each other, forming L-shaped molecules. In the chaperone-subunit complex, the G1beta strand of the chaperone completes an atypical Ig fold in the subunit by occupying the groove and running parallel to the subunit C-terminal F strand. This donor strand complementation interaction simultaneously stabilises pilus subunits and caps their interactive surfaces, preventing their premature oligomerisation in the periplasm. During pilus biogenesis, the highly conserved N-terminal extension of one subunit has been proposed to displace the chaperone G1beta strand from its neighbouring subunit in a mechanism termed donor strand exchange [
]. This entry represents the immunoglobulin (Ig)-like β-sandwich domain found in PapD, as well as in other periplasmic chaperone proteins that include FimC and SfaE from E. coli, and Caf1M from Yersinia pestis [
]. In addition, major sperm proteins (MSP) and other related sperm proteins (such as WR4 and SSP-19) contain an Ig-like domain with a similar structural fold to PapD [,
]. Major sperm proteins are central components in molecular interactions underlying sperm motility, with many isoforms existing in Caenorhabditis elegans.
The SANT domain is a motif of ~50 amino acids present in proteins involved in chromatin-remodelling and transcription regulation. This eukaryotic domain was identified in nuclear receptor co-repressors and named after switching-defective protein 3 (Swi3), adaptor 2 (Ada2), nuclear receptor co-repressor (N-CoR) and transcription factor (TF)IIIB [
]. Although SANT domains show remarkable sequence and structural similarity to the DNA-binding helix-turn-helix (HTH) domain of the myb-like tandem repeat, their function is not DNA binding. Instead, SANT domains are protein-protein interaction modules and some can bind to histone tails (e.g. in Ada2 and SMRT). The SANT domain has been proposed to function as a histone-interaction module that couples histone-tail binding to enzyme catalysis for the remodelling of nucleosomes [,
]. SANT domains are found in combination with other domains, such as the SWIRM domain (
), the ZZ-type zinc finger (see
), the C2H2-type zinc finger, the GATA-type zinc finger (
), the MPN-domain and DEAH ATP-helicase domain.
The 3-dimensional structure of the SANT domain forms three alpha helices [
] similar to the DNA-binding myb-type HTH domain. Because of the strong resemblance, the SANT domain can also be detected as a myb-like "DNA-binding" domain. Most SANT domains have acidic amino acids at the start of helix 2 and in helix 3, while myb-like DNA-binding domains have more positively charged residues, in particular in their third 'recognition' helix. The bulky aromatic and hydrophobic residues in the centre of helix 3 that are incompatible with DNA contacts of myb-like DNA-binding domains form another distinguishing property of SANT domains.
The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria [
,
]. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). There are several classes of these bacterial proteins: they include bacteriorhodopsin and archaerhodopsin, which are light-driven proton pumps; halorhodopsin, a light-driven chloride pump; and sensory rhodopsin, which mediates both photoattractant (in the red) and photophobic (in the UV) responses. Fungi also contain proteins with similarities to opsin. In the Neurospora crassa opsin NOP-1 the chromophore is buried in a pocket within a 7TM structure, and bound by a protonated Schiff base to a lysine. The absorption of green light leads to an all-trans isomerisation of retinal, followed by the deprotonation of the Schiff base, resulting in a near-UV-absorbing intermediate. Archaeal rhodopsins employ this mechanism in order to pump protons over the plasma membrane and act predominantly as light-driven ion transporters. The reaction cycle of NOP-1, however, is far too long (up to seconds) to operate as an effective ion pump, suggesting rather that it has signalling functions [
]. Deletion of nop-1 does not cause any discernible phenotype [,
]. This entry contains two conserved patterns: the first pattern (BACTERIAL_OPSIN_1) corresponds to the third transmembrane region (called helix C) and includes an arginine residue which seems to be involved in the release of a proton from the Schiff base to the extracellular medium; the second pattern (BACTERIAL_OPSIN_RET) includes the retinal-binding lysine [].
Activator protein-2 (AP-2) transcription factors constitute a family of closely related and evolutionarily conserved proteins that bind to the DNA
consensus sequence 5'-GCCNNNGGC-3' and stimulate target gene transcription [,
]. Five different isoforms of AP-2 have been identified in mammals, termed AP-2 alpha, beta, gamma, delta and epsilon. Each family member shares a common structure, possessing a proline/glutamine-rich domain in the N-terminal region, which is responsible for transcriptional activation [], and a helix-span-helix domain in the C-terminal region, which mediates dimerisation and site-specific DNA binding []. The AP-2 family have been shown to be critical regulators of gene expression during embryogenesis. They regulate the development of facial prominence and limb buds, and are essential for cranial closure and development of the lens [
,
]; they have also been implicated in tumorigenesis. AP-2 protein expression levels have been found to affect cell transformation, tumour growth and metastasis, and may predict survival in some types of cancer [,
]. Mutations in human AP-2 have been linked with branchio-oculo-facial syndrome and Char syndrome, congenital birth defects characterised by craniofacial deformities and patent ductus arteriosus, respectively []. AP-2 beta was originally isolated by cDNA screening of a human genomic
library []. The protein was designated AP-2 beta on the basis of its high sequence similarity to AP-2 alpha, its site-specific DNA binding, and its
ability to stimulate transcription []. Defects in AP-2 beta have been shown to cause Char syndrome, an autosomal dominant trait characterised by patent
ductus arteriosus, facial dysmorphism and hand anomalies.
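The consensus sequence 5'-GCCNNNGGC-3' quoted in this entry can be turned into a simple site scanner. The following is a minimal sketch, assuming that N stands for any of the four bases; the function names and the example DNA string are hypothetical, and an exact consensus match is only a rough proxy for real binding.

```python
import re

# Only the symbols needed for this consensus; N is assumed to mean any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N is left as N)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_sites(dna: str, consensus: str = "GCCNNNGGC") -> list[int]:
    """Return 0-based start indices of all (possibly overlapping) matches."""
    pattern = consensus_to_regex(consensus)
    dna = dna.upper()
    last_start = len(dna) - len(consensus)
    return [i for i in range(last_start + 1) if pattern.match(dna, i)]


# Hypothetical DNA fragment used only for illustration.
print(find_sites("TTGCCTGAGGCAA"))  # prints [2]
# The consensus is its own reverse complement, so one strand is enough.
print(reverse_complement("GCCNNNGGC") == "GCCNNNGGC")  # prints True
```

Because the consensus is palindromic, scanning a single strand finds sites on both strands, which fits with the dimeric binding described above.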
Activator protein-2 (AP-2) transcription factors constitute a family of closely related and evolutionarily conserved proteins that bind to the DNA
consensus sequence 5'-GCCNNNGGC-3' and stimulate target gene transcription [,
]. Five different isoforms of AP-2 have been identified in mammals, termed AP-2 alpha, beta, gamma, delta and epsilon. Each family member shares a common structure, possessing a proline/glutamine-rich domain in the N-terminal region, which is responsible for transcriptional activation [], and a helix-span-helix domain in the C-terminal region, which mediates dimerisation and site-specific DNA binding []. The AP-2 family have been shown to be critical regulators of gene expression during embryogenesis. They regulate the development of facial prominence and limb buds, and are essential for cranial closure and development of the lens [
,
]; they have also been implicated in tumorigenesis. AP-2 protein expression levels have been found to affect cell transformation, tumour growth and metastasis, and may predict survival in some types of cancer [,
]. Mutations in human AP-2 have been linked with branchio-oculo-facial syndrome and Char syndrome, congenital birth defects characterised by craniofacial deformities and patent ductus arteriosus, respectively []. AP-2 gamma was originally isolated from murine carcinoma cells [
]. The gene was found to be expressed in several embryonic areas whose development can be affected by retinoids, such as the forebrain, face and limb buds []. A human homologue has also been identified which is involved in the MTA1-mediated epigenetic regulation of ESR1 expression in breast cancer []. The protein was initially termed AP-2.2, but has since been reclassified as AP-2 gamma.
The basic structure of immunoglobulin (Ig) molecules is a tetramer of two light chains and two heavy chains linked by disulphide bonds. There are two types of light chains: kappa and lambda, each composed of a constant domain (CL) and a variable domain (VL). There are five types of heavy chains: alpha, delta, epsilon, gamma and mu, all consisting of a variable domain (VH) and three (in alpha, delta and gamma) or four (in epsilon and mu) constant domains (CH1 to CH4). Ig molecules are highly modular proteins, in which the variable and constant domains have clear, conserved sequence patterns. The domains in Ig and Ig-like molecules are grouped into four types: V-set (variable;
), C1-set (constant-1;
), C2-set (constant-2;
) and I-set (intermediate;
) [
]. Structural studies have shown that these domains share a common core Greek-key β-sandwich structure, with the types differing in the number of strands in the β-sheets as well as in their sequence patterns [,
]. Immunoglobulin-like domains that are related in both sequence and structure can be found in several diverse protein families. Ig-like domains are involved in a variety of functions, including cell-cell recognition, cell-surface receptors, muscle structure and the immune system [
]. This entry represents the C2-set type domains found in the T-cell antigen CD80, as well as in related proteins. CD80 (B7-1) is a glycoprotein expressed on antigen-presenting cells [
]. The shared ligands on CD80 and CD86 (B7-2) deliver the co-stimulatory signal through CD28 and CTLA-4 on T-cells, where CD28 augments the T-cell response and CTLA-4 attenuates it [].
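The chain composition given at the start of this entry (two light and two heavy chains, with three or four constant domains per heavy chain) fixes the number of Ig domains in one tetramer. The following is a small arithmetic sketch of that count; the names `CONSTANT_DOMAINS` and `domains_per_tetramer` are hypothetical.

```python
# Constant domains per heavy chain, as stated in the text:
# three in alpha, delta and gamma; four in epsilon and mu.
CONSTANT_DOMAINS = {"alpha": 3, "delta": 3, "gamma": 3, "epsilon": 4, "mu": 4}


def domains_per_tetramer(heavy_chain: str) -> int:
    """Count Ig domains in one tetramer of two light and two heavy chains."""
    light_chain_domains = 1 + 1  # VL + CL
    heavy_chain_domains = 1 + CONSTANT_DOMAINS[heavy_chain]  # VH + CH1..CHn
    return 2 * light_chain_domains + 2 * heavy_chain_domains


print(domains_per_tetramer("gamma"))  # prints 12
print(domains_per_tetramer("mu"))     # prints 14
```

The count covers a single tetramer only; it does not account for the multimeric assemblies that some isotypes form.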
Fibronectin is a dimeric glycoprotein composed of disulfide-linked subunits
with a molecular weight of 220-250kDa each. It is involved in cell adhesion, cell morphology, thrombosis, cell migration, and embryonic differentiation. Fibronectin is a modular protein composed of homologous repeats of three
prototypical types of domains known as types I, II, and III []. Fibronectin type-III (FN3) repeats are both the largest and the most common of the fibronectin subdomains. Domains homologous to FN3 repeats have been found in various animal protein families including other extracellular-matrix
molecules, cell-surface receptors, enzymes, and muscle proteins []. Structures of individual FN3 domains have revealed a conserved β-sandwich fold with one β-sheet containing four strands and the other sheet containing three strands [
]. This fold is topologically very similar to that of Ig-like domains, with a notable difference being the lack of a conserved disulfide bond in FN3 domains. Distinctive hydrophobic core packing and the lack of detectable sequence homology between immunoglobulin and FN3 domains suggest, however,
that these domains are not evolutionarily related []. FN3 exhibits functional as well as structural modularity. Sites of interaction with other molecules have been mapped to short stretches of amino acids such as the Arg-Gly-Asp (RGD) sequence found in various FN3 domains. The RGD sequence is involved in interactions with integrin. Small peptides containing the RGD sequence can modulate a variety of cell adhesion events associated with thrombosis, inflammation, and tumour metastasis. These properties have led to the investigation of RGD peptides and RGD peptide analogues as potential therapeutic agents [
].
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes [
]. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base [
]. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [,
]. This entry represents Apicomplexa rhomboid-like proteins (
), Rom4/Rom5, which are members of the S54 peptidase family of proteins [
]. These proteins are putative serine proteases involved in intra-membrane proteolysis and the subsequent release of polypeptides from their membrane anchors. They cleave type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains.
The serum paraoxonases/arylesterases are enzymes that catalyse the hydrolysis of the toxic metabolites of a variety of organophosphorus insecticides. The
enzymes hydrolyse a broad spectrum of organophosphate substrates, including paraoxon and a number of aromatic carboxylic acid esters (e.g., phenyl
acetate), and hence confer resistance to organophosphate toxicity. Mammals have 3 distinct paraoxonase types, termed PON1-3. In mice and
humans, the PON genes are found on the same chromosome in close proximity.
PON activity has been found in a variety of tissues, with the highest levels in liver and serum; the source of serum PON is thought to be the liver. Unlike mammals, fish and avian species lack paraoxonase activity.
Human and rabbit PONs appear to have two distinct Ca2+ binding sites, one required for stability and one required for catalytic activity. The Ca2+
dependency of PONs suggests a mechanism of hydrolysis where Ca2+ acts as the electrophilic catalyst, like that proposed for phospholipase A2. The
paraoxonase enzymes PON1 and PON3 are high density lipoprotein (HDL)-associated proteins capable of preventing oxidative modification of low
density lipoproteins (LDL). Although PON2 has antioxidant properties, the enzyme does not associate with HDL.
Within a given species, PON1, PON2 and PON3 share ~60% amino acid sequence identity, whereas between mammalian species particular PONs (1, 2 or 3) share
79-90% identity at the amino acid level. Human PON1 and PON3 share numerous conserved phosphorylation and N-glycosylation sites; however, it is not
known whether the PON proteins are modified at these sites, or whether modification at these sites is required for activity in vivo.
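Identity figures such as the ~60% and 79-90% quoted above are derived from pairwise alignments. The following is a minimal sketch of one way such a figure can be computed, assuming two sequences that have already been aligned to equal length with `-` as the gap character. The function name is hypothetical, and the choice of denominator (gap-free columns only) is an assumption; other conventions exist and give slightly different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared
```

With the toy alignment `percent_identity("MAKLT-GA", "MAKITWGA")`, seven columns are compared and six match, giving roughly 85.7%. The sequences are invented for illustration and are not PON fragments.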
Members of this group are predicted signal transduction proteins containing a cytoplasmic sensor domain, GAF, and an RNA-binding anti-terminator ANTAR domain. In members of this group, regulation/signal receiving is predicted to be performed by the GAF domain. GAF is a ubiquitous signalling/sensory domain. It was originally described as a non-catalytic cGMP-binding domain conserved in cyclic nucleotide phosphodiesterases. Subsequently, this domain was recognised in cyanobacterial adenylate cyclases, histidine kinases and certain other proteins. It has been predicted to regulate catalytic activities allosterically via binding ligands, such as nucleotides and small molecules. ANTAR is a transcriptional anti-terminator domain and is most often found fused to the CheY-like receiver domain to form response regulator anti-terminators. Superficially, the coiled-coil and three-helix bundle that form this domain in AmiR appear radically different from the compact HTH DNA-binding domain of the NarL protein. However, the last three helices in AmiR are very similar in length and hydropathy profiles to those of NarL and its homologues, and are arranged in a very similar topology, suggesting an evolutionary relationship. These C-terminal helices of AmiR appear to be essential for its transcription anti-termination activity. However, helix-turn-helix domains like those in NarL or OmpR are adapted to sequence-specific binding in the major groove of double-stranded B-form DNA. It is not clear how such a structure might function in a protein whose role is to prevent the formation of a termination stem-loop structure by binding single-stranded RNA.
Myotubularin-related protein 7 (MTMR7) is a member of the myotubularin (MTM) family. MTMR9 is a binding partner of MTMR7, and binding of MTMR9 increases the phosphatase activity of MTMR7. MTMR9 and MTMR7 may be involved in regulating T-helper (Th) cell differentiation. The myotubularin family constitutes a large group of conserved proteins, with 14 members in humans consisting of myotubularin (MTM1) and 13 myotubularin-related proteins (MTMR1-MTMR13). Orthologues have been found throughout the eukaryotic kingdom, but not in bacteria. MTM1 dephosphorylates phosphatidylinositol 3-monophosphate (PI3P) to phosphatidylinositol and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2] to phosphatidylinositol 5-monophosphate (PI5P). The substrate phosphoinositides (PIs) are known to regulate traffic within the endosomal-lysosomal pathway. MTMR1, MTMR2, MTMR3, MTMR4, and MTMR6 have also been shown to utilise PI(3)P as a substrate, suggesting that this activity is intrinsic to all active family members. On the other hand, six of the MTM family members are catalytically inactive phosphatases. Inactive myotubularin phosphatases contain substitutions in the Cys and Arg residues of the Cys-X5-Arg motif. MTM pseudophosphatases have been found to interact with MTM catalytic phosphatases. The myotubularin family includes several members mutated in neuromuscular diseases or associated with metabolic syndrome, obesity, and cancer. MTMR7 contains an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID), an active PTP domain, a SET-interaction domain, and a C-terminal coiled-coil region. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. This entry represents the PH-GRAM domain of MTMR7.
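The Cys-X5-Arg motif mentioned above, whose Cys and Arg residues are substituted in the inactive family members, lends itself to a simple sequence scan. The sketch below is an assumption-laden illustration: the function name is hypothetical, the test fragments are toy sequences, and finding the motif is only a first filter, not evidence of catalytic activity.

```python
import re

# Cys, any five residues, Arg. The lookahead lets overlapping matches be reported.
CX5R = re.compile(r"(?=(C.{5}R))")


def find_cx5r(sequence):
    """Return (1-based start position, matched segment) for each Cys-X5-Arg hit."""
    return [(m.start() + 1, m.group(1)) for m in CX5R.finditer(sequence.upper())]
```

A fragment such as `"AAVHCSDGWDRTGG"` yields one hit starting at position 5, whereas the same fragment with the Cys and Arg replaced (`"AAVHLSDGWDGTGG"`) yields none, which mirrors the substitutions described for the pseudophosphatases.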
This superfamily represents the receptor-binding domain (RBD) of alpha-2-macroglobulin proteins. The RBD is located at the C terminus; its structure has an immunoglobulin-like fold consisting of a sandwich of nine strands in two sheets with a Greek-key topology. The alpha-macroglobulin (aM) family of proteins includes protease inhibitors, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a 'bait region' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the β-cysteinyl-γ-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain (RBD). RBD exposure allows the aM-protease complex to bind to clearance receptors and be removed from circulation. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified.
The anti-apoptotic protein p35 from baculovirus is thought to prevent the suicidal response of infected insect cells by inhibiting caspases. Ectopic expression of p35 in a number of transgenic animals or cell lines is also anti-apoptotic, giving rise to the hypothesis that the protein is a general inhibitor of caspases. This protein belongs to MEROPS proteinase inhibitor family I50, clan IQ. Purified recombinant p35 inhibits human caspase-1, -3, -6, -7, -8, and -10 but does not significantly inhibit unrelated serine or cysteine proteases, implying that p35 is a potent caspase-specific inhibitor. The interaction of p35 with caspase-3, as a model of the inhibitory mechanism, revealed classic slow-binding inhibition, with both active sites of the caspase-3 dimer acting equally and independently. Inhibition resulted from complex formation between the enzyme and inhibitor, which could be visualised under non-denaturing conditions, but was dissociated by SDS to give p35 cleaved at Asp87, the P1 residue of the inhibitor. Complex formation requires the substrate-binding cleft to be unoccupied. Infecting the insect cell line IPLB-Ld652Y with the baculovirus Autographa californica nuclear polyhedrosis virus (AcMNPV) results in global translation arrest, which correlates with the presence of the AcMNPV apoptotic suppressor, p35. However, the function of p35 in translation arrest is not solely due to caspase inactivation; its activity also enhances signalling to a separate translation arrest pathway, possibly by stimulating the late stages of the baculovirus infection cycle. The baculovirus p35 structure forms a sandwich composed of 14 strands in 2 sheets with a Greek-key topology.
Aurora kinase A (AURKA, also known as Aurora 2) is a mitotic serine/threonine kinase that contributes to the regulation of cell cycle progression. It associates with the centrosome and the spindle microtubules during mitosis and plays a critical role in regulating centrosome maturation and separation and bipolar spindle assembly. It also plays an important role in spindle checkpoint regulation. Aurora A promotes mitotic entry by controlling activation of Cyclin-B/Cdk-1. It regulates the progression of mitosis by phosphorylation of multiple substrates, such as Polo-like kinase-1 (PLK-1), ajuba, enhancer of filamentation 1, BORA, TPX2, astrin, growth arrest and DNA damage-inducible 45alpha, transforming acidic coiled-coil containing protein 3 (TACC3) and centrosomin. During mitotic exit, AURKA is targeted for degradation through its interaction with the multi-subunit E3-ubiquitin ligase anaphase promoting complex/cyclosome (APC/C). The Aurora kinases are highly conserved serine/threonine kinases that regulate chromosomal alignment and segregation during mitosis and meiosis. Three mammalian Aurora kinases, Aurora A, B and C, have been identified. They all contain a protein kinase domain and a destruction box (D-box) recognised by the APC/C, which mediates their proteasomal degradation. However, their N-terminal domains share little sequence identity and confer unique protein-protein interaction abilities among the Aurora kinases. They are differentially expressed at high levels in rapidly dividing tissues such as hematopoietic cells (A and B) and germ cells (C only). Their expression is low or absent in most adult tissues due to their lower rates of proliferation.
This entry represents the receptor-binding domain (RBD) of alpha-2-macroglobulin and related proteins. The RBD is located at the C terminus; its structure has an immunoglobulin-like fold consisting of a sandwich of nine strands in two sheets with a Greek-key topology. The alpha-macroglobulin (aM) family of proteins includes protease inhibitors, typified by the human tetrameric a2-macroglobulin (a2M); they belong to the MEROPS proteinase inhibitor family I39, clan IL. These protease inhibitors share several defining properties, which include (i) the ability to inhibit proteases from all catalytic classes, (ii) the presence of a 'bait region' and a thiol ester, (iii) a similar protease inhibitory mechanism and (iv) the inactivation of the inhibitory capacity by reaction of the thiol ester with small primary amines. aM protease inhibitors inhibit by steric hindrance. The mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates. Two additional events occur as a consequence of bait region cleavage, namely (i) the β-cysteinyl-γ-glutamyl thiol ester becomes highly reactive and (ii) a major conformational change exposes a conserved COOH-terminal receptor binding domain (RBD). RBD exposure allows the aM-protease complex to bind to clearance receptors and be removed from circulation. Tetrameric, dimeric, and, more recently, monomeric aM protease inhibitors have been identified.
This entry includes the R3H domain of the NF-kappaB-repression factor (NRF). NRF is a nuclear inhibitor of NF-kappaB proteins that can silence the IFNbeta promoter via binding to a negative regulatory element (NRE). Besides the R3H domain, NRF also contains a G-patch domain. The R3H domain is a conserved sequence motif found in proteins from a diverse range of organisms including eubacteria, green plants, fungi and various groups of metazoans, but not in archaea or Escherichia coli. The domain is named R3H because it contains an invariant arginine and a highly conserved histidine that are separated by three residues. It also displays a conserved pattern of hydrophobic residues, prolines and glycines. It can be found alone, in association with an AAA domain, or with various DNA/RNA binding domains like DSRM, KH, G-patch, PHD, DEAD box, or RRM. The functions of these domains indicate that the R3H domain might be involved in polynucleotide binding, including DNA, RNA and single-stranded DNA. The 3D structure of the R3H domain has been solved. The fold presents a small motif, consisting of a three-stranded antiparallel β-sheet, against which two α-helices pack from one side. This fold is related to the structures of the YhhP protein and the C-terminal domain of the translational initiation factor IF3. Three conserved basic residues cluster on the same face of the R3H domain and could play a role in nucleic acid recognition. An extended hydrophobic area at a different site of the molecular surface could act as a protein-binding site.
This entry represents the dimerization/docking (D/D) domain of RIbeta, the Type I beta Regulatory subunit of cAMP-dependent protein kinase. RIbeta is expressed highly in the brain and is associated with hippocampal function. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence, and two cAMP binding domains. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalysing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active catalytic subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signalling. RI subunits are pseudo-substrates as, unlike RII subunits, they do not contain a phosphorylation site in their inhibitory site. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis.
Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium, nickel, etc. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds. An empirical classification into three classes has been proposed by Fowler and coworkers and Kojima. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as MTs from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units. This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Subsequently, a new classification was proposed on the basis of sequence similarity derived from phylogenetic relationships, which basically proposes an MT family for each main taxonomic group of organisms. Echinoidea (sea urchin, family 4) MTs are 64-67 residue proteins. Members of this family are recognised by the sequence pattern P-D-x-K-C-[V,F]-C-C-x(5)-C-x-C-x(4)-C-C-x(4)-C-C-x(4,6)-C-C located near the N terminus. The taxonomic range of the members extends to sea urchins (Echinoidea).
The protein sequence is divided into two structural domains, containing 9 and 11 Cys residues and binding 3 and 4 bivalent metal ions, respectively. Family 4 includes the subfamilies e1 and e2, which are separate phylogenetic groups. This entry includes the sea urchin proteins, and related sequences from worms.
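The family 4 sequence pattern quoted above is written in a PROSITE-like notation, which can be translated mechanically into a regular expression for scanning sequences. The sketch below is a minimal converter that handles only the elements appearing in that pattern (single residues, `x`, bracketed alternatives, and repeat counts); the function and constant names are hypothetical, and it is not a full implementation of the PROSITE syntax.

```python
import re

# One pattern element: 'x', a residue letter, or a bracketed set,
# optionally followed by a repeat count such as (5) or (4,6).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z,]+\])(?:\((\d+)(?:,(\d+))?\))?")


def pattern_to_regex(pattern):
    """Translate a hyphen-separated PROSITE-like pattern into a regex string."""
    pieces = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        # 'x' means any residue; commas inside brackets are dropped.
        residue = "." if residue == "x" else residue.replace(",", "")
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        pieces.append(residue + repeat)
    return "".join(pieces)


# The pattern as quoted in the text.
SEA_URCHIN_MT = "P-D-x-K-C-[V,F]-C-C-x(5)-C-x-C-x(4)-C-C-x(4)-C-C-x(4,6)-C-C"
```

Calling `pattern_to_regex(SEA_URCHIN_MT)` produces a string that can be passed to `re.search` to locate candidate matches in a protein sequence. A match indicates only that the cysteine spacing is compatible with the pattern.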
The yeast fatty acid synthase (FAS) is a hexameric complex (alpha 6 beta 6) of two multifunctional proteins, alpha and beta. The alpha subunit contains two of the seven enzymatic activities required for the synthesis of fatty acids, together with the site for attachment of the prosthetic group 4'-phosphopantetheine. The beta subunit contains the remaining five enzyme domains: acetyltransferase and malonyltransferase, S-acyl fatty acid synthase thioesterase, enoyl-[acyl-carrier protein] reductase, and 3-hydroxypalmitoyl-[acyl-carrier protein] dehydratase. The sequential order of the five FAS1-encoded enzyme domains is co-linear in Yarrowia lipolytica (Candida lipolytica) and Saccharomyces cerevisiae (Baker's yeast), an observation consistent with evidence that the functional organisation of FAS genes is similar in related organisms but differs between unrelated species. Sterigmatocystin (ST) and the aflatoxins (AFs) (related fungal secondary metabolites) are among the most toxic, mutagenic and carcinogenic natural products known. In Emericella nidulans (Aspergillus nidulans), the ST biosynthetic pathway is believed to involve at least 15 enzymatic activities; some Aspergillus parasiticus, Aspergillus flavus and Aspergillus nomius strains contain additional activities that convert ST to AF. A 60 kb region of the A. nidulans genome has been characterised and found to contain virtually all of the genes needed for ST biosynthesis. The deduced polypeptide sequences of regions within this cluster share a high degree of similarity with enzymes that have activities predicted for ST/AF biosynthesis, including a polyketide synthase, a fatty acid synthase (alpha and beta subunits), five monooxygenases, four dehydrogenases, an esterase, an O-methyltransferase, a reductase, an oxidase and a zinc cluster DNA binding protein.
Bacteria such as Brevundimonas diminuta (Pseudomonas diminuta) harbour a plasmid that carries the gene for phosphotriesterase (PTE, also known as parathion hydrolase). This enzyme has attracted interest because of its potential use in the detoxification of chemical waste and organophosphate warfare agents such as VX, soman, and sarin, and its ability to degrade agricultural pesticides such as parathion. It acts specifically on synthetic organophosphate triesters and phosphorofluoridates. It does not seem to have a naturally occurring substrate and may thus have optimally evolved for utilising paraoxon.
PTE exists as a homodimer with one active site per monomer. The active site is located next to a binuclear metal centre, at the C-terminal end of a TIM alpha-beta barrel motif. The native enzyme contains two zinc ions at the active site; however, these can be replaced with other metals such as cobalt, cadmium, nickel or manganese and the enzyme remains active.
PTE belongs to a family of enzymes that possess a binuclear zinc metal centre at their active site. The two zinc ions are coordinated by six different residues, four of which are histidines. This family so far includes, in addition to the parathion hydrolase, the following proteins:
Sulfolobus solfataricus aryldialkylphosphatase, which has a low paraoxonase activity.
E. coli php (phosphotriesterase homology) protein. The substrate of php is not yet known.
Mycobacterium tuberculosis phosphotriesterase homology protein Rv0230C.
Phospho-furanose lactonase from Mycoplasma.
Animal phosphotriesterase related protein (PTER) (RPR-1).
FGD6 is a member of the FGD family. It has been found to coordinate cell polarity and endosomal membrane recycling in osteoclasts. This entry represents the N-terminal PH domain of FGD6. FGDs have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases and adaptors, as well as in cytoskeletal associated molecules and lipid associated enzymes.
Hemopexin is a serum glycoprotein that binds haem and transports it to the liver for breakdown and iron recovery, after which the free hemopexin returns to the circulation. Hemopexin prevents haem-mediated oxidative stress. Structurally, hemopexin consists of two similar halves of approximately two hundred amino acid residues connected by a histidine-rich hinge region. Each half is itself formed by the repetition of a basic unit of some 35 to 45 residues. Hemopexin-like domains have been found in two other types of proteins: vitronectin, a cell adhesion and spreading factor found in plasma and tissues, and matrixins MMP-1, MMP-2, MMP-3, MMP-9, MMP-10, MMP-11, MMP-12, MMP-14, MMP-15 and MMP-16, members of the matrix metalloproteinase family that cleave extracellular matrix constituents. These zinc endopeptidases, which belong to MEROPS peptidase subfamily M10A, have a single hemopexin-like domain in their C-terminal section. It is suggested that the hemopexin domain facilitates binding to a variety of molecules and proteins; for example, the HX repeats of some matrixins bind tissue inhibitors of metallopeptidases (TIMPs). The hemopexin domain exhibits the shape of an oblate ellipsoidal disk. The polypeptide chain is organised in four β-sheets (blades) I to IV, which are almost symmetrically arranged around a central axis in consecutive order, giving rise to the formation of a four-bladed propeller. Each propeller blade or repeat is made up of four antiparallel β-strands connected in a W-like strand topology, and is strongly twisted. This entry represents the repeats found in hemopexin and related domains.
Uridylate kinases (also known as UMP kinases) are key enzymes in the synthesis of nucleoside triphosphates. They catalyse the reversible transfer of the gamma-phosphoryl group from an ATP donor to UMP, yielding UDP, which is the starting point for the synthesis of all other pyrimidine nucleotides. The eukaryotic enzyme has a dual specificity, phosphorylating both UMP and CMP, while the bacterial enzyme is specific to UMP. The bacterial enzyme shows no sequence similarity to the eukaryotic enzyme or other nucleoside monophosphate kinases, but rather appears to be part of the amino acid kinase family. It is dependent on magnesium for activity and is activated by GTP and repressed by UTP. In many bacterial genomes, the gene tends to be located immediately downstream of elongation factor T and upstream of ribosome recycling factor. A related protein family, believed to be equivalent in function, is found in the archaea and in spirochetes. Structurally, the bacterial and archaeal proteins are homohexamers centred around a hollow nucleus and organised as a trimer of dimers. Each monomer within the protein forms the amino acid kinase fold and can be divided into an N-terminal region, which binds UMP and mediates intersubunit interactions within the dimer, and a C-terminal region, which binds ATP and contains a mobile loop covering the active site. Inhibition of enzyme activity by UTP appears to be due to competition for the binding site for UMP, not allosteric inhibition as was previously suspected. This entry represents the archaeal and spirochete proteins.
Deubiquitinating enzymes (DUBs) form a large family of cysteine proteases that can deconjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. All DUBs contain a catalytic domain surrounded by one or more subdomains, some of which contribute to target recognition. The ~120-residue DUSP (domain present in ubiquitin-specific proteases) domain is one of these specific subdomains. Single or tandem DUSP domains are located both N- and C-terminal to the ubiquitin carboxyl-terminal hydrolase catalytic core domain. The DUSP domain displays a tripod-like AB3 fold with a three-helix bundle and a three-stranded anti-parallel β-sheet resembling the legs and seat of the tripod. Conserved residues are predominantly involved in hydrophobic packing interactions within the three α-helices. The most conserved DUSP residues, forming the PGPI motif, are flanked by two long loops that vary both in length and sequence. The PGPI motif packs against the three-helix bundle and is highly ordered. The function of the DUSP domain is unknown, but it may play a role in protein/protein interaction or substrate recognition. This domain is associated with ubiquitin carboxyl-terminal hydrolase family 2 (MEROPS peptidase family C19). They are a family of 100 to 200 kDa peptides which includes the Ubp1 ubiquitin peptidase from yeast; others include:
Mammalian ubiquitin carboxyl-terminal hydrolase 4 (USP4), Mammalian ubiquitin carboxyl-terminal hydrolase 11 (USP11), Mammalian ubiquitin carboxyl-terminal hydrolase 15 (USP15), Mammalian ubiquitin carboxyl-terminal hydrolase 20 (USP20), Mammalian ubiquitin carboxyl-terminal hydrolase 32 (USP32), Vertebrate ubiquitin carboxyl-terminal hydrolase 33 (USP33), Vertebrate ubiquitin carboxyl-terminal hydrolase 48 (USP48).
Antigenic stimulation of T lymphocytes initiates a complex series of
intracellular signal transduction pathways that leads to the expression of a panel of immunoregulatory genes, whose function is critical to the
initiation and coordination of the immune response. The multi-subunit nuclear factor of activated T cells (NFAT) transcription factor family
plays a pivotal role in this process and is involved in the expression of a number of immunologically important genes. These include the cytokines IL-2,
IL-3, IL-4, IL-5, granulocyte-macrophage colony-stimulating factor, and tumour necrosis factor-alpha, as well as several cell-surface molecules,
such as CD40L and FasL. Although originally described in T cells, it is now apparent that NFAT proteins are also expressed in other immune system cells,
including B cells, mast cells, basophils and natural killer cells, as well as in a variety of non-immune cell types and tissues, such as skeletal
muscle, neurons, heart and adipocytes. However, although NFAT acts as a calcium-dependent transcription factor and serves to couple gene expression
to changes in intracellular calcium levels in most cases, NFAT target genes have not been identified in these latter cell types.
NFAT proteins appear to be regulated primarily at the level of their subcellular localisation. They are found exclusively in the cytoplasm of
resting T cells, and consist of 2 components: a pre-existing cytoplasmic
component that translocates into the nucleus on calcium mobilisation, and an inducible nuclear component comprising members of the activating protein-1
(AP-1) family of transcription factors. In response to antigen receptor signalling, the calcium-regulated phosphatase calcineurin acts directly to
dephosphorylate NFAT proteins, causing their rapid translocation from the cytoplasm to the nucleus, where they cooperatively bind their target
Deoxyribonuclease I (DNase I) is a vertebrate enzyme which catalyzes the endonucleolytic cleavage of double-stranded DNA to 5'-phosphodinucleotide and 5'-phosphooligonucleotide end-products. DNase I is an enzyme involved in DNA degradation; it is normally secreted outside of the cell but seems to be able to gain access to the nucleus, where it is involved in cell death by apoptosis. As shown in the following schematic representation, DNase I is a glycoprotein of about 260 residues with two conserved disulphide bonds.
                        +-+               +--------+
                        | |               |        |
xxxxxxxxxxxxxxxxx#xxxxxxCxCxxxxx#xxxxxxxxxCxxxxxxxxCxxxxxxxxxxxxx
'C': conserved cysteine involved in a disulphide bond.
'#': active site residue.
DNase I has a pH-optimum around 7.5 and requires calcium and magnesium for full activity. It causes single strand nicks in duplex DNA. A proton acceptor-donor chain composed of a histidine and a glutamic acid produces a nucleophilic hydroxyl ion from water, which cleaves the 3'-P-O bond. DNase I forms a 1:1 complex with G-actin, resulting in the inhibition of DNase activity and loss of the ability of G-actin to polymerise into fibres. DNase I has been used in the treatment of lung problems in patients with cystic fibrosis: here it acts by degrading DNA found in purulent lung secretions, reducing their viscosity and making it easier for the patient to breathe.
The sequence of DNase I is evolutionarily related to that of human muscle-specific DNase-like protein and human proteins DHP1 and DHP2. However, the first disulphide bond of DNase I is not conserved in these proteins. This entry represents the first conserved active site containing a histidine residue.
Slm1 is a component of the target of rapamycin complex 2 (TORC2) signaling pathway. It plays a role in the regulation of actin organization and is a target of sphingolipid signaling during the heat shock response. Slm1 contains a single PH domain that binds PtdIns(4,5)P2, PtdIns(4)P, and dihydrosphingosine 1-phosphate (DHS-1P). Slm1 possesses two binding sites for anionic lipids. The non-canonical binding site of the PH domain of Slm1 is used for ligand binding, and it is proposed that beta-spectrin, Tiam1 and ArhGAP9 also have this type of phosphoinositide binding site. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases and adaptors, as well as in cytoskeletal associated molecules and lipid associated enzymes.
Myristoylated alanine-rich C-kinase substrate (MARCKS) is a predominant cellular substrate for protein kinase C (PKC) that has been implicated in the regulation of brain development, macrophage activation, neuro-secretion and growth factor-dependent mitogenesis. The N-terminal glycine is the site of myristoylation, which allows effective binding of the protein to the plasma membrane, where it co-localises with PKC. MARCKS binds calmodulin in a calcium-dependent manner; the region responsible for calmodulin binding is a highly basic domain of about 25 amino acids known as the PSD or effector domain, which also contains the PKC phosphorylation sites and has been shown to contribute to membrane binding. When not phosphorylated, the effector domain can bind to filamentous actin. It is believed that MARCKS may be a regulated crossbridge between actin and the plasma membrane; modulation of the actin cross-linking activity by calmodulin and phosphorylation represents a potential convergence of the calcium-calmodulin and PKC signal transduction pathways in regulation of the actin cytoskeleton. MARCKS also contains an MH2 domain of unknown function. MARCKS-related protein (MRP) is similar to MARCKS in terms of properties such as its myristoylation, phosphorylation and calmodulin-binding, and shares a high degree of sequence similarity. The two regions that show the highest similarity are the kinase C phosphorylation site domain and the N-terminal region containing the myristoylation site. MARCKS and MRP amino acid compositions are similar, but the alanine content of the latter is lower. MARCKS proteins appear to adopt a natively unfolded conformation, i.e. as randomly folded chains arranged in non-classical extended conformations, in common with other substrates of PKC.
The synapsins are a family of neuron-specific phosphoproteins that coat synaptic vesicles and are involved in the binding between these vesicles and the cytoskeleton (including actin filaments). The family comprises 5 homologous proteins: Ia, Ib, IIa, IIb and III. Synapsins I, II, and III are encoded by 3 different genes. The a and b isoforms of synapsin I and II are splice variants of the primary transcripts. Synapsin I is mainly associated with regulation of neurotransmitter release from presynaptic neuron terminals. Synapsin II, as well as being involved in neurotransmitter release, has a role in the synaptogenesis and synaptic plasticity responsible for long term potentiation. Recent studies implicate synapsin III in a developmental role in neurite elongation and synapse formation that is distinct from the functions of synapsins I and II. Structurally, synapsins are multidomain proteins, of which 3 domains are common to all the mammalian forms. The N-terminal 'A' domain is ~30 residues long and contains a serine residue that serves as an acceptor site for protein kinase-mediated phosphorylation. This is followed by the 'B' linker domain, which is ~80 residues long and is relatively poorly conserved. Domain 'C' is the longest, spanning approximately 300 residues. This domain is highly conserved across all the synapsins (including those from Drosophila) and is possessed by all splice variants. The remaining six domains, D-I, are not shared by all the synapsins and differ both between the primary transcripts and the splice variants. This entry represents the ATP-grasp fold found in synapsins, which is responsible for calcium-dependent ATP binding.
The synapsins are a family of neuron-specific phosphoproteins that coat synaptic vesicles and are involved in the binding between these vesicles and the cytoskeleton (including actin filaments). The family comprises 5 homologous proteins: Ia, Ib, IIa, IIb and III. Synapsins I, II, and III are encoded by 3 different genes. The a and b isoforms of synapsin I and II are splice variants of the primary transcripts. Synapsin I is mainly associated with regulation of neurotransmitter release from presynaptic neuron terminals. Synapsin II, as well as being involved in neurotransmitter release, has a role in the synaptogenesis and synaptic plasticity responsible for long term potentiation. Recent studies implicate synapsin III in a developmental role in neurite elongation and synapse formation that is distinct from the functions of synapsins I and II. Structurally, synapsins are multidomain proteins, of which 3 domains are common to all the mammalian forms. The N-terminal 'A' domain is ~30 residues long and contains a serine residue that serves as an acceptor site for protein kinase-mediated phosphorylation. This is followed by the 'B' linker domain, which is ~80 residues long and is relatively poorly conserved. Domain 'C' is the longest, spanning approximately 300 residues. This domain is highly conserved across all the synapsins (including those from Drosophila) and is possessed by all splice variants. The remaining six domains, D-I, are not shared by all the synapsins and differ both between the primary transcripts and the splice variants. This entry represents a highly conserved stretch of 11 residues located in the centre of the 'C' domain.
The synapsins are a family of neuron-specific phosphoproteins that coat synaptic vesicles and are involved in the binding between these vesicles and the cytoskeleton (including actin filaments). The family comprises 5 homologous proteins: Ia, Ib, IIa, IIb and III. Synapsins I, II, and III are encoded by 3 different genes. The a and b isoforms of synapsin I and II are splice variants of the primary transcripts. Synapsin I is mainly associated with regulation of neurotransmitter release from presynaptic neuron terminals. Synapsin II, as well as being involved in neurotransmitter release, has a role in the synaptogenesis and synaptic plasticity responsible for long term potentiation. Recent studies implicate synapsin III in a developmental role in neurite elongation and synapse formation that is distinct from the functions of synapsins I and II. Structurally, synapsins are multidomain proteins, of which 3 domains are common to all the mammalian forms. The N-terminal 'A' domain is ~30 residues long and contains a serine residue that serves as an acceptor site for protein kinase-mediated phosphorylation. This is followed by the 'B' linker domain, which is ~80 residues long and is relatively poorly conserved. Domain 'C' is the longest, spanning approximately 300 residues. This domain is highly conserved across all the synapsins (including those from Drosophila) and is possessed by all splice variants. The remaining six domains, D-I, are not shared by all the synapsins and differ both between the primary transcripts and the splice variants. This entry represents a conserved octapeptide in the immediate N-terminal domain, which contains the phosphorylated serine residue.
This entry represents E3 ubiquitin-protein ligase RNF8, which may be required for proper exit from mitosis after spindle checkpoint activation and may regulate cytokinesis. This enzyme promotes the formation of 'Lys-63'-linked polyubiquitin chains and functions with the specific ubiquitin-conjugating UBC13-MMS2 (UBE2N-UBE2V2) heterodimer. Substrates that are poly-ubiquitinated at 'Lys-63' are usually not targeted for degradation. RNF8 acts following DNA double-strand break (DSB) formation, and is recruited to the sites of damage by ATM-phosphorylated MDC1, where it promotes the formation of TP53BP1 and BRCA1 ionizing radiation-induced foci (IRIF). It may play a role in the regulation of RXRA-mediated transcriptional activity, but is not involved in RXRA ubiquitination by UBE2E2. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.
In the mitochondrion of eukaryotes and in aerobic prokaryotes, cytochrome b is a component of respiratory chain complex III, also known as the bc1 complex or ubiquinol-cytochrome c reductase. In plant chloroplasts and cyanobacteria, there is an analogous protein, cytochrome b6, a component of the plastoquinone-plastocyanin reductase, also known as the b6f complex. Both of these complexes are involved in electron transport and the generation of ATP and are therefore vitally important to the cell.
Cytochrome b/b6 is an integral membrane protein of approximately 400 amino acid residues that probably has 8 transmembrane segments. In plants and cyanobacteria, cytochrome b6 consists of two subunits encoded by the petB and petD genes. The sequence of petB is colinear with the N-terminal part of mitochondrial cytochrome b, while petD corresponds to the C-terminal part. Cytochrome b/b6 non-covalently binds two haem groups, known as b562 and b566. Four conserved histidine residues are postulated to be the ligands of the iron atoms of these two haem groups.
Apart from regions around some of the histidine haem ligands, there are a few conserved regions in the sequence of b/b6. The best conserved of these regions includes an invariant P-E-W triplet which lies in the loop that separates the fifth and sixth transmembrane segments. It seems to be important for electron transfer at the ubiquinone redox site, called Qz or Qo (where o stands for outside), located on the outer side of the membrane. This entry is the N terminus of these proteins. Proteins in this entry belong to the PetB family of cytochrome b6.
Signal recognition particle, SRP54 subunit, eukaryotic
Type: Family
Description:
The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54. This entry represents the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain, which binds both the 7S RNA and the signal sequence. The extreme C-terminal region is glycine-rich, low in complexity and poorly conserved between species.
Phosphoglucose isomerase (PGI) is a dimeric enzyme that catalyses the reversible isomerization of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein, PGI, is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine. PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability. This superfamily represents the C-terminal domain of phosphoglucose isomerase. It is alpha helical and not found in archaeal proteins.
Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulfhydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes. Ambler recognised four classes of cytC. Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cyt C and prokaryotic 'short' cyt C2 exemplified by Rhodopila globiformis cyt C2; Class IA includes 'long' cyt C2, such as Rhodospirillum rubrum cyt C2 and Aquaspirillum itersonii cyt C-550, which have several extra loops by comparison with Class IB cyt C. Class I cytC has a characterised fold which comprises 5 α-helices arranged in a unique tertiary structure and a conserved N-terminal sequence -Cys-Xxx-Xxx-Cys-His-, where the cysteines mediate the covalent cross-linking of the haem to the protein and the His provides the fifth haem iron ligand. The 3D structures of a considerable number of class IA and IB cytC have been determined. The proteins consist of 3-6 α-helices; the three most conserved 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent. Most class I cytC have conserved aromatic residues clustered around the haem and axial ligands.
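The -Cys-Xxx-Xxx-Cys-His- signature described above can be located in a one-letter protein sequence with a simple pattern search. The following is a minimal sketch only: the function name `find_haem_sites` and the toy sequence are hypothetical, and a match is merely a candidate haem-attachment site, not evidence that the protein is a class I cytochrome c.

```python
import re

# Cys-Xxx-Xxx-Cys-His in one-letter code; the lookahead lets
# overlapping occurrences be reported as well.
CXXCH = re.compile(r"(?=(C..CH))")


def find_haem_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) per CXXCH hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXCH.finditer(seq)]


# Made-up toy sequence used purely for illustration.
toy_sequence = "AAKCGGCHLLMAA"
print(find_haem_sites(toy_sequence))
```

For a real analysis a curated signature or profile would normally be preferred over a bare regular expression, since short motifs of this kind can occur by chance.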
Succinate:quinone oxidoreductase refers collectively to succinate:quinone reductase (SQR, or Complex II) and quinol:fumarate reductase (QFR). SQR is found in aerobic organisms, and catalyses the oxidation of succinate to fumarate in the citric acid cycle and donates the electrons to quinone in the membrane. QFR can be found in anaerobic cells respiring with fumarate as terminal electron acceptor. SQR and QFR are very similar in composition and structure, despite catalysing opposite reactions in vivo. They are thought to have evolved from a common ancestor, and in Escherichia coli they are capable of functionally replacing each other. Succinate:quinone oxidoreductases consist of a peripheral domain, exposed to the cytoplasm in bacteria and to the matrix in mitochondria, and a membrane-integral anchor domain that spans the membrane. The peripheral part, which contains the dicarboxylate binding site, is composed of a flavoprotein subunit, with one covalently bound FAD, and an iron-sulphur protein subunit containing three iron-sulphur clusters. The membrane-integral domain functions to anchor the peripheral domain to the membrane and is required for quinone reduction and oxidation. The anchor domain shows the largest variability in composition and primary sequence, being composed either of one large subunit, or two smaller subunits, which may, or may not, contain protoheme groups. The flavoprotein subunit found in both the SQR and QFR enzymes contains an N-terminal domain which binds the FAD cofactor, a central catalytic domain with an unusual fold, and a C-terminal domain whose role is unclear. The dicarboxylate binding site is located between the FAD and catalytic domains. This superfamily represents the catalytic domain of the flavoprotein subunit.
This group of cysteine peptidases belongs to MEROPS peptidase family C1, sub-family C1A (papain family, clan CA). It includes related cysteine proteinases such as actinidin. This entry also includes proteins classed as non-peptidase homologues such as the catalytically inactive tubulointerstitial nephritis antigen (TIN-Ag). These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues. The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides are their ability to inhibit the activity of their cognate enzymes and the high selectivity that certain propeptides exhibit for inhibition of the peptidases from which they originate.
Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and in cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N terminus and a large cytoplasmic region at the C terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved, and are therefore thought to be the regions involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport. This entry represents bacterial Na+/H+ exchanger proteins such as YjcE from Escherichia coli.
Viruses in the order Picornavirales infect different vertebrate, invertebrate, and plant hosts and are responsible for a variety of human, animal, and plant diseases. These viruses have a single-stranded, positive sense RNA genome that generally translates a large precursor polyprotein which is proteolytically cleaved after translation to generate mature functional viral proteins. This process is usually mediated by more than one protease, and a 3C (for the family Picornaviridae) or 3C-like (3CL) protease (for other families) plays a central role in the cleavage of the viral precursor polyprotein. In addition to this key role, 3C/3C-like protease is able to cleave a number of host proteins to remodel the cellular environment for virus reproduction. The Picornavirales 3C/3C-like protease domain forms the MEROPS peptidase family C3 (picornain family) of clan PA. The 3C/3CL protease domain adopts a chymotrypsin-like fold with a cysteine nucleophile in place of the commonly found serine, which suggests that the cysteine and serine perform an analogous catalytic function. The catalytic triad is made of a histidine, an aspartate/glutamate and the conserved cysteine in this sequential order. The 3C/3CL protease domain folds into two antiparallel beta barrels that are linked by a loop with a short α-helix in its middle, and flanked by two other α-helices at the N and C termini. The two barrels are topologically equivalent and are formed by six antiparallel beta strands with the first four organised into a Greek key motif. The active-site residues are located in the cleft between the two barrels with the nucleophilic Cys from the C-terminal barrel and the general acid-base His-Glu/Asp from the N-terminal barrel.
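As a rough illustration of the residue order stated above (His, then Asp/Glu, then Cys along the chain), the sketch below checks a proposed triad against a one-letter sequence. The helper name `triad_in_expected_order`, the toy sequence and the positions are all hypothetical; the check only confirms residue identity and sequential order, and says nothing about three-dimensional geometry or activity.

```python
def triad_in_expected_order(sequence: str, his: int, acid: int, cys: int) -> bool:
    """Check 1-based positions for a His, Asp/Glu, Cys triad in that order."""
    seq = sequence.upper()
    # Positions must lie within the sequence and increase along the chain.
    if not (1 <= his < acid < cys <= len(seq)):
        return False
    return (
        seq[his - 1] == "H"
        and seq[acid - 1] in ("D", "E")
        and seq[cys - 1] == "C"
    )


# Made-up toy sequence: H at position 3, E at 6, C at 9.
toy_sequence = "AAHAAEAACAA"
print(triad_in_expected_order(toy_sequence, his=3, acid=6, cys=9))
```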
This entry represents the N-terminal domain of the aspartic peptidases. Aspartic peptidases, also known as aspartyl proteases (EC 3.4.23.-), are a widely distributed family of proteolytic enzymes known to exist in vertebrates, fungi, plants, retroviruses and some plant viruses. Aspartate proteases of eukaryotes are monomeric enzymes which consist of two domains. Each domain contains an active site centred on a catalytic aspartyl residue. The two domains most probably evolved from the duplication of an ancestral gene encoding a primordial domain. Currently known eukaryotic aspartyl proteases are:
Vertebrate gastric pepsins A and C (also known as gastricsin).
Vertebrate chymosin (rennin), involved in digestion and used for making cheese.
Vertebrate lysosomal cathepsins D (EC 3.4.23.5) and E (EC 3.4.23.34).
Mammalian renin (EC 3.4.23.15) whose function is to generate angiotensin I from angiotensinogen in the plasma.
Fungal proteases such as aspergillopepsin A (EC 3.4.23.18), candidapepsin (EC 3.4.23.24), mucoropepsin (EC 3.4.23.23) (mucor rennin), endothiapepsin (EC 3.4.23.22), polyporopepsin (EC 3.4.23.29), and rhizopuspepsin (EC 3.4.23.21).
Yeast saccharopepsin (EC 3.4.23.25) (proteinase A) (gene PEP4). PEP4 is implicated in posttranslational regulation of vacuolar hydrolases.
Yeast barrierpepsin (EC 3.4.23.35) (gene BAR1); a protease that cleaves alpha-factor and thus acts as an antagonist of the mating pheromone.
Fission yeast sxa1 which is involved in degrading or processing the mating pheromones.
Most retroviruses and some plant viruses, such as badnaviruses, encode an aspartyl protease which is a homodimer of a chain of about 95 to 125 amino acids. In most retroviruses, the protease is encoded as a segment of a polyprotein which is cleaved during the maturation process of the virus. It is generally part of the pol polyprotein and, more rarely, of the gag polyprotein.
Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic core, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus have been studied most carefully with respect to the structural basis of catalysis. Although the active site of avian virus integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis. Retroviral integrase is synthesised as part of the POL polyprotein that contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The presence of retrovirus integrase-related gene sequences in eukaryotes is known. Bacterial transposases involved in the transposition of the insertion sequence also belong to this group. HIV-1 integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity.
RELMs, secreted proteins with roles including insulin resistance and the activation of inflammatory processes, are also known as found in inflammatory zone (FIZZ) proteins, and include four members in mouse (RELM-alpha/FIZZ1/HIMF, RELM-beta/FIZZ2, Resistin/FIZZ3, and RELM-gamma/FIZZ4) and two members in human (resistin and RELM-beta). RELMs are potentially implicated in a wide range of physiological and pathological processes including obesity-associated diabetes, cardiovascular system function, cancer development and metastasis. There are significant differences between human and rodent RELMs with respect to gene and protein structure, differential gene regulation, different tissue distribution profiles, and insulin resistance induction. Resistin appears to convey insulin resistance in rodents, and to instigate inflammatory processes in humans. In the pathophysiology of obesity-associated diabetes, mouse resistin is secreted by adipocytes and increases hepatic gluconeogenesis, thereby promoting insulin resistance; human resistin is secreted by macrophages and may play a role through inflammatory contributions. Elevated levels of human resistin have been reported in various cancers including colorectal, endometrial, and postmenopausal breast cancers, and may initiate the production of further inflammatory cytokines, to promote tumor cell progression. Resistin has also been shown to cause G1 arrest in colon cancer cells. However, resistin may interfere with chemotherapy. Resistin contains an N-terminal signal sequence, a variable middle section, and a conserved C-terminal domain. The C-terminal domain comprises a cysteine signature motif sequence shared by all RELM family members, which is proposed to be critical for disulfide bond formation and protein folding. Resistin circulates as hexamers and trimers; structural similarity has been noted between the resistin homotrimer and the proprotein convertase subtilisin/kexin type 9, C-terminal cysteine-rich domain.
This entry represents the first LIM domain of Lmx1a. Lmx1a belongs to the LHX protein family, which features two tandem N-terminal LIM domains and a C-terminal DNA binding homeodomain. Members of the LHX family are found in the nucleus and act as transcription factors or cofactors. LHX proteins are critical for the development of specialized cells in multiple tissue types, including the nervous system, skeletal muscle, the heart, the kidneys, and endocrine organs, such as the pituitary gland and the pancreas. Mouse Lmx1a is expressed in multiple tissues, including the roof plate of the neural tube, the developing brain, the otic vesicles, the notochord, and the pancreas. Human Lmx1a can be found in pancreas, skeletal muscle, adipose tissue, developing brain, mammary glands, and pituitary. The functions of Lmx1a in the developing nervous system were revealed by studies of mutant mice. In mouse, mutations in Lmx1a result in failure of the roof plate to develop. Lmx1a may act upstream of other roof plate markers such as MafB, Gdf7, Bmp6, and Bmp7. Further characterization of these mice reveals numerous defects including disorganized cerebellum, hippocampus, and cortex; altered pigmentation; female sterility; skeletal defects; and behavioural abnormalities. Within pancreatic cells, the Lmx1a protein interacts synergistically with the bHLH transcription factor E47 to activate the insulin gene enhancer/promoter. As in other LIM domains, this domain family is 50-60 amino acids in size and shares two characteristic zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes.
Methyl-accepting chemotaxis proteins (MCPs) are a family of bacterial receptors that mediate chemotaxis to diverse signals, responding to changes in the concentration of attractants and repellents in the environment by altering swimming behaviour. Environmental diversity gives rise to diversity in bacterial signalling receptors, and consequently there are many genes encoding MCPs. For example, there are four well-characterised MCPs found in Escherichia coli: Tar (taxis towards aspartate and maltose, away from nickel and cobalt), Tsr (taxis towards serine, away from leucine, indole and weak acids), Trg (taxis towards galactose and ribose) and Tap (taxis towards dipeptides). MCPs share similar topology and signalling mechanisms. MCPs either bind ligands directly or interact with ligand-binding proteins, transducing the signal to downstream signalling proteins in the cytoplasm. MCPs undergo two covalent modifications: deamidation and reversible methylation at a number of glutamate residues. Attractants increase the level of methylation, while repellents decrease it. The methyl groups are added by the methyl-transferase cheR and are removed by the methylesterase cheB. Most MCPs are homodimers that contain the following organisation: an N-terminal signal sequence that acts as a transmembrane domain in the mature protein; a poorly-conserved periplasmic receptor (ligand-binding) domain; a second transmembrane domain; and a highly-conserved C-terminal cytoplasmic domain that interacts with downstream signalling components. The C-terminal domain contains the methylated glutamate residues. This entry represents the ligand-binding domain found in a number of methyl-accepting chemotaxis receptors, such as E. coli Tar (taxis to aspartate and repellents), which is a receptor for the attractant L-aspartate. It is a homodimeric receptor that contains an N-terminal periplasmic ligand binding domain, a transmembrane region, a HAMP domain and a C-terminal cytosolic signaling domain.
Interleukin-4 receptor is a type I transmembrane protein that can bind interleukin 4 and interleukin 13 to regulate IgE antibody production in B cells. Among T cells, the encoded protein also can bind interleukin 4 to promote differentiation of Th2 cells. A soluble form of the encoded protein can be produced by an alternate splice variant or by proteolysis of the membrane-bound protein, and this soluble form can inhibit IL4-mediated cell proliferation and IL5 upregulation by T-cells. Allelic variations in this gene have been associated with atopy, a condition that can manifest itself as allergic rhinitis, sinusitis, asthma, or eczema. The binding of IL-4 or IL-13 to the IL-4 receptor on the surface of macrophages results in the alternative activation of those macrophages. Alternatively activated macrophages downregulate inflammatory mediators during immune responses, particularly with regard to helminth infections. This entry represents the N-terminal (extracellular) portion of interleukin-4 receptor alpha. It is related in overall topology to fibronectin type III modules and folds into a sandwich comprising seven antiparallel β-strands arranged in a three-strand and a four-strand β-pleated sheet. This region is required for binding of interleukin-4 to the receptor alpha chain, which is a crucial event for the generation of a Th2-dominated early immune response [
].
Integrase comprises three domains capable of folding independently and whose three-dimensional structures are known. However, the manner in which the N-terminal, catalytic core, and C-terminal domains interact in the holoenzyme remains obscure. Numerous studies indicate that the enzyme functions as a multimer, minimally a dimer. The integrase proteins from Human immunodeficiency virus 1 (HIV-1) and Avian sarcoma virus have been studied most carefully with respect to the structural basis of catalysis. Although the active site of avian virus integrase does not undergo significant conformational changes on binding the required metal cofactor, that of HIV-1 does. This active site-mediated conformational change in HIV-1 reorganises the catalytic core and C-terminal domains and appears to promote an interaction that is favourable for catalysis [
]. Retroviral integrase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. Retrovirus integrase-related gene sequences are also known in eukaryotes. Bacterial transposases involved in the transposition of insertion sequences also belong to this group. HIV-1 integrase catalyses the incorporation of virally derived DNA into the human genome. This unique step in the virus life cycle provides a variety of points for intervention and hence is an attractive target for the development of new therapeutics for the treatment of AIDS [
]. Substrate recognition by the retroviral integrase enzyme is critical for retroviral integration. To catalyse this recombination event, integrase must recognise and act on two types of substrates, viral DNA and host DNA, yet the necessary interactions exhibit markedly different degrees of specificity [].
Cadherins are a group of transmembrane proteins that serve as the major adhesion molecules located within adherens junctions. They can regulate cell-cell adhesion through their extracellular domain and their cytosolic domains connect to the actin cytoskeleton by binding to catenins [
]. These proteins preferentially interact with themselves in a homophilic manner in connecting cells, thus acting as both receptor and ligand. They may play an important role in the sorting of different cell types during morphogenesis, histogenesis and regeneration. They may also be involved in the regulation of tight and gap junctions, and in the control of intercellular spacing. Cadherins are evolutionarily related to the desmogleins, which are components of intercellular desmosome junctions involved in the interaction of plaque proteins. Structurally, cadherins comprise a number of domains: classically, these include a signal sequence; a propeptide of around 130 residues; a single transmembrane domain; five tandemly repeated extracellular cadherin domains, 4 of which are cadherin repeats and the fifth of which contains 4 conserved cysteines; and a C-terminal cytoplasmic domain [
]. However, proteins are designated as members of the broadly defined cadherin family if they have one or more cadherin repeats. A cadherin repeat is an independently folding sequence of approximately 110 amino acids that contains motifs with the conserved sequences DRE, DXNDNAPXF, and DXD. Crystal structures have revealed that multiple cadherin domains form Ca2+-dependent rod-like structures with a conserved Ca2+-binding pocket at the domain-domain interface. Cadherins depend on calcium for their function: calcium ions bind to specific residues in each cadherin repeat to ensure its proper folding and to confer rigidity upon the extracellular domain, and are essential for cadherin adhesive function and for protection against protease digestion.
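The three conserved cadherin-repeat motifs named above (DRE, DXNDNAPXF and DXD, with X standing for any residue) can be located in a sequence with a simple scan. The following is a minimal sketch, not a validated cadherin-repeat detector: the function name `scan_cadherin_motifs` and the example sequence are hypothetical, and short motifs such as DXD will also occur by chance in unrelated proteins.

```python
import re

# Motifs as written in the text; X (any residue) becomes '.' in the regex.
CADHERIN_MOTIFS = {
    "DRE": "DRE",
    "DXNDNAPXF": "D.NDNAP.F",
    "DXD": "D.D",
}

def scan_cadherin_motifs(sequence):
    """Return {motif name: [0-based start positions]} for a protein sequence.

    A zero-width lookahead is used so that overlapping occurrences
    (for example D.D inside D.D.D) are all reported.
    """
    hits = {}
    for name, regex in CADHERIN_MOTIFS.items():
        pattern = re.compile("(?=(" + regex + "))")
        hits[name] = [m.start() for m in pattern.finditer(sequence)]
    return hits

# Synthetic sequence made up for illustration only.
example = "AADREAADVNDNAPVFAADGD"
result = scan_cadherin_motifs(example)
```

A hit list for all three motifs is only suggestive; assigning a protein to the cadherin family would normally rely on a profile-based domain match rather than on these short patterns alone.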
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This entry represents the low molecular weight transmembrane protein PsbI, which is tightly associated with the D1/D2 heterodimer in PSII. The function of PsbI is unknown, but it may be involved in the assembly, dimerisation or stabilisation of PSII dimers [
].
This entry represents the SKI/SnoN family of proteins, which are the products of the oncogenic sno gene. This gene was identified based on its homology to
v-ski, the transforming component of the Sloan-Kettering virus. Both Ski and SnoN are potent negative regulators of TGF-beta [
]. Overexpression of Ski or SnoN results in oncogenic transformation of avian fibroblasts; however, it may also result in terminal differentiation, and therefore the Ski/SnoN mechanism of action is thought to be complex []. These proteins do not have catalytic or DNA-binding activity and therefore function primarily through interaction with other proteins, acting as transcriptional cofactors. Despite their lack of DNA-binding ability, their primary function is related to transcriptional regulation, in particular the negative regulation of TGF-beta signalling [
,
]. Ski/SnoN interact concurrently with co-Smad and R-Smad and in doing so block the ability of the Smad complexes to activate transcription of the TGF-beta target genes []. Binding of Ski/SnoN may additionally stabilise the Smad heteromer on DNA, thereby preventing further binding of active Smad complexes []. As Smad complexes critically mediate the inhibitory signals of TGF-beta in epithelial cells, high levels of SKI/SnoN may promote cell proliferation. They repress gene transcription by recruiting diverse corepressors and histone deacetylases, and establish cross-regulatory mechanisms with the TGF-beta/Smad pathway that control the magnitude and duration of TGF-beta signals. Alteration of these regulatory processes may lead to disease development []. High levels of SnoN have been shown to stabilise p53 with a resultant increase in premature senescence. SnoN interacts with the PML protein and is then recruited to the PML nuclear bodies, resulting in stabilisation of p53 and premature senescence [].
The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function []. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition []. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a β-β-α-β-β topology, where the single α-helix is tilted away from the twisted, anti-parallel β-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site [
]. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important []. This entry also matches Arabidopsis histone methyltransferases ATXR3/SDG2 and ATXR7/SDG25, which contain two partial GYF domains towards the N terminus [
]. Histone methyltransferase ATXR7 is involved in regulation of flowering time []. It is specifically required for the trimethylation of 'Lys-4' of histone H3 (H3K4me3) at the FLC locus, and it prevents the trimethylation of 'Lys-27' (H3K27me3) at the same locus. ATXR3 is also required for H3K4 trimethylation and is crucial for both sporophyte and gametophyte development in plants [,
].
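The conserved GYF motif quoted above, GP[YF]xxxx[MV]xxWxxx[GN]YF, is written in a compact notation in which each x stands for one arbitrary residue. As a rough sketch, that notation can be expanded into a regular expression by turning every run of x into a fixed-length wildcard. The helper names `motif_to_regex` and `has_gyf_motif` and the test sequences are hypothetical and purely illustrative; a regex match does not establish that a protein contains a folded GYF domain.

```python
import re

# Motif exactly as given in the entry text.
GYF_MOTIF = "GP[YF]xxxx[MV]xxWxxx[GN]YF"

def motif_to_regex(motif):
    """Expand each run of lowercase 'x' into a '.{n}' wildcard of equal length."""
    return re.sub(r"x+", lambda run: ".{%d}" % len(run.group()), motif)

GYF_REGEX = re.compile(motif_to_regex(GYF_MOTIF))

def has_gyf_motif(sequence):
    """Return True if the sequence contains at least one match to the motif."""
    return GYF_REGEX.search(sequence) is not None
```

For example, `has_gyf_motif` returns True for the synthetic string `"MMGPYAAAAMAAWAAAGYFKK"`, which was constructed to fit the motif, and False once the terminal phenylalanine is replaced.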
TssC (also known as VipB) is a family of Gram-negative type VI secretion system components of the tail sheath. They have been known as COG3517. These sheath components, of which there are many copies in the sheath, are also variously referred to as TssC. On contact with another bacterial cell the sheath contracts and pushes the puncturing device and tube through the cell envelope and punches the target bacterial cell []. VipA and VipB (TssB and TssC) proteins were shown to form a cogwheel-like tubular structure in V. cholerae that was noticed to resemble T4 phage gp18 polysheath. Two β-strands of VipA and four β-strands of VipB intertwine, forming the middle layer of the sheath. The sheath assembles around an inner Hcp tube and is attached to a structure called a baseplate that spans the bacterial membranes. Importantly, the VipA/VipB sheath was shown to form a long contractile organelle in V. cholerae and in E. coli, suggesting that sheath contraction powers the secretion []. This entry includes TssC mostly from Bacteroidetes. The type VI secretion system (T6SS) is a supra-molecular bacterial complex that resembles phage tails. It is a toxin delivery system which fires toxins into target cells upon contraction of its TssBC sheath [
]. Thirteen essential core proteins are conserved in all T6SSs: the membrane associated complex TssJ-TssL-TssM, the baseplate proteins TssE, TssF, TssG, and TssK, the bacteriophage-related puncturing complex composed of the tube (Hcp), the tip/puncturing device VgrG, and the contractile sheath structure (TssB and TssC). Finally, the starfish-shaped dodecameric protein, TssA, limits contractile sheath polymerization at its distal part when TagA captures TssA [].
Deoxyribonuclease I (DNase I) [
] is a vertebrate enzyme which catalyses the endonucleolytic cleavage of double-stranded DNA to 5'-phosphodinucleotide and 5'-phosphooligonucleotide end-products. DNase I is an enzyme involved in DNA degradation; it is normally secreted outside of the cell but seems to be able to gain access to the nucleus where it is involved in cell death by apoptosis []. As shown in the following schematic representation, DNase I is a glycoprotein of about 260 residues with two conserved disulphide bonds.
                        +-+               +--------+
                        | |               |        |
xxxxxxxxxxxxxxxxx#xxxxxxCxCxxxxx#xxxxxxxxxCxxxxxxxxCxxxxxxxxxxxxx
'C': conserved cysteine involved in a disulphide bond.
'#': active site residue.
DNase I has a pH optimum around 7.5 and requires calcium and magnesium for full activity. It causes single-strand nicks in duplex DNA. A proton acceptor-donor chain composed of a histidine and a glutamic acid produces a nucleophilic hydroxyl ion from water, which cleaves the 3'-P-O bond [
]. DNase I forms a 1:1 complex with G-actin, resulting in the inhibition of DNase activity and loss of the ability of G-actin to polymerise into fibres []. DNase I has been used in the treatment of lung problems in patients with cystic fibrosis: here it acts by degrading DNA found in purulent lung secretions, reducing their viscosity and making it easier for the patient to breathe [
]. The sequence of DNase I is evolutionarily related to that of human muscle-specific DNase-like protein and human proteins DHP1 and DHP2. However, the first disulphide bond of DNase I is not conserved in these proteins. This entry represents the DNase I conserved site that is involved in disulphide bond formation. It has the consensus pattern G-D-F-N-A-x-C-[SAK].
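The consensus pattern G-D-F-N-A-x-C-[SAK] uses a PROSITE-style notation: elements are separated by hyphens, x matches any residue, and square brackets list the residues allowed at one position. The sketch below translates only that subset of the notation into a Python regular expression; repeat counts such as x(2), exclusions in braces and terminal anchors are deliberately not handled. The function names `prosite_to_regex` and `find_sites` are hypothetical, and the sequences in the usage note are synthetic.

```python
import re

def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regex string.

    Supported elements: single-letter residues, 'x' (any residue) and
    '[...]' residue sets, joined by '-'.
    """
    parts = []
    for element in pattern.split("-"):
        if element == "x":
            parts.append(".")       # any residue
        else:
            parts.append(element)   # literal residue or [..] set
    return "".join(parts)

# Consensus pattern quoted in the entry text.
DNASE1_SITE = re.compile(prosite_to_regex("G-D-F-N-A-x-C-[SAK]"))

def find_sites(sequence):
    """Return the 0-based start position of every match in the sequence."""
    return [m.start() for m in DNASE1_SITE.finditer(sequence)]
```

With this sketch, `find_sites("MKGDFNAGCSLL")` reports a single site starting at position 2, while a sequence ending in `...GDFNAGCT` gives no hit because T is not in the allowed set [SAK].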
This entry represents the DZF domain, which is found exclusively in the metazoa.
The DZF domain (domain associated with zinc fingers) is a dimerisation domain found in [,
,
]:
Vertebrate nuclear factor 90 (NF90, also known as ILF3, DRBP76 or NFAR-1), which contains two double-stranded RNA-binding motifs (dsRBMs) and interacts with highly structured RNAs as well as the dsRNA-activated protein kinase (PKR).
Metazoan NF45 (also known as ILF2), which appears to function predominantly as a heterodimeric complex with NF90.
Vertebrate spermatid perinuclear RNA-binding protein (SPNR, also known as STRBP), a testes-specific paralogue of NF90.
Metazoan zinc-finger protein associated with RNA (Zfr).
Nuclear factors NF90 and NF45 form a protein complex involved in a variety of cellular processes and are thought to affect gene expression at both the transcriptional and translational level. In addition, this complex affects the replication of several viruses through direct interactions with viral RNA.
NF90 and NF45 dimerise through their common DZF domain. The DZF domain shows structural similarity to the template-free nucleotidyltransferase family of RNA modifying enzymes. However, the lack of conserved catalytic residues suggests that the DZF domain encodes a 'pseudotransferase' that is no longer able to catalyse transfer of nucleotides.
The DZF dimerisation domain forms an oblong structure with a flat face on one side and a curved face on the other. The DZF domain is bipartite and characterised by an N-terminal mixed α-β region that contains a central anti-parallel β-sheet and a C-terminal α-helical region. The overall structure has a pseudo two-fold rotational symmetry. The central β-sheet forms the base of a cleft between the N- and C-terminal halves, while dimerisation is mediated by the α-helices at the C terminus [
].
Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers [
,
]. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia [,
,
,
]. Human NHE is also involved in heart disease, cell growth and in cell differentiation []. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9) [,
]. These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N terminus and a large cytoplasmic region at the C terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved and are therefore thought to be involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport []. This entry represents a conserved region found in Na+/H+ exchanger protein isoforms 3 and 5.
Chemotaxis methyl-accepting receptor, methyl-accepting site
Type:
PTM
Description:
Methyl-accepting chemotaxis proteins (MCPs) are a family of bacterial receptors that mediate chemotaxis to diverse signals, responding to changes in the concentration of attractants and repellents in the environment by altering swimming behaviour [
]. Environmental diversity gives rise to diversity in bacterial signalling receptors, and consequently there are many genes encoding MCPs []. For example, there are four well-characterised MCPs found in Escherichia coli: Tar (taxis towards aspartate and maltose, away from nickel and cobalt), Tsr (taxis towards serine, away from leucine, indole and weak acids), Trg (taxis towards galactose and ribose) and Tap (taxis towards dipeptides). MCPs share similar topology and signalling mechanisms. MCPs either bind ligands directly or interact with ligand-binding proteins, transducing the signal to downstream signalling proteins in the cytoplasm. MCPs undergo two covalent modifications: deamidation and reversible methylation at a number of glutamate residues. Attractants increase the level of methylation, while repellents decrease it. The methyl groups are added by the methyltransferase CheR and are removed by the methylesterase CheB. Most MCPs are homodimers that contain the following organisation: an N-terminal signal sequence that acts as a transmembrane domain in the mature protein; a poorly-conserved periplasmic receptor (ligand-binding) domain; a second transmembrane domain; and a highly-conserved C-terminal cytoplasmic domain that interacts with downstream signalling components. The C-terminal domain contains the methylated glutamate residues. The methyl-accepting sites are specific glutamate residues (some of these sites are translated as glutamine but are irreversibly deamidated by CheB). They are clustered in two regions of the cytoplasmic domain that interacts with downstream signalling components. This entry represents the first of these two methyl-accepting regions.
MTMR1 (myotubularin-related protein 1) is a lipid phosphatase that uses phosphatidylinositol 3-phosphate (PtdIns3P) and phosphatidylinositol 3,5-bisphosphate [PtdIns(3,5)P2] as substrates []. MTMR1 is abnormally expressed in myotonic dystrophy type 1 (DM1) and in myotonic dystrophy type 2 (DM2), in correlation with muscle pathological features []. The myotubularin family constitutes a large group of conserved proteins, with 14 members in humans consisting of myotubularin (MTM1) and 13 myotubularin-related proteins (MTMR1-MTMR13). Orthologues have been found throughout the eukaryotic kingdom, but not in bacteria. MTM1 dephosphorylates phosphatidylinositol 3-monophosphate (PI3P) to phosphatidylinositol and phosphatidylinositol 3,5-bisphosphate [PI(3,5)P2] to phosphatidylinositol 5-monophosphate (PI5P) [,
]. The substrate phosphoinositides (PIs) are known to regulate traffic within the endosomal-lysosomal pathway []. MTMR1, MTMR2, MTMR3, MTMR4, and MTMR6 have also been shown to utilise PI(3)P as a substrate, suggesting that this activity is intrinsic to all active family members. On the other hand, six of the MTM family members are catalytically inactive phosphatases. Inactive myotubularin phosphatases contain substitutions in the Cys and Arg residues of the Cys-X5-Arg motif. MTM pseudophosphatases have been found to interact with MTM catalytic phosphatases []. The myotubularin family includes several members mutated in neuromuscular diseases or associated with metabolic syndrome, obesity, and cancer []. Most MTMRs contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID), a PTP domain (which may be active or inactive), a SET-interaction domain, and a C-terminal coiled-coil region. In addition, some members contain a DENN domain N-terminal to the PH-GRAM domain and FYVE, PDZ, and PH domains C-terminal to the coiled-coil region [
]. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold [].
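The distinction drawn above between active myotubularins and pseudophosphatases rests on the Cys-X5-Arg motif: a cysteine, five arbitrary residues, then an arginine. As a minimal sketch, a seven-residue window that has already been aligned to the motif position can be checked for substitutions at the two conserved positions. The function name `classify_cx5r_window`, its return labels and the example windows are hypothetical; this ignores everything else that determines catalytic activity.

```python
def classify_cx5r_window(window):
    """Classify a 7-residue window aligned to the Cys-X5-Arg motif.

    Returns "intact" when position 1 is Cys and position 7 is Arg,
    otherwise a label naming which conserved residue is substituted.
    """
    if len(window) != 7:
        raise ValueError("expected a 7-residue window (C, five residues, R)")
    cys_present = window[0] == "C"
    arg_present = window[6] == "R"
    if cys_present and arg_present:
        return "intact"
    if not cys_present and not arg_present:
        return "Cys and Arg substituted"
    return "Cys substituted" if not cys_present else "Arg substituted"
```

For instance, the synthetic window `"CAAAAAR"` is reported as intact, whereas `"SAAAAAK"` is reported as having both conserved residues substituted, the situation the text describes for inactive family members.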
The synapsins are a family of neuron-specific phosphoproteins that coat synaptic vesicles and are involved in the binding between these vesicles and the cytoskeleton (including actin filaments). The family comprises 5 homologous proteins Ia, Ib, IIa, IIb and III. Synapsins I, II, and III are encoded by 3 different genes. The a and b isoforms of synapsin I and II are splice variants of the primary transcripts [].
Synapsin I is mainly associated with regulation of neurotransmitter release from presynaptic neuron terminals []. Synapsin II, as well as being involved in neurotransmitter release, has a role in the synaptogenesis and synaptic plasticity responsible for long term potentiation []. Recent studies implicate synapsin III in a developmental role in neurite elongation and synapse formation that is distinct from the functions of synapsins I and II [].
Structurally, synapsins are multidomain proteins, of which 3 domains are common to all the mammalian forms. The N-terminal 'A' domain is ~30 residues long and contains a serine residue that serves as an acceptor site for protein kinase-mediated phosphorylation. This is followed by the 'B' linker domain, which is ~80 residues long and is relatively poorly conserved. Domain 'C' is the longest, spanning approximately 300 residues. This domain is highly conserved across all the synapsins (including those from Drosophila) and is possessed by all splice variants. The remaining six domains, D-I, are not shared by all the synapsins and differ both between the primary transcripts and the splice variants.
This entry represents the pre-ATP-grasp structural domain found in synapsins, which precedes the ATP-grasp domain. The structure of the pre-ATP-grasp domain consists of alpha/beta/alpha in three layers, and is possibly a rudimentary form of the Rossmann fold. This domain can have a substrate-binding function.
Apoptosis, or programmed cell death (PCD), is a common and evolutionarily conserved property of all metazoans [
]. In many biological processes, apoptosis is required to eliminate supernumerary or dangerous (such as pre-cancerous) cells and to promote normal development. Dysregulation of apoptosis can, therefore, contribute to the development of many major diseases including cancer, autoimmunity and neurodegenerative disorders. In most cases, proteins of the caspase family execute the genetic programme that leads to cell death. The protein harakiri (Hrk, also called death protein 5, DP5) is a pro-apoptotic Bcl-2 homology domain 3-only (BH3-only) protein, which belongs to the Bcl-2 family. Hrk is associated with the mitochondrial outer membrane via a putative transmembrane domain at the C terminus, which adopts a predominantly α-helical structure. This domain is able to insert itself into membranes where it perturbs the physical properties of the membrane considerably [
]. It can be activated by a diverse array of developmental cues or experimentally applied stress stimuli. Hrk contributes to apoptosis signalling elicited by trophic factor withdrawal in certain neuronal cells but is not critical for apoptosis of haematopoietic cells [
]. DP5 is important in neuronal cell death that can be induced by axotomy and neuronal growth factor (NGF) deprivation. It acts by regulating the mitochondrial function and caspase-3 activation []. Apoptosis regulation is a main cause of epithelial dysfunction in patients with ulcerative colitis. Six genes were found to be highly expressed in epithelial cells from people with and without ulcerative colitis, one of which is Hrk [
]. Hrk is also up-regulated in a JNK-dependent manner during apoptosis induced by potassium deprivation in cerebellar granule neurons [].
Neuropeptide FF receptors [
] bind a family of neuropeptides containing an RF-amide motif at their C terminus and have a high affinity for the pain modulatory peptide neuropeptide FF (NPFF) []. Neuropeptide FF (NPFF) receptors have two subtypes, neuropeptide FF receptor type 1 (NPFF1) and neuropeptide FF receptor type 2 (NPFF2); both are members of the rhodopsin G protein-coupled receptor family. Neuropeptide FF is found at high concentrations in the posterior pituitary, spinal cord, hypothalamus and medulla and is believed to be involved in pain modulation, opioid tolerance, cardiovascular regulation, memory and neuroendocrine regulation [,
,
,
]. Comparing the distribution of NPFF1 and NPFF2 receptors in different species reveals important species differences [
]. The NPFF1 receptor is broadly distributed in the central nervous system, with the highest levels found in the limbic system and the hypothalamus, and is thought to participate in neuroendocrine functions. The NPFF2 receptor, by contrast, is present in high density, particularly in mammals, in the superficial layers of the spinal cord [] where it is involved in nociception and modulation of opioid functions [], consistent with a potential role of NPFF in the modulation of sensory inputs, like pain responses [,
,
]. This entry represents NPFF1 receptor, which is expressed at highest levels in the hypothalamus, with moderate expression in the thalamus, midbrain, medulla oblongata, testis and eye [
]. NPFF1 receptor has been found to regulate adenylyl cyclase in some recombinant cell lines [,
]. It also couples with Gi protein to inhibit adenylyl cyclase (AC) [], and reduces the activities of cAMP-dependent protein kinase (PKA) and mitogen-activated protein kinase (MAPK) signaling cascade.
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [
,
,
]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection [
]. The low molecular weight transmembrane protein PsbX found in PSII is associated with the oxygen-evolving complex. Its expression is light-regulated. PsbX appears to be involved in the regulation of the amount of PSII [
], and may be involved in the binding or turnover of quinone molecules at the Qb (PsbA) site [].
RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota; most are predicted to be nuclear or nucleolar proteins. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB), also known as FMU (FirMicUtes), was the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA. A classification of RCMTs has been proposed on the basis of sequence similarity. According to this classification, RCMTs are divided into 8 distinct subfamilies. Recently, a new RCMT subfamily, termed RCMT9, was identified. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence. Proteins related to the RsmB subfamily of RCMTs have been detected in the genomes of Viridiplantae. They were provisionally assigned to the RsmB subfamily, which hitherto was considered to be restricted to Eubacteria, based solely on similarity to the prototypic member of this subfamily, the E. coli protein.
RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota; most are predicted to be nuclear or nucleolar proteins. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB), also known as FMU (FirMicUtes), was the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA. A classification of RCMTs has been proposed on the basis of sequence similarity. According to this classification, RCMTs are divided into 8 distinct subfamilies. Recently, a new RCMT subfamily, termed RCMT9, was identified. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence. As mentioned above, RCMT9 is a novel subtype of RCMT-related proteins. Putative orthologues of this subfamily have been detected only in Viridiplantae, Alveolata, Euglenozoa and Mycetozoa taxa. Members of this group are distantly related to the Nuclear protein 1 (NCL1) subfamily.
This entry represents the complete catalytic core domain of sirtuin proteins. The sirtuin (also known as Sir2) family is broadly conserved from bacteria to humans. Yeast Sir2 (silent mating-type information regulation 2), the founding member, was first isolated as part of the SIR complex required for maintaining a modified chromatin structure at telomeres. Sir2 functions in transcriptional silencing, cell cycle progression, and chromosome stability. Although most sirtuins in eukaryotic cells are located in the nucleus, others are cytoplasmic or mitochondrial. This family is divided into five classes (I-IV and U) on the basis of a phylogenetic analysis of 60 sirtuins from a wide array of organisms. Class I and class IV are further divided into three and two subgroups, respectively. The U-class sirtuins are found only in Gram-positive bacteria. The S. cerevisiae genome encodes five sirtuins, Sir2 and four additional proteins termed 'homologues of sir two' (Hst1p-Hst4p). The human genome encodes seven sirtuins, with representatives from classes I-IV. Sirtuins are responsible for a newly classified chemical reaction, NAD-dependent protein deacetylation. The final products of the reaction are the deacetylated peptide and an acetyl ADP-ribose. In nuclear sirtuins this deacetylation reaction is mainly directed against the acetylated lysines of histones. Sirtuins typically consist of two optional and highly variable N- and C-terminal domains (50-300 aa) and a conserved catalytic core domain (~250 aa). Mutagenesis experiments suggest that the N- and C-terminal regions help direct the catalytic core domain to different targets. The 3D structure of an archaeal sirtuin in complex with NAD reveals that the protein consists of a large domain having a Rossmann fold and a small domain containing a three-stranded zinc ribbon motif. NAD is bound in a pocket between the two domains.
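The NAD-dependent deacetylation described in this entry can be summarised as an overall reaction scheme. The sketch below uses the commonly cited stoichiometry; water as a substrate and nicotinamide as a third product are not named in the entry text and are included here as an assumption from general knowledge. The `\text` command assumes the `amsmath` package.

```latex
\[
\text{acetyl-lysine peptide} + \mathrm{NAD^{+}} + \mathrm{H_2O}
\;\longrightarrow\;
\text{deacetylated peptide} + \text{nicotinamide} + \text{O-acetyl-ADP-ribose}
\]
```

Written this way, the scheme makes explicit why NAD is consumed rather than recycled: its ADP-ribose moiety becomes the acceptor of the acetyl group removed from the lysine.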