ARHGEF12 (also called LARG) is a Rho guanine-nucleotide exchange factor that is RhoA-selective and directly activated by the Galpha12/13 family of heterotrimeric G proteins. LARG contains a regulator of G protein signaling (RGS) homology (RH) domain, the catalytic Dbl homology (DH) domain and the pleckstrin homology (PH) domain. The DH and PH domains bind RhoA and catalyze the exchange of GDP for GTP on RhoA. The active site of RhoA adopts two distinct GDP-excluding conformations among the four unique complexes in the asymmetric unit. The LARG PH domain also contains a potential protein-docking site. LARG forms a homotetramer via its DH domains []. This entry represents the PH domain of ARHGEF12. It has an exposed hydrophobic patch that could interface with other domains of LARG or other regulatory proteins [
].
This endosulphine family includes cAMP-regulated phosphoprotein 19 (ARPP-19), alpha endosulphine and protein Igo1. No function has yet been assigned to ARPP-19 [
]. Endosulphine is the endogenous ligand for the ATP-dependent potassium channels which occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism []. Igo1 is required for initiation of the G0 program. In the absence of stimulatory signals, cells may enter into a reversible quiescence (or G0) state that is typically characterised by low metabolic activity, including low rates of protein synthesis and transcription. Igo proteins associate with the mRNA decapping activator Dhh1, sheltering newly expressed mRNAs from degradation via the 5'-3' mRNA decay pathway, and thereby enabling their proper translation during initiation of the G0 program [].
The CABIT domain (for 'cysteine-containing, all-β in Themis') is found twice in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik). These proteins function downstream of tyrosine kinase signalling and interact with GRB2. In addition to their CABIT domains, the proteins also share a highly conserved proline-rich region [
]. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians (including the insect Serrano proteins) have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile alpha motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, i.e. a dyad of six-stranded β-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function.
Proteins containing this domain are highly divergent in their overall sequence; however, they share a common region of roughly 200 amino acids known as the SEC7 domain [[cite27373159], ]. The 3D structure of the domain displays several α-helices []. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian guanine-nucleotide-exchange factors []. SEC7 domain-containing proteins are guanine nucleotide exchange factors (GEFs) specific for the ADP-ribosylation factors (ARFs), Ras-like GTPases that are important for vesicular protein trafficking. These proteins can be divided into five families, based on domain organisation and conservation of primary amino acid sequence: GBF/BIG, cytohesins, EFA6, BRAGs, and F-box [
]. They are found in all eukaryotes, and are involved in membrane remodeling processes throughout the cell [].
Apo L belongs to the high density lipoprotein family that plays a central role in cholesterol transport. The cholesterol content of membranes is important in cellular processes such as modulating gene transcription and signal transduction both in the adult brain and during neurodevelopment. There are six apo L genes located in close proximity to each other on chromosome 22q12 in humans. 22q12 is a confirmed high-susceptibility locus for schizophrenia and close to the region associated with velocardiofacial syndrome that includes symptoms of schizophrenia [
]. The various functions of apoL are still not entirely clear. Apolipoprotein L-I has been identified as a trypanolytic agent [
] and displays similar phylogenetic distribution to the programmed cell death protein Bcl-2 and BH-3 domain-containing proteins, suggesting a possible role in apoptosis [].
This entry represents the C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins with this domain may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones [,
,
], plant cuticular wax production [], and mammalian wax biosynthesis []. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols [,
,
,
]. The function of this C-terminal domain is unclear.
Candida albicans is the most prevalent fungal pathogen in humans and a major source of life-threatening nosocomial infections [
]. It codes for several cell surface Als (agglutinin-like sequence) glycoproteins (Als1-Als7 and Als9) which are key virulence factors and have been associated with binding of host-cell surface proteins and small peptides of random sequence, the formation of biofilms and amyloid fibres. The N-terminal domain of Als proteins (NT-Als) is involved in adhesive function and comprises two tandem DEv-IgG type immunoglobulin domains, N1 and N2, arranged in a fold similar to that of the MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules) domain []. It also possesses a peptide-binding cavity (PBC) that can bind peptides with broad specificity. This superfamily represents the N2 subdomain found in the N-terminal region of Als proteins.
Proteins included in this entry are conserved from plants and fungi to humans. Erv46 (ERGIC3) works in close conjunction with Erv41 (ERGIC2) and together they form a complex which cycles between the endoplasmic reticulum and Golgi complex. Erv46-41 interacts strongly with the endoplasmic reticulum glucosidase II. Mammalian glucosidase II comprises a catalytic alpha-subunit and a 58kDa beta subunit, which is required for ER localisation. All proteins identified biochemically as Erv41p-Erv46p interactors are localised to the early secretory pathway and are involved in protein maturation and processing in the ER and/or sorting into COPII vesicles for transport to the Golgi [
]. This entry also includes disulfide isomerase (PDI)-C subfamily members from Arabidopsis. They are chimeric proteins containing the thioredoxin (Trx) domain of PDIs, and the conserved N- and C-terminal domains of Erv cargo receptors [
].
The Gag polyprotein from retroviruses is processed by viral protease to produce the major structural proteins, including the capsid protein. The newly formed capsid protein rearranges to form the capsid core particle that surrounds the viral genome of the mature virus. The capsid is composed of two domains, the N-terminal domain (NTD), which contributes to viral core formation, and the C-terminal domain (CTD), which is required for capsid dimerisation, Gag oligomerization and viral formation. The CTD contains the major homology region (MHR), a stretch of 20 amino acids that is conserved across retroviruses and is essential for viral assembly, maturation and infectivity. The CTD is composed of a bundle of four helices in an up and down arrangement, where helix 3 is shorter than the others [
,
,
].
DNAJB2 (also known as HSJ-1) contains an N-terminal J-domain and the C-terminal UIMs (ubiquitin (Ub)-interacting motifs)-containing domain. It binds and stimulates ATPase activity of HSP70 through its J-domain [
]. It also binds to polyUb chains through its UIM domain and contributes to the ubiquitin-dependent proteasomal degradation of misfolded proteins [,
]. DNAJB6 (also known as HSJ-2) plays an indispensable role in the organization of KRT8/KRT18 filaments. It acts as an endogenous molecular chaperone for neuronal proteins including huntingtin. DNAJB6 is able to suppress aggregation and toxicity of polyglutamine-containing, aggregation-prone proteins. It also has a stimulatory effect on the ATPase activity of HSP70 in a dose-dependent and time-dependent manner and hence acts as a co-chaperone of HSP70 [
,
]. DNAJB8 is also a suppressor of aggregation and toxicity of disease-associated polyglutamine proteins [
].
Hybrid cluster proteins (HCP, or Prismane) have been identified in bacteria, archaea and eukaryotic protozoa. No specific function has yet been assigned to these proteins, but it may involve oxidoreductase enzymatic activity. These proteins contain one 4Fe-4S cluster, and one hybrid 4Fe-2O-2S cluster, the latter being similar to the Ni-Fe-S cluster found in carbon monoxide dehydrogenase enzymes (
) [
,
]. This subfamily is heterogeneous with respect to the presence or absence of a region of about 100 amino acids not far from the N terminus of the protein. Members have been described as monomeric.
The general function is unknown, although members from E. coli and several other species have hydroxylamine reductase activity. Members are found in various bacteria, in Archaea, and in several parasitic eukaryotes: Giardia intestinalis, Trichomonas vaginalis, and Entamoeba histolytica.
This entry represents the BAR domain found at the N-terminal of OPHN1. This domain is normally associated with dimerisation, induction of membrane bending and curvature, and protein-protein interactions. Oligophrenin-1 (OPHN1) is a GTPase activating protein (GAP) with activity towards RhoA, Rac, and Cdc42, that is expressed in developing spinal cord and in adult brain areas with high plasticity. It plays a role in regulating the actin cytoskeleton as well as morphology changes in axons and dendrites, and may also function in modulating neuronal connectivity. Mutations in the OPHN1 gene cause X-linked mental retardation associated with cerebellar hypoplasia, lateral ventricle enlargement and epilepsy [
,
,
,
,
]. This protein also contains a Pleckstrin homology (PH) domain (), and a Rho GAP domain (
) [
,
,
].
This entry represents the RGS domain of RGS6 (regulator of G-protein signaling 6). RGS6 is a member of the R7 RGS protein subfamily. Other members of the R7 subfamily (Neuronal RGS) include: RGS7, RGS9, and RGS11, all of which are expressed predominantly in the nervous system, form an obligatory complex with G-beta-5, and play important roles in the regulation of crucial neuronal processes such as vision and motor control [
]. RGS6 exists in multiple splice isoforms that share identical RGS domains but possess complete or incomplete GGL domains and distinct N- and C-terminal domains [
]. RGS6 interacts with SCG10, a neuronal growth-associated protein, and thereby regulates neuronal differentiation []. Another RGS6-binding protein is DMAP1, a component of the Dnmt1 complex involved in repression of newly replicated genes [].
This small family of proteins from plants includes proteinase inhibitors A and B from Sagittaria sagittifolia (Arrowhead) and a trypsin/chymotrypsin inhibitor from Alocasia macrorrhizos (Giant taro) [
]. They belong to the MEROPS inhibitor family I3, clan IC. The arrowhead proteinase inhibitor A and B (APIA and APIB) are double-headed and multifunctional, consisting of 179 amino acid residues with three disulphide bridges. The reactive site residues Lys(44), Arg(76) and Arg(87) of APIB are predicted by sequence comparison to other proteinase inhibitors [
]. The two previously predicted reactive site residues, Lys-44 and Arg-76 of inhibitor B, were confirmed by site-directed mutagenesis. The two predicted reactive site residues of APIA, namely Ser-82 and Leu-87 (Leu and Arg, respectively, in APIB), were substituted by the two corresponding APIB residues, which confirmed their importance as reactive site residues [].
This entry represents receptor tyrosine-protein phosphatases (PTP) (
), including both alpha type and epsilon type. PTP catalyses the dephosphorylation of protein tyrosine phosphate to protein tyrosine, and appears to play a pivotal role in insulin receptor signalling. It can exist as a single-pass membrane protein or in the cytoplasm. PTP-alpha is a positive regulator of Src and Src family kinases, acting to dephosphorylate and activate Src. As such, PTP-alpha affects transformation and tumourigenesis, inhibition of proliferation and cell cycle arrest, mitotic activation of Src, integrin signalling, neuronal differentiation and outgrowth, and ion channel activity [
]. PTP-epsilon is a negative regulator of insulin signalling, where the cytosolic form has been shown to act in skeletal muscle [] and the receptor-type enzyme has been shown to act in primary hepatocytes and liver [].
This entry represents the first K homology domain (KH1) in the KRR1 and Dim2 (Pno1) proteins. Krr1 and Dim2 are structurally related ribosomal assembly factors, present on the 90S pre-ribosome [
]. They both belong to the family of RNA-binding proteins that contain K homology (KH) domains. Dim2 and Krr1 each contain two sequential KH domains (KH1 and KH2), but the N- and C-terminal extensions differ between the two proteins []. The two KH domains are evident in the structure of Krr1 from Saccharomyces cerevisiae []. The KH1 domains in Krr1 and Dim2 lack the typical GXXG RNA binding motif and are involved, instead, in protein-protein interactions. The KH1 domain in Krr1 interacts with the nucleolar assembly factor Kri1 and the KH1 domain of Dim2 interacts with the endonuclease Nob1 [
].
The EsV-1-7 repeat is a cysteine-rich motif of unknown function. The motif was originally identified in the Ectocarpus "immediate upright" protein, which has an EsV-1-7 domain that contains five EsV-1-7 repeats [
]. The name is derived from the Ectocarpus virus EsV-1 protein EsV-1-7, which possesses six EsV-1-7 repeats. Ectocarpus has a large family of EsV-1-7 domain proteins with between one and 19 copies of the motif (C-X4-C-X16-C-X2-H-X12). In addition to brown algae, EsV-1-7 domain proteins have been found in eustigmatophytes, oomycetes, cryptophytes, two families of green algae (Coccomyxaceae and Selenastraceae) and also in viral genomes, such as Emiliania huxleyi virus PS401 and Pithovirus sibericum. Based on this unusual distribution, it has been proposed that EsV-1-7 domain genes have been exchanged between lineages by horizontal gene transfer during evolution [,
].
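The repeat spacing quoted above (C-X4-C-X16-C-X2-H-X12) can be expressed as a regular expression to scan a sequence for candidate repeats. The following is only a minimal sketch: it assumes the spacing is strict and that X stands for any residue, whereas real family members may deviate from the consensus. The function name and the test sequence are hypothetical and purely illustrative.

```python
import re

# Motif from the text: C-X4-C-X16-C-X2-H-X12, with X taken to mean "any residue"
# (an assumption; a profile-based search would tolerate more variation).
ESV17_REPEAT = re.compile(r"C.{4}C.{16}C.{2}H.{12}")


def count_esv17_repeats(sequence: str) -> int:
    """Count non-overlapping matches of the strict EsV-1-7 repeat pattern."""
    return len(ESV17_REPEAT.findall(sequence.upper()))


# Synthetic sequence built to satisfy the spacing; it is not a real protein.
one_repeat = "C" + "A" * 4 + "C" + "A" * 16 + "C" + "A" * 2 + "H" + "A" * 12
example = one_repeat * 3

print(count_esv17_repeats(example))
```

A strict pattern like this is useful as a quick first pass; it will miss repeats whose spacer lengths differ from the consensus.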
This is the C-terminal domain of Ubiquitin-like protein 4A, an orthologue of yeast Get5. In budding yeasts, Get proteins directly mediate the insertion of newly synthesized TA proteins into endoplasmic reticulum membranes. Similarly, mammalian BAG6, Ubl4a, and SGTA make up a trimeric complex that binds TA proteins post-translationally and then loads them onto the cytosolic ATPase TRC40, which in turn targets them to the endoplasmic reticulum. Structural studies show that this C-terminal TUGS domain of Ubl4a is essential for BAG6 tethering. Given that BAG6 mediates oligomeric complex formation of Ubl4a, TRC35, and TRC40 (mammalian counterparts of Get5, Get4, and Get3, respectively), the C-terminal TUGS domain might be crucial for supporting BAG6-mediated Ubl4a-TRC35 complex formation in humans as an alternative to the direct Get5-Get4 interaction in yeast [
,
].
The desulforedoxin domain is a small non-haem iron domain present in the desulforedoxin (Dx) and desulfoferrodoxin (Superoxide reductase or Dfx) proteins of some archaeal and bacterial methanogens and sulfate/sulfur reducers. It constitutes essentially the full length of desulforedoxin, and the N-terminal domain of desulfoferrodoxin. Desulforedoxin is a small, single-domain homodimeric protein. Each subunit (of 36 amino acid residues) contains an iron atom bound to four cysteinyl sulfur atoms, Fe(S-Cys)4, in a distorted tetrahedral coordination. Its metal centre is similar to that found in rubredoxin type proteins [
]. Desulfoferrodoxin forms a homodimeric protein, with each protomer comprised of two domains, the N-terminal DSRD domain and C-terminal superoxide reductase-like (SORL) domain. Each domain has a distinct iron centre: the DSRD iron centre I, Fe(S-Cys)4; and the SORL iron centre II, Fe[His4Cys(Glu)] [].
CtIP is predominantly a nuclear protein that complexes with both BRCA1 and the BRCA1-associated RING domain protein (BARD1). At the protein level, CtIP expression varies with cell cycle progression in a pattern identical to that of BRCA1. Thus, the steady-state levels of CtIP polypeptides, which remain low in resting cells and G1 cycling cells, increase dramatically as dividing cells traverse the G1/S boundary. CtIP can potentially modulate the functions ascribed to BRCA1 in transcriptional regulation, DNA repair, and/or cell cycle checkpoint control [
]. This N-terminal domain carries a coiled-coil region and is essential for homodimerisation of the protein []. The C-terminal domain is family CtIP_C and carries functionally important CxxC and RHR motifs, the absence of which causes cells to grow slowly and show hypersensitivity to genotoxins [].
The bHLH (basic Helix-Loop-Helix) proteins contain the bHLH domain that is approximately 60 amino acids long and consists of a DNA-binding basic region followed by two α-helices separated by a variable loop region (HLH). The HLH domain promotes dimerisation, allowing the formation of homo- or heterodimeric complexes between different family members. Many bHLH proteins have been shown to act as transcriptional regulators [
]. Transcription factor ATOH7 (previously known as Protein atonal homologue 7, Atoh7, and Math5) is a basic helix-loop-helix transcription factor that is involved in early stages of retinal neurogenesis [
,
,
]. It is specifically expressed in the embryonic neural retina and is required for the genesis of retinal ganglion cells (RGCs) and optic nerves [
]. This entry represents the bHLH domain of ATOH7 and similar proteins from chordates.
This superfamily represents the chitinase insertion domain (CID) that is found in chitinases or chitinase-like proteins. It is composed of five or six anti-parallel β-strands and one α-helix and it inserts between the seventh α-helix and seventh β-strand of the TIM barrel []. The CID domain forms a wall alongside the TIM barrel substrate-binding cleft of chitinase which increases the depth of the cleft. Family 18 chitinases (also known as glycosyl hydrolase 18 family) can be classified into three subfamilies: A, B, and C. The CID domain can be found in the subfamily A, but is absent in the subfamily B []. Some mammalian glycoproteins with various functions also consist of a TIM domain and a CID domain, such as human cartilage glycoprotein-39 (HCgp-39), also known as chitinase-3-like protein 1 [
].
This entry represents the N-terminal domain of the DjlA protein. This domain can also be found in the tellurium resistance protein TerB. DjlA is an inner membrane protein which belongs to the DnaJ co-chaperone family. Direct interaction between DnaK and DjlA is needed for the induction of the wcaABCDE operon which is involved in the synthesis of a colanic acid polysaccharide capsule. The colanic acid capsule may help the bacterium survive conditions outside the host [
,
]. DjlA contains the highly conserved J-domain which characterises the DnaJ/Hsp40 family and is essential for interaction with DnaK. The J-domain of DjlA is located at the C terminus of the protein, and at its N terminus is a transmembrane (TM) domain that promotes its insertion into the membrane. The A-domain separates the TM domain from the J-domain [
].
This entry includes Bacillus subtilis AmhX and related proteins that form a subfamily of the peptidase M20 family [
]. They are predicted amidohydrolases. These are a class of zinc binding homodimeric enzymes involved in hydrolysis of N-acetylated proteins []. N-terminal acetylation of proteins is a widespread and highly conserved process that is involved in protection and stability of proteins. Several types of aminoacylases can be distinguished on the basis of substrate specificity. Aminoacylase 1 (ACY1) breaks down cytosolic aliphatic N-acyl-alpha-amino acids (except L-aspartate), especially N-acetyl-methionine and acetyl-glutamate into L-amino acids and an acyl group. However, ACY1 can also catalyze the reverse reaction, the synthesis of acetylated amino acids. ACY1 may also play a role in xenobiotic bioactivation as well as the inter-organ processing of amino acid-conjugated xenobiotic derivatives (S-substituted-N-acetyl-L-cysteine) [].
The Endosomal Sorting Complex Required for Transport (ESCRT) complexes form the machinery driving protein sorting from endosomes to lysosomes. Saccharomyces cerevisiae ESCRT-I is a heterotetrameric complex of Vps23, Vps28, Vps37 and Mvb12 [
]. The multivesicular body (MVB) protein-sorting pathway targets transmembrane proteins either for degradation or for function in the vacuole/lysosomes. The signal for entry into this pathway is monoubiquitination of protein cargo, which results in incorporation of cargo into luminal vesicles at late endosomes. Another crucial player is phosphatidylinositol 3-phosphate (PtdIns(3)P), which is enriched on early endosomes and on the luminal vesicles of MVBs. ESCRT-I, -II and -III complexes are critical for MVB budding and sorting of monoubiquitinated cargo into the luminal vesicles []. Various Ub-binding domains (UBDs), such as UIM, UEV and NZF, are found in such machineries [,
].
This group of sequences represents the p10 subunit found in caspases. Caspases (Cysteine-dependent ASPartyl-specific proteASE) are cysteine peptidases that belong to the MEROPS peptidase family C14 (caspase family, clan CD) based on the architecture of their catalytic dyad or triad [
]. Caspases are tightly regulated proteins that require zymogen activation to become active, and once active can be regulated by caspase inhibitors. Activated caspases act as cysteine proteases, using the sulphydryl group of a cysteine side chain for catalysing peptide bond cleavage at aspartyl residues in their substrates. The catalytic cysteine and histidine residues are on the p20 subunit after cleavage of the p45 precursor. Caspases are mainly involved in mediating cell death (apoptosis) [
,
,
]. They have two main roles within the apoptosis cascade: as initiators that trigger the cell death process, and as effectors of the process itself. Caspase-mediated apoptosis follows two main pathways, one extrinsic and the other intrinsic or mitochondrial-mediated. The extrinsic pathway involves the stimulation of various TNF (tumour necrosis factor) cell surface receptors on cells targeted to die by various TNF cytokines that are produced by cells such as cytotoxic T cells. The activated receptor transmits the signal to the cytoplasm by recruiting FADD, which forms a death-inducing signalling complex (DISC) with caspase-8. The subsequent activation of caspase-8 initiates the apoptosis cascade involving caspases 3, 4, 6, 7, 9 and 10. The intrinsic pathway arises from signals that originate within the cell as a consequence of cellular stress or DNA damage. The stimulation or inhibition of different Bcl-2 family receptors results in the leakage of cytochrome c from the mitochondria, and the formation of an apoptosome composed of cytochrome c, Apaf1 and caspase-9. The subsequent activation of caspase-9 initiates the apoptosis cascade involving caspases 3 and 7, among others. At the end of the cascade, caspases act on a variety of signal transduction proteins, cytoskeletal and nuclear proteins, chromatin-modifying proteins, DNA repair proteins and endonucleases that destroy the cell by disintegrating its contents, including its DNA. The different caspases have different domain architectures depending upon where they fit into the apoptosis cascades; however, they all carry the catalytic p10 and p20 subunits. Caspases can have roles other than in apoptosis, such as caspase-1 (interleukin-1 beta convertase) (
), which is involved in the inflammatory process. The activation of apoptosis can sometimes lead to caspase-1 activation, providing a link between apoptosis and inflammation, such as during the targeting of infected cells. Caspases may also be involved in cell differentiation [
].
Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified [], which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain. UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 helicases by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity []. Crystal structures of several UvrD-like DNA helicases have been solved [
,
,
]. They are monomeric enzymes consisting of two domains with a common α-β RecA-like core. The ATP-binding site is situated in a cleft between the N terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130 degrees. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme []. Some proteins that belong to the UvrD-like DNA helicase family are listed below:
Bacterial UvrD helicase. It is involved in the post-incision events of nucleotide excision repair and methyl-directed mismatch repair. It unwinds DNA duplexes with 3'-5' polarity with respect to the bound strand and initiates unwinding most effectively when a single-stranded region is present.
Gram-positive bacterial pcrA helicase, an essential enzyme involved in DNA repair and rolling circle replication. The Staphylococcus aureus pcrA helicase has both 5'-3' and 3'-5' helicase activities.
Bacterial rep proteins, single-stranded DNA-dependent ATPases involved in DNA replication which can initiate unwinding at a nick in the DNA. They bind to the single-stranded DNA and act in a progressive fashion along the DNA in the 3' to 5' direction.
Bacterial helicase IV (helD gene product). It catalyzes the unwinding of duplex DNA in the 3'-5' direction.
Bacterial recB protein. RecBCD is a multi-functional enzyme complex that processes DNA ends resulting from a double-strand break. RecB is a helicase with a 3'-5' directionality.
Fungal srs2 proteins, ATP-dependent DNA helicases involved in DNA repair. The polarity of the helicase activity was determined to be 3'-5'.
This domain is also found in the bacterial helicase-nuclease complex AddAB, in both the AddA and AddB subunits. The AddA subunit is responsible for the helicase activity. AddB also harbors a putative ATP-binding domain which does not play a role as a secondary DNA motor, but may instead facilitate the recognition of the recombination hotspot sequences [
]. This entry represents the ATP-binding domain found in AddA, AddB and UvrD-like helicases.
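As an illustration of how the Walker A (P-loop) motif mentioned above is commonly located in a sequence, the sketch below scans for the widely used consensus [AG]-x(4)-G-K-[ST]. The consensus string is not given in the text and is supplied here as an assumption; the function name and the toy sequence are hypothetical, and a pattern hit alone is not evidence that a protein is a helicase.

```python
import re

# Commonly cited Walker A / P-loop consensus: [AG]-x(4)-G-K-[ST].
# This pattern is an assumption added for illustration; it is not taken from the entry.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_walker_a(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each Walker A-like hit."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]


# Toy sequence for illustration only; not a real helicase.
toy = "MNLLDAGAGSGKTRVLT"

print(find_walker_a(toy))
```

Short consensus patterns of this kind match many unrelated P-loop NTPases, so in practice they are combined with domain-level models rather than used on their own.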
Xeroderma pigmentosum (XP) [
] is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous of the XP groups, showing anything from slight to extreme dysfunction in DNA excision repair [,
]. XP-G can be corrected by a 133 kDa nuclear protein, XPGC []. XPGC is an acidic protein that confers normal UV resistance in expressing cells []. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms [,
]. XPGC cleaves one strand of the duplex at the border with the single-stranded region []. XPG (ERCC-5) belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases [,
,
]; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. The XP group D gene product (XPD) is a helicase that is required for nucleotide excision repair, and is also one of the components of basal transcription factor TFIIH [
,
]. DNA repair defects in the XPD group are associated with the clinical features of XP and trichothiodystrophy (TTD), which is characterised by sulphur-deficient brittle hair and a variety of other associated abnormalities, but no skin cancer []. XPD belongs to a family of ATP-dependent helicases that are characterised by a 'D-E-A-H' motif [
]. This resembles the 'D-E-A-D-box' of other known helicases, which represents a special version of the B motif of ATP-binding proteins. In XPD, His replaces the second Asp.
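The two short sequence signatures described in this entry, the XPG I-region pentapeptide E-A-[DE]-A-[QS] and the 'D-E-A-H' versus 'D-E-A-D' box, translate directly into regular expressions. The sketch below is illustrative only: the helper name and the toy sequences are hypothetical, and such short patterns will also occur by chance in unrelated proteins.

```python
import re

# Pentapeptide quoted in the text for the XPG/RAD2 I-region: E-A-[DE]-A-[QS].
XPG_PENTAPEPTIDE = re.compile(r"EA[DE]A[QS]")
# 'D-E-A-D' box and its 'D-E-A-H' variant (His replaces the second Asp in XPD).
DEAX_BOX = re.compile(r"DEA[DH]")


def find_motif(pattern: re.Pattern, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Toy sequences for illustration only; they are not real XPG or XPD fragments.
toy_xpg = "MKLEAEAQCAT"
toy_xpd = "VIFDEAHNID"

print(find_motif(XPG_PENTAPEPTIDE, toy_xpg))
print(find_motif(DEAX_BOX, toy_xpd))
```

Reporting the matched text alongside the position makes it easy to see which variant (for example DEAH rather than DEAD) was found.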
Von Ebner's gland protein (VEGP), a protein highly expressed by the small acinar von Ebner's salivary glands of the tongue, but not in the secretory duct, undertakes the selective binding of sapid chemicals and their transport to taste receptors [] in salivary secretions. VEGP can help to clear the bitter-tasting compound denatonium benzoate in vivo [
], suggesting a possible clearance function in taste reception, although it fails to bind other bitter compounds []. VEGP is also secreted by the lachrymal gland into tear fluid, where, historically, it has been called tear prealbumin [
]. Together with lysozyme and lactoferrin, VEGP forms 70-80% of total tear protein, although diseases affecting the lachrymal gland decrease this. Tear VEGP has been suggested to enhance the bactericidal activity of lysozyme and to have an anti-microbial function, perhaps through transported compounds with anti-bacterial properties [
]. VEGP has been shown to bind retinol [], and can be co-extracted with fatty acids, particularly stearate and palmitate, phospholipids, glycolipids and fatty alcohols (including cholesterol) [
]. VEGP may act as a transporter of lipids, synthesised in the tarsal, or meibomian, glands of the eyelid, to the thin film they form at the tear-fluid/air interface. Recently, two lipocalins, specifically expressed in the posterior and vomeronasal glands of the mouse nasal septum, have been identified and were suggested to act in the chemoreception of as-yet-unidentified small lipophilic pheromones []. One of these proteins was immunolocalised on the vomeronasal sensory epithelium, the site of primary pheromone reception, and the immunoreactivity was greatest during periods when contact between animals plays an important role in modulating behaviour.
Canis familiaris (dog) allergen 1 (Can f 1) is the major allergen present in dog dander and is produced by tongue epithelial tissue [
]. Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee
King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806 (1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the
species name; a space and an arabic number. In the event that two species
names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Bos d 2 and Can f 2.
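The designation rule described above can be illustrated with a minimal sketch. The function name `allergen_designation` and the `species_letters` parameter are hypothetical, introduced here only for illustration; the extra-letter option is one possible reading of how identical designations could be told apart, not an official implementation of the nomenclature.

```python
def allergen_designation(genus: str, species: str, number: int,
                         species_letters: int = 1) -> str:
    """Build a designation such as 'Can f 1' from genus, species and number.

    species_letters is a hypothetical knob: raising it adds letters to the
    species part when two species would otherwise share a designation.
    """
    genus_part = genus.strip()[:3].capitalize()            # first three letters of the genus
    species_part = species.strip()[:species_letters].lower()  # first letter(s) of the species
    return f"{genus_part} {species_part} {number}"


# Canis familiaris allergens 1 and 2, as named in the text.
print(allergen_designation("Canis", "familiaris", 1))
print(allergen_designation("Canis", "familiaris", 2))
```

Under these assumptions the two calls reproduce the designations Can f 1 and Can f 2 used in this entry.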
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice [
]. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Several 7TM receptors have been cloned but their endogenous ligands are
unknown; these have been termed orphan receptors. A GPCR similar to the receptor for the blood clotting enzyme thrombin has been cloned [
]. Like the thrombin receptor, this receptor is activated by N-terminal proteolytic
cleavage. Thus, because the physiological agonist at the receptor is unknown, it has been provisionally named proteinase-activated receptor 2
(PAR-2) []. Human PAR-2 (hPAR-2) resides both on the plasma membrane and in the Golgi apparatus [
]. hPAR-2 mRNA is highly expressed in human pancreas, kidney, colon, liver and small intestine, and by A549 lung and
SW480 colon adenocarcinoma cells []. Hybridisation in situ reveals high expression in intestinal epithelial cells throughout the gut [
], where it is thought that PAR-2 may serve as a trypsin sensor [
]. Its expression by cells and tissues not normally exposed to pancreatic trypsin suggests
that other proteases could serve as physiological activators [].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [,
,
]. Vasopressin and oxytocin are members of the neurohypophyseal hormone family
found in all mammalian species. They are present at high levels in the posterior pituitary. Vasopressin has an essential role in the control of
the water content of the body, acting in the kidney to increase water and sodium absorption. In higher concentrations, vasopressin stimulates
contraction of vascular smooth muscle, stimulates glycogen breakdown in the liver, induces platelet activation, and evokes release of corticotrophin
from the anterior pituitary. Vasopressin and its analogues are used clinically to treat diabetes insipidus. The V2 receptor is found in high levels in the osmoregulatory epithelia of
the terminal urinary tract, where it stimulates water reabsorption. It is also present in lower levels in the endothelium and blood vessels of some
species, where it induces vasodilation. In the CNS, binding sites are found in the subiculum, with lower levels in caudate-putamen and islands
of Calleja. The receptor is involved in an effector pathway that forms cAMP through activation of G proteins.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems. PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. FP receptors bind prostaglandin F2-alpha and mediate contraction in a wide range of smooth muscle, including
intraocular, myometrial and bronchial tissues, and are potent stimulants of luteolysis. The receptors activate the phosphoinositide pathway through
a pertussis-toxin-insensitive G-protein, probably of the Gq/G11 class.
This entry represents a FAD-binding domain superfamily. This domain consists of two α+β subdomains. Flavoenzymes have the ability to catalyse a wide range of biochemical reactions. They are involved in the dehydrogenation of a variety of metabolites, in electron transfer from and to redox centres, in light emission, and in the activation of oxygen for oxidation and hydroxylation reactions [
]. About 1% of all eukaryotic and prokaryotic proteins are predicted to contain a flavin adenine dinucleotide (FAD)-binding domain []. According to structural similarities and conserved sequence motifs, FAD-binding domains have been grouped in three main families: (i) the ferredoxin reductase (FR)-type FAD-binding domain (see
), (ii) the FAD-binding domains that adopt a Rossmann fold and (iii) the p-cresol methylhydroxylase (PCMH)-type FAD-binding domain [
]. The FAD cofactor consists of adenosine monophosphate (AMP) linked to flavin mononucleotide (FMN) by a pyrophosphate bond. The AMP moiety is composed of the adenine ring bonded to a ribose that is linked to a phosphate group. The FMN moiety is composed of the isoalloxazine-flavin ring linked to a ribitol, which is connected to a phosphate group. The flavin functions mainly in a redox capacity, being able to take up two electrons from one substrate and release them two at a time to a substrate or coenzyme, or one at a time to an electron acceptor. The catalytic function of the FAD is concentrated in the isoalloxazine ring, whereas the ribityl phosphate and the AMP moiety mainly stabilise cofactor binding to protein residues [
]. The PCMH-type FAD-binding domain consists of two α-β subdomains: one is composed of three parallel β-strands (B1-B3) surrounded by α-helices, and is packed against the second subdomain containing five antiparallel β-strands (B4-B8) surrounded by α-helices [
]. The two subdomains accommodate the FAD cofactor between them []. In the PCMH proteins the coenzyme FAD is also covalently attached to a tyrosine located outside the FAD-binding domain in the C-terminal catalytic domain []. This domain is found in: FAD-linked oxidases (N-terminal domain), such as vanillyl-alcohol oxidase (
) [
], flavoprotein subunit of p-cresol methylhydroxylase () [
], D-lactate dehydrogenases (,
-cytochrome) [
], cholesterol oxidases () [
], and cytokinin dehydrogenase 1 () [
]. Uridine diphospho-N-acetylenolpyruvylglucosamine reductase (MurB) (N-terminal domain) [
]. CO dehydrogenase flavoprotein (N-terminal domain; [
]) family, which includes xanthine oxidase (domain 3) () [
], subunit A of xanthine dehydrogenase (domain 3) () [
], medium subunit of quinoline 2-oxidoreductase (QorM) () [
], and the beta-subunit of 4-hydroxybenzoyl-CoA reductase (HcrB) (N-terminal domain) () [
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems. PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. DP receptors have a limited distribution. They mediate relaxation in
vascular, gastrointestinal and uterine smooth muscle in human and some other species; they inhibit platelet activation, and modify release of
hypothalamic and pituitary hormones. The receptors activate adenylyl cyclase through G proteins.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Leukotrienes (LT) are potent lipid mediators derived from arachidonic acid metabolism. They can be divided into two classes, based on the presence or absence of a cysteinyl group. Leukotriene B4 (LTB4) does not contain such a group, whereas LTC4, LTD4, LTE4 and LTF4 are cysteinyl leukotrienes. LTB4 is one of the most effective chemoattractant mediators known, and is produced predominantly by neutrophils and macrophages. It is involved in a number of events, including: stimulation of leukocyte migration from the bloodstream; activation of neutrophils; inflammatory pain; host defence against infection; increased interleukin production and transcription [
]. It is found in elevated concentrations in a number of inflammatory and allergic conditions, such as asthma, psoriasis, rheumatoid arthritis and inflammatory bowel disease, and has been implicated in the pathogenesis of these diseases []. Binding sites for LTB4 have been observed in membrane preparations from leukocytes, macrophages and spleen. Two receptors for LTB4 have since been cloned (BLT1 and BLT2); both are members of the rhodopsin-like G-protein-coupled receptor superfamily [
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Neurotensin is a 13-residue peptide transmitter, sharing significant
similarity in its 6 C-terminal amino acids with several other neuropeptides, including neuromedin N. This region is responsible for the biological activity, the N-terminal portion having a modulatory role. Neurotensin is distributed throughout the central nervous system, with highest levels in the hypothalamus, amygdala and nucleus accumbens. It induces a variety of effects, including: analgesia, hypothermia and increased locomotor activity. It is also involved in regulation of dopamine pathways. In the periphery, neurotensin is found in endocrine cells of the small intestine, where it leads to secretion and smooth muscle contraction. The existence of 2 neurotensin receptor subtypes, with differing affinities
for neurotensin and differing sensitivities to the antihistamine levocabastine, was originally demonstrated by binding studies in rodent brain. Two neurotensin receptors (NT1 and NT2) with such properties have since been cloned and have been found to be G-protein-coupled receptor family members [].
O-Glycosyl hydrolases (
) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [,
]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Glycoside hydrolase family 22
comprises enzymes with two known activities: lysozyme type C (
) (also known as 1,4-beta-N-acetylmuramidase or LYZ) and alpha-lactalbumins (also known as lactose synthase B protein or LA). Asp and/or the carbonyl oxygen of the C-2 acetamido group of the substrate acts as the catalytic nucleophile/base.
Alpha-lactalbumin [
,
] is a milk protein that acts as the regulatory subunit of lactose synthetase, acting to promote the conversion of galactosyltransferase to lactose synthase, which is essential for milk production. In the mammary gland, alpha-lactalbumin changes the substrate specificity of galactosyltransferase from N-acetylglucosamine to glucose. Lysozymes (
) act as bacteriolytic enzymes by hydrolyzing the beta(1->4) bonds between N-acetylglucosamine and N-acetylmuramic acid in the peptidoglycan of prokaryotic cell walls. They have also been recruited for a digestive role in certain ruminants and colobine monkeys [
]. There are at least five different classes of lysozymes []: C (chicken type), G (goose type), phage-type (T4), fungi (Chalaropsis), and bacterial (Bacillus subtilis). There are few similarities in the sequences of the different types of lysozymes. Lysozyme type C and alpha-lactalbumin are similar both in terms of primary sequence and structure, and probably evolved from a common ancestral protein [
]. Around 35 to 40% of the residues are conserved in both proteins as well as the positions of the four disulphide bonds. There is, however, no similarity in function. Another significant difference between the two enzymes is that all lactalbumins have the ability to bind calcium [], while this property is restricted to only a few lysozymes []. The binding site was deduced using high resolution X-ray structure analysis and was shown to consist of three aspartic acid residues. It was first suggested that calcium bound to lactalbumin stabilised the structure, but recently it has been claimed that calcium controls the release of lactalbumin from the Golgi membrane and that the pattern of ion binding may also affect the catalytic properties of the lactose synthetase complex. Sperm acrosome membrane-associated protein 3 (SPACA3) is involved in fertilization, probably during the sperm-egg membrane fusion, but despite being homologous to lysozyme has no detectable bacteriolytic activity [
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. GPCR fungal pheromone mating factor receptors form a distinct family of G-protein-coupled receptors, and are also known as Class D GPCRs. The fungal pheromone mating factor receptors STE2 and STE3 are integral membrane proteins that may be involved in the response to mating factors on the cell membrane [
,
,
]. The amino acid sequences of both receptors contain high proportions of hydrophobic residues grouped into 7 domains, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins. However, while a similar 3D framework has been proposed to account for this, there is no significant sequence similarity either between STE2 and STE3, or between these and the rhodopsin-type family: the receptors therefore bear their own unique '7TM' signatures, which is why they have been given their own GPCR group: Class D Fungal mating pheromone receptors.
This entry represents the STE3-type family of fungal pheromone mating factor receptors. The STE3 gene of Saccharomyces cerevisiae (Baker's yeast) encodes the cell-surface receptor that binds the 13-residue lipopeptide a-factor. Several related fungal pheromone receptor sequences are known: these include pheromone B alpha 1 and B alpha 3, and pheromone B beta 1 receptors from Schizophyllum commune; pheromone receptor 1 from Ustilago hordei; and pheromone receptors 1 and 2 from Ustilago maydis. Members of the family share about 20% sequence identity.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems. PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. EP1 receptors mediate contraction of gastrointestinal smooth muscles in
various species, and relaxation of airway and uterine smooth muscles, especially in rodents. The receptors activate the phosphoinositide
pathway via a pertussis-toxin-insensitive G-protein, probably of the Gq/G11 class.
Steroid or nuclear hormone receptors constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. The receptors function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. Nuclear hormone receptors consist of a highly conserved DNA-binding domain that recognises specific sequences, connected via a linker region to a C-terminal ligand-binding domain (
). In addition, certain nuclear hormone receptors have an N-terminal modulatory domain (
). The DNA-binding domain can elicit either an activating or repressing effect by binding to specific regions of the DNA known as hormone-response elements [
,
]. These response elements position the receptors, and the complexes recruited by them, close to the genes of which transcription is affected. The DNA-binding domains of nuclear receptors consist of two zinc-nucleated modules and a C-terminal extension, where residues in the first zinc module determine the specificity of the DNA recognition and residues in the second zinc module are involved in dimerisation. The DNA-binding domain is furthermore involved in several other functions including nuclear localisation, and interaction with transcription factors and co-activators [
]. Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [
,
,
,
,
]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the two C4-type zinc finger modules involved in DNA-binding.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Neuropeptide receptors are present in very small quantities in the cell and are embedded tightly in the plasma membrane. The neuropeptides exhibit a high degree of functional diversity through both regulation of peptide production and through peptide-receptor interaction []. The mammalian tachykinin system consists of 3 distinct peptides: substance P, substance K and neuromedin K. All possess a common spectrum of biological activities, including sensory transmission in the nervous system and contraction/relaxation of peripheral smooth muscles, and each interacts with a specific receptor type.
NK3 receptors are distributed widely throughout the rat CNS, and are found in high levels in cerebral cortex, basal ganglia and dorsal horn of the spinal cord. They have limited distribution in peripheral tissues, and are found in ganglia (e.g., myenteric plexus), kidney, and in a limited number of smooth muscles (e.g., rat portal vein). NK3 receptors activate the phosphoinositide pathway through a pertussis-toxin-insensitive G-protein, probably of the Gq/G11 class.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. In addition to their role in energy metabolism, purines (especially adenosine and adenine nucleotides) produce a wide range of pharmacological effects mediated by activation of cell surface receptors. Distinct receptors exist for adenosine. In the periphery, the main effects of adenosine include vasodilation, bronchoconstriction, immunosuppression, inhibition of platelet aggregation, cardiac depression, stimulation of nociceptive afferents, inhibition of neurotransmitter release and inhibition of the release of other factors, e.g. hormones. In the CNS, adenosine exerts a pre- and post-synaptic depressant action, reducing motor activity, depressing respiration, inducing sleep and relieving anxiety. The physiological role of adenosine is thought to be to adjust energy demands in line with oxygen supply. Many of the clinical actions of methylxanthines are thought to be mediated through antagonism of adenosine receptors. Four subtypes of receptor have been identified, designated A1, A2A, A2B and A3. A2A receptors have a limited distribution in the brain and are found in the striatum, olfactory tubercle and nucleus accumbens. In the periphery, A2 receptors mediate vasodilation, immunosuppression, inhibition of platelet aggregation and gluconeogenesis. The receptors activate adenylyl cyclase through G proteins.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups []. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Bombesins are peptide neurotransmitters whose biological activity resides in a common C-terminal sequence, WAXGHXM. In the periphery, bombesin-related peptides stimulate smooth muscle and glandular secretion. In the brain, these peptides are believed to play a role in homeostasis, thermoregulation and metabolism, and have been reported to elicit analgesia and excessive grooming, together with central regulation of a variety of peripheral effects. Mammalian bombesins are encoded by 2 genes. The preproGRP gene transcript encodes a precursor of 147 amino acids, which gives GRP and GRP18-27. The preproNMB gene transcript encodes a precursor of 117 amino acids, which is metabolised to neuromedin B. Receptors for these peptides have widespread distribution in peripheral tissue. High levels are found in smooth muscle and in the brain. The neuromedin B receptor has been characterised in rat oesophagus and rat urinary bladder. It is widespread in the CNS, and is found in high levels in olfactory nucleus and thalamic regions, and in lower levels in the frontal cortex, dentate gyrus, amygdala and dorsal raphe. The receptor activates the phosphoinositide pathway through a pertussis-toxin-insensitive G-protein, probably of the Gq/G11 class.
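The C-terminal consensus WAXGHXM mentioned above can be checked with a simple pattern match. The following is a minimal sketch, assuming that X stands for any one of the 20 standard amino acids and that the motif must lie at the very C-terminus; the function name and the example peptide strings are illustrative and are not taken from the entry.

```python
import re

# Assumption: "X" in WAXGHXM means any single standard amino acid, and the
# motif must be the last seven residues of the peptide (C-terminal).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BOMBESIN_MOTIF = re.compile(rf"WA[{AMINO_ACIDS}]GH[{AMINO_ACIDS}]M$")


def has_bombesin_motif(peptide: str) -> bool:
    """Return True if the peptide ends with the WAXGHXM consensus."""
    return BOMBESIN_MOTIF.search(peptide.strip().upper()) is not None


# Illustrative one-letter peptide strings (hypothetical test inputs).
examples = {
    "ends_with_motif": "GNQWAVGHLM",      # ...WAVGHLM at the C-terminus
    "motif_not_terminal": "WAVGHLMGNQ",   # motif present but not C-terminal
    "no_motif": "GNQWAVGKLM",             # H replaced by K, so no match
}
results = {name: has_bombesin_motif(seq) for name, seq in examples.items()}
```

Anchoring the pattern with `$` reflects the statement that the activity resides in the C-terminal sequence; dropping the anchor would also report internal occurrences.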
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Bombesins are peptide neurotransmitters whose biological activity resides in a common C-terminal sequence, WAXGHXM. In the periphery, bombesin-related peptides stimulate smooth muscle and glandular secretion. In the brain, these peptides are believed to play a role in homeostasis, thermoregulation and metabolism, and have been reported to elicit analgesia and excessive grooming, together with central regulation of a variety of peripheral effects. Mammalian bombesins are encoded by 2 genes. The preproGRP gene transcript encodes a precursor of 147 amino acids, which gives GRP and GRP18-27. The preproNMB gene transcript encodes a precursor of 117 amino acids, which is metabolised to neuromedin B. Receptors for these peptides have widespread distribution in peripheral tissue. High levels are found in smooth muscle and in the brain. The recently identified BRS-3 bombesin receptor subtype is found in germ cells in testis and in uteri of pregnant animals; it is also present in a variety of lung carcinoma cell lines. The receptor is believed to play a role in sperm cell division and maturation. Its action is mediated by association with G-proteins that activate a phosphatidylinositol-calcium second messenger system.
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [
]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [,
,
,
,
]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems. PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. Prostaglandin E2 receptor EP2, also called prostanoid EP2 receptor, is one of four receptor subtypes whose endogenous physiological ligand is prostaglandin E2 (PGE2). Stimulation of the EP2 receptor by PGE2 causes cAMP accumulation through G(s) protein activation, which subsequently produces smooth muscle relaxation and mediates the systemic vasodepressor response to PGE2.
The type I glycoprotein S of Coronavirus, trimers of which constitute the typical viral spikes, is assembled into virions through noncovalent interactions with the M protein. The spike glycoprotein is translated as a large polypeptide that is subsequently cleaved to S1 and S2 [
]. The cleavage of S can occur at two distinct sites: S2 or S2' []. The S1 subunit is responsible for host-receptor binding while the S2 subunit contains the membrane-fusion machinery []. Both chimeric S proteins appeared to cause cell fusion when expressed individually, suggesting that they were biologically fully active [
]. The spike is a type I membrane glycoprotein that possesses a conserved transmembrane anchor and an unusual cysteine-rich (cys) domain that bridges the putative junction of the anchor and the cytoplasmic tail []. The S2 subunit normally contains multiple key components, including one or more fusion peptides (FP), a second proteolytic site (S2') and two conserved heptad repeats (HRs), driving membrane penetration and virus-cell fusion. The HRs can trimerize into a coiled-coil structure built of three HR1-HR2 helical hairpins presenting as a canonical six-helix bundle and drag the virus envelope and the host cell bilayer into close proximity, preparing for fusion to occur [
]. The fusion core is composed of HR1 and HR2 and at least three membranotropic regions that are denoted as the fusion peptide (FP), internal fusion peptide (IFP), and pretransmembrane domain (PTM). The HR regions are further flanked by the three membranotropic components. Both FP and IFP are located upstream of HR1, while PTM is distally downstream of HR2 and directly precedes the transmembrane domain of SARS-CoV S. All three of these components are able to partition into the phospholipid bilayer to disturb membrane integrity []. During the pandemic, many conservative amino acid changes in the FP segment of SARS-CoV-2 have been reported (i.e., L821I, L822F, K825R, V826L, T827I, L828P, A829T, D830G/A, A831V/S/T, G832C/S, F833S, I834T), although their impact is not known as the active conformation and mode of insertion of the SARS-CoV-2 fusion peptide have not been experimentally characterised. Differences in HR1 sequences between SARS-CoV and SARS-CoV-2 suggest that SARS-CoV-2 HR2 makes stronger interactions with HR1. However, the substitutions observed in the solvent accessible surface of the HR1 domain (e.g., D936Y, S943P, S939F) of SARS-CoV-2 do not seem to be involved in stabilizing interactions with HR2. Substitutions in HR2 (e.g., K1073N, V1176F) or the TM or cytoplasmic tail domains have also been observed, but further experimental work is required to determine the effects of these changes [].
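The substitution labels listed above follow a compact notation: wild-type residue, position, then one or more variant residues separated by slashes. A minimal sketch of how such labels might be split into their parts is given below; the function name and the tuple layout are assumptions made for illustration, and only labels quoted in the text are used as input.

```python
import re

# Pattern for labels such as "L821I" or "A831V/S/T":
# one wild-type letter, a residue number, then slash-separated variant letters.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_substitution(label: str):
    """Split a label into (wild_type, position, [variant residues])."""
    match = SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised substitution label: {label!r}")
    wild_type, position, variants = match.groups()
    return wild_type, int(position), variants.split("/")


# A few of the fusion-peptide labels quoted in the entry text.
fp_labels = ["L821I", "D830G/A", "A831V/S/T", "G832C/S"]
parsed = [parse_substitution(label) for label in fp_labels]
```

Keeping the variants as a list makes it straightforward to count how many distinct changes have been reported at each position.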
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence [
]. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. The rhodopsin-like GPCRs themselves represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Leukotrienes (LT) are potent lipid mediators derived from arachidonic acid metabolism. They can be divided into two classes based on the presence or absence of a cysteinyl group. Leukotriene B4 (LTB4) does not contain such a group, whereas LTC4, LTD4, LTE4 and LTF4 are cysteinyl leukotrienes. Cysteinyl leukotrienes (CysLTs), previously known as the "slow reacting substance of anaphylaxis", are produced predominantly by myeloid cells associated with inflammatory responses [
]. They are the most potent bronchoconstrictors known and also have pro-inflammatory effects, making them important mediators in the pathophysiology of human asthma []. CysLTs have also been implicated in a variety of other diseases, such as allergic rhinitis, inflammatory bowel disease and psoriasis []. Pharmacological studies of the effects of CysLTs have provided evidence for the existence of at least 2 distinct receptor subtypes, belonging to the G protein-coupled receptor family, designated CysLT1 and CysLT2 [,
]. CysLT1 is thought to mediate bronchospasm, plasma exudation, vasoconstriction, mucus secretion and eosinophil recruitment [
]. CysLT2 is less well defined, owing to a lack of specific agonists and antagonists, but is thought to mediate some of the vascular effects attributed to CysLTs [,
]. Both receptor subtypes have now been cloned [,
]. CysLT1 has a much higher affinity for LTD4 than for the other cysteinyl leukotrienes and, upon activation, stimulates phosphatidyl inositol hydrolysis and increases in intracellular calcium. The receptor is expressed at highest levels in the spleen and in peripheral blood leukocytes, with lower levels in the lung, placenta, small intestine, pancreas, colon and heart [
,
,
].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions (including various autocrine, paracrine and endocrine processes). They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. We use the term clan to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence [
]. The currently known clan members include the rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating pheromone receptors, and the metabotropic glutamate receptor family. The rhodopsin-like GPCRs themselves represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [
,
,
]. Leukotrienes (LT) are potent lipid mediators derived from arachidonic acid metabolism. They can be divided into two classes based on the presence or absence of a cysteinyl group. Leukotriene B4 (LTB4) does not contain such a group, whereas LTC4, LTD4, LTE4 and LTF4 are cysteinyl leukotrienes. Cysteinyl leukotrienes (CysLTs), previously known as the "slow reacting substance of anaphylaxis", are produced predominantly by myeloid cells associated with inflammatory responses [
]. They are the most potent bronchoconstrictors known and also have pro-inflammatory effects, making them important mediators in the pathophysiology of human asthma []. CysLTs have also been implicated in a variety of other diseases, such as allergic rhinitis, inflammatory bowel disease and psoriasis []. Pharmacological studies of the effects of CysLTs have provided evidence for the existence of at least 2 distinct receptor subtypes, belonging to the G protein-coupled receptor family, designated CysLT1 and CysLT2 [,
]. CysLT1 is thought to mediate bronchospasm, plasma exudation, vasoconstriction, mucus secretion and eosinophil recruitment []. CysLT2 is less well defined, due to a lack of specific agonists and antagonists, but is thought to mediate some of the vascular effects attributed to CysLTs [,
]. Both receptor subtypes have now been cloned [,
]. CysLT2 has highest affinities for LTC4 and LTD4 and, upon activation, stimulates phosphatidyl inositol hydrolysis leading to increased intracellular calcium concentration. The receptor is expressed widely, with highest levels in the heart, placenta, spleen, peripheral blood leukocytes and adrenal gland [
,
,
].
Photosystem I (PSI) [
] is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. It is found in the chloroplasts of plants and in cyanobacteria. PSI is composed of at least 14 different subunits, two of which are small, evolutionarily related hydrophobic proteins of about 7 to 9 kDa, PsaG (also known as PSI-G) and PsaK (also known as PSI-K), both integral membrane proteins. Cyanobacteria contain only PsaK []. While cyanobacterial PSI has phycobilisomes to harvest light, eukaryotic PSI has a membrane-embedded peripheral antenna []. This entry represents Photosystem I reaction center subunit PsaK found in plants, predominantly in Streptophytes. PsaK is important for stable interaction and proper function of the antenna [
]. The crystal structure of the plant PSI complex shows that this protein is closely related to the similar subunit PsaG [].
This entry represents eukaryotic condensin complex subunit 2 proteins. Included in this group are several Barren protein homologues from several eukaryotic organisms. In Drosophila Barren (barr) is required for sister-chromatid segregation in mitosis. barr encodes a novel protein that is present in proliferating cells and has homologues in yeast and human. Mitotic defects in barr embryos become apparent during cycle 16, resulting in a loss of PNS and CNS neurons. Centromeres move apart at the metaphase-anaphase transition and Cyclin B is degraded, but sister chromatids remain connected, resulting in chromatin bridging. Barren protein localises to chromatin throughout mitosis. Colocalisation and biochemical experiments indicate that Barren associates with Topoisomerase II throughout mitosis and alters the activity of Topoisomerase II. It has been suggested that this association is required for proper chromosomal segregation by facilitating the decatenation of chromatids at anaphase [
].
Mitochondrial Rho (Miro) proteins are aberrant members of the small GTPase superfamily found in most eukaryotes. Miro contains a transmembrane region located at the C terminus that anchors the protein to the outer membrane of mitochondria, while the GTPase domains and EF-hands are located in the cytoplasm. Miro and its cytoplasmic binding partner Milton/TRAK link mitochondria to kinesin and dynein molecular motors in various cell types [
]. Mammals have two Miro orthologs, Miro1 and Miro2. They mediate mitochondrial trafficking in neurons by linking mitochondria to kinesin and dynein motor proteins for their transport in axons and dendrites [,
]. Yeasts have one Miro homologue, known as Gem1, which is part of the ERMES complex that links the ER to mitochondria [
]. Interestingly, ERMES is absent in metazoa []. This entry represents the tandemly repeated GTPase domain from these proteins.
ClpX is a member of the HSP (heat-shock protein) 100 family. Gel filtration and electron microscopy showed that ClpX subunits associate to form a six-membered ring that is stabilised by binding of ATP or nonhydrolysable analogs of ATP [
]. It functions as an ATP-dependent [] molecular chaperone and is the regulatory subunit of the ClpXP protease []. ClpXP is involved in DNA damage repair, stationary-phase gene expression, and ssrA-mediated protein quality control. To date more than 50 proteins, including transcription factors, metabolic enzymes, and proteins involved in the starvation
and oxidative stress responses, have been identified as substrates []. The N-terminal domain of ClpX is a C4-type zinc binding domain (ZBD) involved in substrate recognition. ZBD forms a very stable dimer that is essential for promoting the degradation of some typical ClpXP substrates such as λO and MuA [
].
This entry includes the Saccharomyces cerevisiae (Baker's yeast) protein SPT2 which is a chromatin protein involved in transcriptional regulation [
]. These proteins show conservation of several domains across numerous species, including a cluster of positively charged amino acids. This cluster probably functions in the binding properties of the proteins [
]. Sin1p/Spt2p probably modulates the local chromatin structure by binding two strands of double-stranded DNA at their crossover point. Sin1p/Spt2p has sequence similarity to HMG1 and serves as a negative transcriptional regulator of a small family of genes that are activated by the SWI/SNF chromatin-remodelling complex. It is also involved in maintaining the integrity of chromatin during transcription elongation. Sin1p/Spt2p is required for, and is directly involved in, the efficient recruitment of the mRNA cleavage/polyadenylation complex [
]. Spt2 is also involved in regulating levels of histone H3 over transcribed regions [].
This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide [
]. The type example is aminopeptidase I from Saccharomyces cerevisiae (Baker's yeast), the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids []. A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic α-helix []. The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria. It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways [].
PTR2 family proton/oligopeptide symporter, conserved site
Type:
Conserved_site
Description:
The transport of peptides into cells is a well-documented biological phenomenon which is accomplished by specific, energy-dependent transporters found in a number of organisms as diverse as bacteria and humans. The PTR family of proteins is distinct from the ABC-type peptide transporters and was uncovered by sequence analyses of a number of recently discovered peptide transport proteins [
]. These proteins seem to be mainly involved in the intake of small peptides with the concomitant uptake of a proton []. These integral membrane proteins are predicted to comprise twelve transmembrane regions. This entry describes two conserved sites. The first conserved site is found within a region that includes the end of the second transmembrane region, a cytoplasmic loop as well as the third transmembrane region. The second conserved site corresponds to the core of the fifth transmembrane region.
This entry represents a conserved domain found in the Rab3 GTPase-activating protein catalytic subunit (Rab3GAP1).
Small G proteins of the Rab family are regulators of intracellular vesicle traffic. Their rate of GTP hydrolysis is enhanced by specific GTPase-activating proteins (GAPs) that switch G proteins to their inactive form [
]. Rab3GAP1 (catalytic subunit) has been shown to form a heterodimeric complex with Rab3GAP2 (the regulatory subunit), and this complex acts as a GTPase-activating protein for the Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). The Rab3GAP complex may participate in neurodevelopmental processes such as proliferation, migration and differentiation before synapse formation, and non-synaptic vesicular release of neurotransmitters [,
]. It also activates Rab18 and promotes autolysosome maturation through the Vps34 Complex I []. Mutations in the Rab3GAP1/2 genes cause Warburg micro syndrome (WMS), a hereditary autosomal neuromuscular disorder [
].
SymE (SOS-induced yjiW gene with similarity to MazE) is an SOS-induced toxin. It inhibits cell growth, decreases protein synthesis and increases RNA degradation. It may play a role in the recycling of RNAs damaged under SOS response-inducing conditions. Its translation is repressed by the antisense RNA SymR, which acts as an antitoxin [
,
]. SymE belongs to type I toxin-antitoxin systems, but it does not show functional homology to other type I toxin proteins. Its function resembles that of type II toxins such as MazF, which can cleave mRNA independently of the ribosome. However, SymE has homology to the AbrB-fold superfamily proteins such as MazE, which act as transcription factors and antitoxins in various type II TA modules [
]. It seems probable that SymE has evolved into an RNA cleavage protein with toxin-like properties from a transcription factor or antitoxin [].
Transport of molybdenum into bacteria involves a high-affinity ABC transporter system whose expression is controlled by a repressor protein called ModE. While molybdate transport is tightly coupled to utilization in some bacteria, other organisms have molybdenum storage proteins. One class of putative molybdate storage proteins is characterised by a sequence consisting of about 70 amino acids (Mop). A tandem repeat of Mop sequences also constitutes the molybdate binding domain of ModE. The 7 kDa Mop protein from the methanol-utilizing anaerobe Sporomusa ovata occurs as highly symmetric hexamers binding eight oxyanions. Each peptide assumes an OB fold, which has previously also been observed in ModE. Each hexameric Mop molecule contains eight metal binding sites of two different types; all of them are only formed upon oligomer assembly, i.e., each binding site is located on the interface between two or three dimers [
].
The accumulation of abnormal membrane proteins is something which must be avoided in order to maintain cell viability. In Escherichia coli, the membrane-bound, ATP-dependent protease FtsH plays a central role in the degradation of these abnormal proteins [
]. Known substrates of this protease include several lambda bacteriophage proteins, the heat-shock transcription factor sigma-32, and the unassembled form of the membrane protein SecY. While FtsH is active as a protease on its own, in vivo it forms a complex with a membrane-bound HflKC heterodimer. HflKC has a generally inhibitory effect on the protease activity of FtsH, though the mechanism of this inhibition is not known. HflK is a member of peptidase inhibitor family I87 []. The HflK and HflC polypeptides are paralogous, and often encoded by tandem genes within bacterial genomes. This entry represents the HflK subunit (MEROPS identifier I87.002) of the HflKC heterodimer.
In prokaryotes, membrane lipoproteins are synthesized with a precursor signal peptide, which is cleaved by a specific lipoprotein signal peptidase (signal peptidase II). The peptidase recognises a conserved sequence and cuts upstream of a cysteine residue to which a glyceride-fatty acid lipid is attached [
,
]. This lipid attachment site is found in homologues of the VirB proteins of type IV secretion systems (T4SS). Conjugal transfer across the cell envelope of Gram-negative bacteria is mediated by a supramolecular structure termed mating pair formation (Mpf) complex. Collectively, secretion pathways ancestrally related to bacterial conjugation systems are now known as T4SS. T4SS are involved in the delivery of effector molecules to eukaryotic target cells; each of these systems exports distinct DNA or protein substrates to effect a myriad of changes in host cell physiology during infection [
].
MOV-10 (Moloney leukemia virus 10 protein) is the human homologue of Drosophila melanogaster Armitage (armi). Human MOV-10 is a 5' to 3' RNA helicase required for RNA-mediated gene silencing by the RNA-induced silencing complex (RISC) [
,
,
]. It has been shown to interact with fragile X messenger ribonucleoprotein 1 (FMRP) and to regulate miRNA-mediated translational repression by AGO2 []. It also interacts with retrotransposons and acts as a potent inhibitor of retrotransposition in cells []. This protein is required for RNA-directed transcription and replication of the human hepatitis delta virus (HDV). It interacts with small capped HDV RNAs derived from genomic hairpin structures that mark the initiation sites of RNA-dependent HDV RNA transcription []. This entry represents the ATP-binding DEXXQ/H-box helicase domain of MOV-10 and similar eukaryotic proteins, including Probable RNA helicase SDE3 from Arabidopsis thaliana [
,
].
Aar2 is a U5 small nuclear ribonucleoprotein (snRNP) particle assembly factor and part of Prp8, which forms a large complex containing U5 snRNA, Snu114, and seven Sm proteins (B, D1, D2, D3, E, F and G). Upon import of the complex into the nucleus, Aar2 phosphorylation leads to its release from Prp8 and replacement by Brr2p, thus playing an important role in Brr2p regulation and possibly safeguarding against non-specific RNA binding to Prp8 [
,
,
,
,
]. Aar2p binds directly to the RNaseH-like domain in the C-terminal region of Prp8p []. In yeast, Aar2 protein is involved in splicing pre-mRNA of the a1 cistron and other genes important for cell growth []. This entry consists of the C-terminal domain of eukaryotic Aar2 and Aar2-like proteins. This domain consists of 9 alpha helices, 1 pi helix and 1 3(10)-helix.
This superfamily of peroxiredoxins includes osmotically inducible protein C (OsmC), a stress-induced protein found in Escherichia coli. This superfamily also contains organic hydroperoxide resistance protein (Ohr), which has a novel pattern of oxidative stress regulation. The transcription of the osmC gene of E. coli is regulated as a function of the phase of growth and is induced during the late exponential phase when the growth rate slows before entry into stationary phase. The transcription is initiated by two overlapping promoters, osmCp1 and osmCp2 [
]. Ohr from Xanthomonas campestris pv. phaseoli is highly induced by organic hydroperoxides, weakly induced by H2O2, and not induced at all by a superoxide generator. Ohr may be a new type of organic hydroperoxide detoxification protein [
,
]. The structure of OsmC presents a swapped dimer of β(3)-α-β(2)-α(2)-β subunits, a fold with a mixed β-sheet and a buried helix.
The PWWP domain is an essential part of the cytokine-like nuclear factor N-PAC protein, also known as NP60 or GLYR1, which enhances the activity of MAP2K4 and MAP2K6 kinases to phosphorylate p38-alpha [
]. NPAC/GLYR1 is a nucleosome-destabilizing factor that is recruited to genes during transcriptional activation and facilitates Pol II transcription through nucleosomes []. It is a KDM1B demethylase cofactor that stimulates H3K4me1 and H3K4me2 demethylation []. In addition to the PWWP domain, NP60 also contains an AT-hook and a C-terminal NAD-binding domain alpha []. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes [
].
Mismatch repair is one of five major DNA repair pathways, the others being homologous recombination repair, non-homologous end joining, nucleotide excision repair, and base excision repair. The mismatch repair system recognises and repairs mispaired or unpaired nucleotides that result from errors in DNA replication. Many proteins involved in the different repair processes also play a role in apoptosis when DNA damage is excessive, thereby helping to prevent carcinogenesis [
]. The mismatch repair protein, Mlh1 (mutL homologue 1), has a dual role in DNA repair and apoptosis. Mlh1 acts as a heterodimer in conjunction with Pms1, Pms2 (post-meiotic segregation 1 and 2) or Mlh3 (MutL homologue 3), which function as adaptor proteins that link Msh (MutS homologue) heterodimers to the DNA repair machinery, resulting in excision and repair of the mispaired base []. This entry represents the mismatch repair protein MutL.
The repeat has the consensus sequence GDV(K/Q/R)(T/S/G)X(R/K/T)WLFETXPLD. This repeat motif is typically found in the N terminus of the proteins, with a copy number between 2 and 28 repeats. Direct evidence for binding to and stabilising F-actin has been found in the human protein (
) [
]. The homologues in mouse and chicken localise in the adherens junction complex of the intercalated disc in cardiac muscle and in the myotendon junction of skeletal muscle. mXin may co-localise with vinculin, which is known to attach actin to the cytoplasmic membrane []. It has been shown that the amino-terminus of human xin (CMYA1) binds the EVH1 domain of Mena/VASP/EVL, and the carboxy-terminus binds domain 20 of filamin C, a domain unique within the filamin family []. This confirms the proposed role of xin repeat containing proteins as F-actin-binding adapter proteins.
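The consensus quoted for the Xin repeat can be read as a simple sequence pattern. The sketch below is only an illustration of that reading: it assumes that X stands for any residue, that the space in the printed consensus is a layout artifact, and it uses a synthetic toy sequence rather than a real protein. The function name `find_xin_repeats` is hypothetical. Real repeats are degenerate, so an exact-match scan like this would be expected to miss many genuine copies.

```python
import re

# Consensus from the text: GDV(K/Q/R)(T/S/G)X(R/K/T)WLFETXPLD
# Assumption: X is any residue, written here as [A-Z].
XIN_CONSENSUS = re.compile(r"GDV[KQR][TSG][A-Z][RKT]WLFET[A-Z]PLD")


def find_xin_repeats(sequence):
    """Return (start, matched_text) for each exact match to the consensus."""
    return [(m.start(), m.group()) for m in XIN_CONSENSUS.finditer(sequence.upper())]


# Synthetic toy sequence (not a real protein): two consensus copies
# separated by short alanine spacers.
toy = "AAA" + "GDVKTARWLFETQPLD" + "AAA" + "GDVRSLKWLFETGPLD" + "AA"
hits = find_xin_repeats(toy)
for start, text in hits:
    print(start, text)
```

A profile-based method would be a more realistic way to detect divergent repeats; the regular expression only encodes the strict consensus.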
This entry represents the SH2 domain of VAV2, which is a member of the VAV family. The SH2 domain of VAV2 has been shown to interact with PPxY motifs of TXNIP (thioredoxin-interacting protein) [
]. The VAV protein family members are multiple domain proteins, including Vav from flies and VAV1/2/3 from mammals. VAV1 predominates in hematopoietic cells, whereas VAV2 and VAV3 are more broadly expressed. They have a calponin homology (CH) domain, an acidic domain (AC), a Dbl homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich (CR) domain containing a zinc finger, and a complex region with SH2 and SH3 domains. Therefore they may participate in the activity of several pathways [
,
]. They are signal transducer proteins that couple tyrosine kinase signals with the activation of the Rho/Rac GTPases [,
,
].
The cytoplasmic Nck proteins are non-enzymatic adaptor proteins composed of three SH3 (Src homology 3) domains and a C-terminal SH2 domain [
]. They regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates []. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics []. They associate with tyrosine-phosphorylated growth factor receptors or their cellular substrates [,
]. There are two vertebrate Nck proteins, Nck1 and Nck2. Nck1 (also called Nck-alpha) plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling [
]. It binds and activates RasGAP, resulting in the downregulation of Ras []. It is also involved in the signaling of endothelin-mediated inhibition of cell migration []. This entry represents the SH2 domain of Nck1.
This is a family of cognate antitoxins to the CbtA toxins that act by inhibiting the polymerisation of cytoskeletal proteins (see
). These are classified as a type IV toxin-antitoxin system [
]. The family includes three proteins from E. coli, YagB, YeeU and YfjZ, which act not by forming a complex with CbtA but as antagonists of CbtA toxicity, by stabilising the CbtA target proteins. For example, YeeU binds directly to both MreB and FtsZ and enhances the bundling of their filaments in vitro. YeeU is also able to neutralise the toxicity caused by other MreB and FtsZ inhibitors, such as A22 [S-(3,4-dichlorobenzyl)isothiourea] for MreB, and SulA and DicB for FtsZ [
]. Thus CbeA, for cytoskeleton bundling-enhancing factor A, is proposed as a general name for all of these antitoxin proteins.
Cyclic AMP (cAMP) is a ubiquitous signalling molecule which mediates many cellular processes by activating cAMP-dependent kinases and also inducing protein-protein interactions. This molecule is produced by the adenylate cyclase (AC) enzyme, using ATP as its substrate. Mammalian adenylate cyclase has nine closely related membrane-bound isoforms (AC1-9) showing significant sequence homology and sharing the same overall structure: two hydrophobic transmembrane domains, and two cytoplasmic domains that are responsible for the catalytic activity. These isoforms differ in both their tissue specificity and their regulation. Regulatory factors known to influence one or more of these isoforms include G proteins, protein kinases, calcium and calmodulin [
,
]. This entry represents a region of unknown function found in many of these isoforms. It is part of the N-terminal cytoplasmic domain but its presence is not necessary for catalytic activity [
].
Many flagellar proteins are exported by a flagellum-specific export pathway. Attempts have been made to characterise the apparatus responsible for this process, by designing assays to screen for mutants with export defects [
]. Experiments involving filament removal from temperature-sensitive flagellar mutants of Salmonella typhimurium have shown that, while most mutants were able to regrow filaments, flhA, fliH, fliI and fliN mutants showed no or greatly reduced regrowth. This suggests that the corresponding gene products are involved in the process of flagellum-specific export. The sequences of fliH, fliI and the adjacent gene, fliJ, have been deduced. fliJ was shown to encode a protein of molecular mass 17,302 Da [
]. It is a membrane-associated protein that affects chemotactic events; mutations in fliJ result in failure to respond to chemotactic stimuli.
This subgroup is dominated by FliJ proteins found in Proteobacteria.
Copper homeostasis protein CutC was originally thought to be involved in copper tolerance in Escherichia coli, as mutation in the corresponding gene leads to an increased copper sensitivity [
]. However, this phenotype was later reported to depend on the levels of the mRNA-interfering complementary RNA regulator MicL, which is transcribed from a promoter located within the coding sequence of the cutC gene in the enterobacteria []. In the plant pathogen Xylella fastidiosa, this protein has been reported as specific for copper efflux []. The structure of this protein in the bacterium Shigella flexneri showed a monomer that adopts a common TIM β/α barrel with 8 β-strands surrounded by 8 α-helices []. The human homologue of this protein, whose structure showed a potential copper-binding site, has an important role in intracellular copper homeostasis [,
].
In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division []. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual 12-stranded β-barrel structure []. Through the barrel centre runs a 12 Å diameter tunnel, lined by 6 exposed helix pairs [
]. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation [,
].
The members of this entry are similar to a region close to the C terminus of the HipA protein expressed by various bacterial species (for example
). This protein is known to be involved in high-frequency persistence to the lethal effects of inhibition of either DNA or peptidoglycan synthesis [
]. When expressed alone, it is toxic to bacterial cells [], but it is usually tightly associated with HipB [], and the HipA-HipB complex may be involved in autoregulation of the hip operon. The hip proteins may be involved in cell division control and may interact with cell division genes or their products []. The domain is also found in the serine/threonine-protein kinase CtkA from Helicobacter pylori, which is important in induction of host inflammation by inducing release of cytokines such as TNF-alpha and IL-8 by phosphorylation of NF-kappa-B [
].
The nebulin-like motif or nebulin repeat is a tandemly repeated actin-binding module of about 35 amino acids. The repeat is named after the nebulin protein, which is a large protein specific for vertebrate skeletal muscle that may regulate the length of thin filaments. Nebulin contains about 185 copies of the repeat and those in the central part of nebulin are organised into seven-module super-repeats, which seems to reflect an interaction with tropomyosin/troponin [
]. Nebulin repeats occur in metazoan actin-binding proteins, most of which are specific for muscle tissues. Most nebulin repeat proteins contain a C-terminal SH3 domain and/or an N-terminal LIM zinc-binding domain. The repeats in nebulin and nebulette bind filamentous actin (F-actin) and may also associate with tropomyosin and troponin. The nebulin repeat has a predicted α-helical secondary structure and contains a central conserved SXXXY motif [,
,
].
This entry represents MICOS complex subunit MIC26, and its paralogue, MIC27, from animals. The MICOS complex is a central player determining mitochondrial cristae structure and formation of crista junctions [
]. Human MIC26 (also known as apolipoprotein O, APOO) plays a crucial role in crista junction formation and mitochondrial function [
] and can promote cardiac lipotoxicity by enhancing mitochondrial respiration and fatty acid metabolism in cardiac myoblasts []. It promotes cholesterol efflux from macrophage cells and can be detected in HDL, LDL and VLDL. It is secreted by a microsomal triglyceride transfer protein (MTTP)-dependent mechanism, probably as a VLDL-associated protein that is subsequently transferred to HDL []. Human MIC27 (also known as apolipoprotein O-like, APOOL) is also a subunit of the MICOS complex. It interacts with MIC26 and is involved in the formation of crista junctions [].
The LicD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent [
]. These proteins are part of the nucleotidyltransferase superfamily []. Ribitol-5-phosphate transferase FKTN (also known as Fukutin), which is a member of the LicD family, is a mammalian protein which may be involved in the modification of glycan moieties of alpha-dystroglycan; defects in Fukutin are associated with congenital muscular dystrophy [,
]. Ribitol 5-phosphate transferase FKRP (also known as Fukutin-related protein), responsible for the second step in the formation of the ribitol 5-phosphate tandem repeat after FKTN activity [,
], has N-terminal stem and C-terminal catalytic domains, and adopts a tetramer assembly [].
Xeroderma pigmentosum (XP) [
] is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. Skin cells of individuals with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-A is the most severe form of the disease and is due to defects in a 30 kDa nuclear protein called XPA (or XPAC) [
]. The sequence of the XPA protein is conserved from higher eukaryotes [
] to yeast (gene RAD14) [
]. XPA is a hydrophilic protein of 247 to 296 amino-acid residues which has a C4-type zinc finger motif in its central section.
This entry contains the zinc-finger containing region in the XPA protein. It is found N-terminal to (
)
This family represents mitochondrial respiratory chain complex III (or cytochrome b-c1 complex) assembly factors, including cytochrome B pre-mRNA-processing protein 6 (Cbp6) from yeast and its human orthologue Ubiquinol-Cytochrome c Reductase Complex Assembly Factor 2 (UQCC2) [
,
]. These proteins have diverged significantly among eukaryotes and UQCC2/Cbp6 have an overall low amino acid conservation []. UQCC2 interacts with UQCC1, the human orthologue of the Cbp6 binding partner Cbp3, to form a complex necessary for cytochrome b biogenesis []. The Cbp3-Cbp6 complex interacts with mitochondrial ribosomes for efficient translation of the cytochrome b transcript and it also interacts with newly synthesized cytochrome b to assist its correct assembly [
]. These proteins contain an LYR-like N-terminal sequence and are related to the LYR superfamily of proteins, which function as subunits or assembly factors for respiratory complexes I, II, III, and V [].
This entry includes the Nitrobindin family members mostly from bacteria and plants. Mycobacterium tuberculosis Nb (Mt-Nb(III)) is a heme-binding peroxynitrite isomerase that converts peroxynitrite to nitrate, thereby scavenging peroxynitrite and protecting free L-tyrosine against peroxynitrite-mediated nitration [
]. This entry also includes ferric nitrobindin-like protein, which lacks the conserved His residue that binds heme iron. Nitrobindins (Nbs) are evolutionarily conserved all-β-barrel heme-proteins displaying a highly solvent-exposed heme-Fe(III) atom. Mycobacterium tuberculosis Nb (Mt-Nb(III)) and the C terminus of Homo sapiens Nb (Hs-Nb(III)) share this β-barrel structure, suggesting that Nb may act as a sensor possibly modulating the THAP4 transcriptional activity residing in the N-terminal region [
]. Ferric nitrobindin-like proteins that lack the conserved His residue which binds heme iron are also included in the Nb family.
The zinc finger protein (ZPR1) is a eukaryotic protein that comprises tandem ZPR1 domains and which, in response to growth stimuli, binds to eukaryotic translation elongation factor 1A (eEF1A), assembles into multiprotein complexes with the survival motor neurons (SMN) protein, and accumulates in subnuclear structures, such as gems and Cajal bodies. ZPR1 has a conserved tandem architecture consisting of a duplicated module, the ZPR1 domain, comprised of two apparently modular domains: an elongation initiation factor 2-like zinc finger (Znf) and a double-stranded beta helix with a helical hairpin insertion (A/B domain). In consequence, the N- and C-terminal ZPR1 domains are referred to as the Znf1-A domain and Znf2-B domain modules, respectively. The Znf2-B domain module is required for viability, whilst the Znf1-A domain module is required for normal cell growth and proliferation [
]. This superfamily represents the A/B domain found in ZPR1.
The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate [
]. They have a central RING-type zinc finger domain and contain a C-terminal domain that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase []. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage [,
]. This entry represents the C-terminal domain of the Deltex proteins. It contains a fold composed of a central β-sheet lined with two long parallel α-helices [
].
Flavocytochrome c sulphide dehydrogenase, flavin-binding
Type:
Domain
Description:
This entry represents the flavin-binding domain of flavocytochrome c sulphide dehydrogenase (FCSD), enzymes found in sulphur-oxidising bacteria such as the purple phototrophic bacterium Chromatium vinosum [
,
]. These enzymes are complexes of a flavoprotein and a dihaem cytochrome that carry out hydrogen sulphide-dependent cytochrome c reduction. The dihaem cytochrome folds into two domains, each of which resembles mitochondrial cytochrome c, with the two haem groups bound to the interior of the subunit. The flavoprotein subunit has a glutathione reductase-like fold consisting of a beta(3,4)-alpha(3) core, and an alpha+beta sandwich. The active site of the flavoprotein subunit contains a catalytically important disulphide bridge located above the pyrimidine portion of the flavin ring []. Electrons are transferred from the flavin to one of the haem groups in the cytochrome. This entry represents a flavoprotein domain required for binding to flavin, and subsequent electron transfer.
The calx-beta motif is present as a tandem repeat in the cytoplasmic domains of Calx Na-Ca exchangers, which are used to expel calcium from cells. This motif overlaps domains used for calcium binding and regulation. The calx-beta motif is also present in the cytoplasmic tail of mammalian integrin-beta4, which mediates the bi-directional transfer of signals across the plasma membrane, as well as in some cyanobacterial proteins. This motif is also found in the Fras1/Frem family of extracellular proteins (extracellular matrix organizing protein FRAS1 and FRAS1-related extracellular matrix proteins FREM1, 2 and 3), which are required for proper organogenesis during embryonic development and whose mutations lead to Fraser syndrome, a rare congenital disorder characterised by multisystem malformation usually comprising abnormal brain formation, cryptophthalmos, syndactyly and renal defects [
]. This motif contains a series of β-strands and turns that form a self-contained β-sheet [,
].
This 14 amino acid motif has been identified within the C-terminal region of several paired-like homeodomain (HD) containing proteins [
,
]. It was named OAR domain after the initials of otp, aristaless, and rax []. Although it has been proposed that this domain could be important for transactivation and be involved in protein-protein interactions or DNA binding [,
], its function is not yet known. Some proteins known to contain an OAR domain include human RIEG, defects in which are the cause of Rieger syndrome []; human OG12X and Mus musculus (Mouse) Og12x, whose function is not yet known []; vertebrate Rax, which plays a role in the proliferation and/or differentiation of retinal cells []; Drosophila DRX, which appears to be important in brain development []; and human SHOX, encoded by the short stature homeobox-containing gene.
This entry represents a group of peptidase C39-like proteins. The cysteine peptidases in family C39 cleave the 'double-glycine' leader peptides from the precursors of various bacteriocins (mostly non-lantibiotic). The cleavage is mediated by the transporter as part of the secretion process. Bacteriocins are antibiotic proteins secreted by some species of bacteria that inhibit the growth of other bacterial species. The bacteriocin is synthesized as a precursor with an N-terminal leader peptide, and processing involves removal of the leader peptide by cleavage at a Gly-Gly bond, followed by translocation of the mature bacteriocin across the cytoplasmic membrane. Most endopeptidases of family C39 are N-terminal domains in larger proteins (ABC transporters) that serve both functions. The proposed protease active site is conserved in this sub-family of proteins with a single peptidase domain, which lack the nucleotide-binding transporter signature or have different domain architectures [,
].
This entry includes interphotoreceptor matrix proteoglycans 1 and 2. IMPG1, also known as SPACR (sialoprotein associated with cones and rods), is a secreted glycoprotein containing a central mucin-like domain and an EGF-like domain in the C-terminal region. It also contains two SEA domains (after sea urchin sperm protein, enterokinase, and agrin), a highly conserved structure composed of four anti-parallel beta sheets surrounded by alpha helices in an O-linked glycosylation-rich environment. IMPG1 has been linked to vitelliform macular dystrophies [
]. Interphotoreceptor matrix proteoglycan 2 (IMPG2, also known as SPACRCAN) is an interphotoreceptor matrix proteoglycan that binds to chondroitin sulfate and hyaluronan. It may be involved in organisation of the insoluble interphotoreceptor matrix [
,
]. Mutations in the IMPG2 gene cause retinitis pigmentosa 56 (RP56) [] and vitelliform macular dystrophy 5 (VMD5) [].
The TROVE (Telomerase, Ro and Vault) domain is a module of ~300-500 residues that is found in TEP1 and Ro60, the protein components of three ribonucleoprotein particles. It is also found in bacterial ribonucleoproteins, suggesting an ancient origin of these ribonucleoproteins. It can be found associated with other domains, such as the VWFA domain, the TEP1 N-terminal domain, the NACHT-NTPase domain, and WD-40 repeats. This domain may be involved in binding the RNA components of the three RNPs, which are telomerase RNA, Y RNA and vault RNA [
]. The TROVE domain contains a few absolutely conserved residues. As none of these conserved residues are the polar type of amino acids found in active sites, it seems unlikely that this region has an enzymatic function [
]. Structurally, the TROVE domain consists of two superhelical sections arranged like a horseshoe.
Flavocytochrome c sulphide dehydrogenase, flavin-binding domain superfamily
Type:
Homologous_superfamily
Description:
This entry represents the flavin-binding domain superfamily of flavocytochrome c sulphide dehydrogenase (FCSD), enzymes found in sulphur-oxidising bacteria such as the purple phototrophic bacterium Chromatium vinosum [
,
]. These enzymes are complexes of a flavoprotein and a dihaem cytochrome that carry out hydrogen sulphide-dependent cytochrome c reduction. The dihaem cytochrome folds into two domains, each of which resembles mitochondrial cytochrome c, with the two haem groups bound to the interior of the subunit. The flavoprotein subunit has a glutathione reductase-like fold consisting of a beta(3,4)-alpha(3) core, and an alpha+beta sandwich. The active site of the flavoprotein subunit contains a catalytically important disulphide bridge located above the pyrimidine portion of the flavin ring []. Electrons are transferred from the flavin to one of the haem groups in the cytochrome. This entry represents a flavoprotein domain required for binding to flavin, and subsequent electron transfer.
This entry represents the C-terminal SH3 domain found in the CRK family members [
]. CRK adaptor proteins consist of SH2 and SH3 domains, which bind tyrosine-phosphorylated peptides and proline-rich motifs, respectively. They function downstream of protein tyrosine kinases in many signaling pathways started by various extracellular signals, including growth and differentiation factors. Cellular CRK (c-CRK) contains a single SH2 domain, followed by N-terminal and C-terminal SH3 domains. It is involved in the regulation of many cellular processes including cell growth, motility, adhesion, and apoptosis. CRK has been implicated in the malignancy of various human cancers [
,
]. The C-terminal SH3 domain of CRK has not been shown to bind any target protein; it acts as a negative regulator of CRK function by stabilizing a structure that inhibits the access by target proteins to the N-terminal SH3 domain [
].
IRTKS (insulin receptor tyrosine kinase substrate) or BAIAP2L1 (brain-specific angiogenesis inhibitor 1-associated protein 2-like protein 1) is widely expressed, serves as a substrate for the insulin receptor, and binds the small GTPase Rac [
]. It plays a role in regulating the actin cytoskeleton and colocalizes with F-actin, cortactin, VASP, and vinculin. BAIAP2L1 expression leads to the formation of short actin bundles, distinct from filopodia-like protrusions induced by the expression of the related protein IRSp53 [
]. IRTKS mediates the recruitment of effector proteins Tir and EspFu, which regulate host cell actin reorganization, to bacterial attachment sites []. IRTKS contains an N-terminal IMD or Inverse-Bin/Amphiphysin/Rvs (I-BAR) domain, an SH3 domain, and a WASP homology 2 (WH2) actin-binding motif at the C terminus. The SH3 domain of IRTKS has been shown to bind the proline-rich C terminus of EspFu [
].
This entry represents P40 nucleoproteins from several Borna disease virus (BDV) strains. BDV is an RNA virus that is a member of the order Mononegavirales, which includes such members as Measles virus and Ebola virus sp. BDV causes an infection of the central nervous system in a wide range of vertebrates, which can progress to an often fatal immune-mediated disease known as Borna disease. Viral nucleoproteins are central to transcription, replication, and packaging of the RNA genome. P40 nucleoprotein from BDV is multi-helical in structure and can be divided into two subdomains, each of which has an α-bundle topology [
]. The nucleoprotein assembles into a planar homotetramer, with the RNA genome either wrapping around the outside of the tetramer or possibly fitting within the charged central channel of the tetramer []. This entry represents one of the two α-bundle subdomains.
This entry represents P40 nucleoproteins from several Borna disease virus (BDV) strains. BDV is an RNA virus that is a member of the order Mononegavirales, which includes such members as Measles virus and Ebola virus sp. BDV causes an infection of the central nervous system in a wide range of vertebrates, which can progress to an often fatal immune-mediated disease known as Borna disease. Viral nucleoproteins are central to transcription, replication, and packaging of the RNA genome. P40 nucleoprotein from BDV is multi-helical in structure and can be divided into two subdomains, each of which has an α-bundle topology [
]. The nucleoprotein assembles into a planar homotetramer, with the RNA genome either wrapping around the outside of the tetramer or possibly fitting within the charged central channel of the tetramer []. This entry represents one of the two α-bundle subdomains.
This domain, consisting of the distinct N-terminal PRY subdomain followed by the SPRY subdomain, is found at the C terminus of TRIM60, which is also known as RING finger protein 33 (RNF33) or 129 (RNF129). TRIM proteins are defined by the presence of the tripartite motif RING/B-box/coiled-coil region and are also known as RBCC proteins [
]. Based on its expression profile, RNF33 likely plays an important role in the spermatogenesis process, the development of the pre-implantation embryo, and in testicular functions. RNF33 is temporally transcribed in the unfertilized egg and the pre-implantation embryo, and is permanently silenced before the blastocyst stage [,
]. Experiments in mice have shown that RNF33 associates with the cytoplasmic motor proteins kinesin-2 family members 3A (KIF3A) and 3B (KIF3B), suggesting a possible contribution to cargo movement along microtubules at the sites where it is expressed [].
In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division []. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In budding yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual 12-stranded β-barrel structure []. Through the barrel centre runs a 12 Å diameter tunnel, lined by 6 exposed helix pairs [
]. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation [,
].
This entry represents the caudal homeobox family, which includes homeotic protein caudal from Drosophila melanogaster (Cad), homeobox protein CDX-1 from human and similar proteins found in animals and some fungal species. Cad is a transcription factor involved in processes such as anterior/posterior pattern formation, organ morphogenesis and innate immune homeostasis. Postembryonically, its function is mostly restricted to the intestine, where it regulates antimicrobial peptide (AMP) levels, preserving the normal gut flora [
,
,
]. CDX-1 is also involved in transcriptional regulation []. It may play a role in the terminal differentiation of the intestine. It binds preferentially to methylated DNA []. This entry also includes homeobox protein pal-1 from Caenorhabditis elegans, a transcriptional activator required for posterior V6 neuroectoblast cell fate specification during postembryonic neurogenesis (patterning), which generates the characteristic ray lineage during male tail development [
,
].
IRAK4 is a serine/threonine-protein kinase that plays a critical role in initiating the innate immune response against foreign pathogens []. It is required for the efficient recruitment of IRAK1 to the interleukin-1 (IL-1) receptor complex following IL-1 engagement, triggering intracellular signalling cascades leading to transcriptional up-regulation and mRNA stabilisation [,
]. IRAK4 phosphorylates IRAK1. Pellino-2 may be a substrate for both IRAK1 and IRAK4 []. This entry represents the Death domain of IRAK4. DDs (Death domains) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes [
,
].
This entry represents a group of Gid-type RING finger-containing proteins, including Rmd5 and Fyv10 from budding yeasts. Rmd5 and Fyv10 form the heterodimeric E3 ligase unit of the Gid (glucose induced degradation deficient) complex, which is involved in the proteasome-dependent degradation of fructose-1,6-bisphosphatase [
,
]. This entry also includes animal MAEA and RMND5A/B, which are core components of the CTLH E3 ubiquitin-protein ligase complex that selectively accepts ubiquitin from UBE2H and mediates ubiquitination and subsequent proteasomal degradation of the transcription factor HBP1. MAEA and RMND5A are both required for catalytic activity of the CTLH E3 ubiquitin-protein ligase complex [
]. The CTLH complex has been linked to several different functions, such as regulation of cell morphology, proteasome-dependent degradation of non-ubiquitinated alpha-catenin, and modulation of endosome/lysosome-dependent degradation of ubiquitinated proteins via interaction with HRS (hepatocyte growth factor-regulated tyrosine kinase substrate) [,
].
PWWP domain-containing DNA repair factor 3A (also known as EXPAND1 or MUM1) is a nucleosome-binding protein that contributes to the maintenance of chromatin state. Through its interaction with 53BP1, it serves as an accessory factor in the DNA damage response pathway to promote chromatin change in response to DNA damage [
]. MUM1 promotes cell survival following DNA damage, suggesting that it may facilitate DNA repair by regulating the organization of chromatin structure []. This entry represents the PWWP domain. The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding proteins, that function as transcription factors regulating a variety of developmental processes [
].
This entry represents the C-terminal domain found in eukaryotic proteins such as Exportin-5 from mammals, HASTY 1 from Arabidopsis and Msn5 from yeasts. Exportin-5 mediates the nuclear export of proteins bearing a double-stranded RNA binding domain (dsRBD) and double-stranded RNAs (cargos). It also mediates the nuclear export of micro-RNA precursors, which form short hairpins, and, in some circumstances, it can also mediate the nuclear export of deacylated and aminoacylated tRNAs [
,
,
,
]. The yeast homologue of Exportin-5, also known as Msn5 [
], is involved in nuclear import of replication protein A and export of various proteins, including transcription factors such as Pho4 []. Its Arabidopsis homologue, HASTY 1, is involved in the nuclear export of microRNAs (miRNAs) and plays a role in plant development through its role in miRNA processing [,
].