Search results 16001 to 16100 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Glucocorticoid receptor
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed "orphan" receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
The glucocorticoid receptor consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA binding domain that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone binding domain. The N-terminal domain is unique to the glucocorticoid receptors; it spans the first 440 residues, and is primarily responsible for transcriptional activation. The smaller (around 65 residues), highly-conserved central portion of the protein is the DNA binding domain, which plays a role in DNA binding specificity, homo-dimerisation and in interactions with other proteins. The hormone binding domain comprises approximately 250 residues at the C terminus of the receptor. This domain mediates receptor activity via interaction with heat shock proteins and cyclophilins, or with hormone.
Protein Domain
Name: Tropoelastin
Type: Family
Description: Tropoelastin is the precursor to the elastin molecule. Elastin aggregates are responsible for the stretch properties of skin, arterial walls and ligaments, and elastin is implicated in several hereditary diseases, including cutis laxa (where the elasticity of the skin is lost) and elastoderma (similar to cutis laxa but with grape-like accumulations of elastin in the dermis). The unusual and highly characteristic amino acid composition of this protein accounts for its great hydrophobicity. It contains one-third glycine amino acids and several lysine derivatives that serve as covalent cross-links between protein monomers. Elastin is thus a three-dimensional network with 60-70 amino acids between two cross-linking points. This molecular architecture determines its elastic properties, insolubility and resistance to proteolysis.
Normally, the elastin gene contains 36 exons, and this structure allows the formation of stable isoforms by alternative splicing. The 3-dimensional structure of elastin is currently unknown; elastin was originally thought to be an amorphous polymer. This is consistent with the theory of rubber elasticity, which requires the resting state of the protein to be of higher disorder (entropy) than the extended state. More recent studies show the presence of helical and other secondary structures, and the elasticity theory has been amended to involve, in the resting state, secondary structure elements in chaotic motion. In the extended state of the protein, the secondary structures align to form an ordered structure together with neighbouring molecules.
Tropoelastin consists mainly of repetitive elements of four, five, six and nine hydrophobic residues. The five, six and nine residue repeats function as binding sites for fibroblasts during chemotaxis (the hexapeptide and nonapeptide repeats competing for the same receptor). The hexapeptide repeat is also known to bind calcium ions. The formation of the elastin fibre is a complicated process, involving the binding of a chaperone to the precursor to prevent aggregation in the cell, followed by migration out of the cell, whereupon the chaperone dissociates. The tropoelastin molecules then cross-link to each other using deaminated lysine residues, the microfibril structures functioning as a scaffold.
Protein Domain
Name: Photosystem II PsbW, class 2
Type: Family
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.
PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection.
This family represents the low molecular weight transmembrane protein PsbW found in PSII, where it is a subunit of the oxygen-evolving complex. PsbW appears to have several roles, including guiding PSII biogenesis and assembly, stabilising dimeric PSII, and facilitating PSII repair after photo-inhibition. There appear to be two classes of PsbW, class 1 being found predominantly in algae and cyanobacteria, and class 2 being found predominantly in plants. This entry represents class 2 PsbW.
Protein Domain
Name: Mediator complex, subunit Med2, fungi
Type: Family
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins.
The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.
  • The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22.
  • The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4.
  • The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16.
  • The CDK8 module contains: MED12, MED13, CCNC and CDK8.
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.
This family of mediator complex subunit 2 proteins is conserved in fungi. Cyclin-dependent kinase CDK8 or Srb10 interacts with and phosphorylates Med2. Post-translational modifications of Mediator subunits are important for regulation of gene expression.
Protein Domain
Name: Interferon gamma receptor, D2 domain, poxvirus/mammal
Type: Domain
Description: Interferon (IFN)-gamma is a dimeric glycoprotein produced by activated T cells and natural killer cells. Although originally isolated based on its antiviral activity, IFN-gamma also displays powerful anti-proliferative and immunomodulatory activities, which are essential for developing appropriate cellular defences against a variety of infectious agents. The first step in eliciting these responses is the specific high affinity interaction of IFN-gamma with its cell-surface receptor (IFN-gammaRalpha); the complex then interacts with at least one of a family of additional species-specific accessory factors (AF-1 or IFN-gammabeta), which convey different cellular responses. One such response is the association and phosphorylation of two protein tyrosine kinases (Jak-1 and Jak-2), which in turn stimulate nuclear transcription activators.
This entry includes:
  • The human IFN-gamma receptor 1 (IFN-gammaR1), a member of the hematopoietic cytokine receptor superfamily. It is expressed in a membrane-bound form in many cell types, and is over-expressed in tumour cells. It comprises an extracellular portion of 229 residues, a single transmembrane region, and a cytoplasmic domain of 221 residues. As with other members of its superfamily, the cytokine-binding sites are formed by a small set of closely-spaced surface loops that extend from a β-sheet core, much like antigen-binding sites on antibodies. The extracellular IFN-gammaR monomer comprises two domains (D1 and D2), each resembling an Ig-like fold with fibronectin type III topology. The signalling complex comprises two IFN-gammaR1 chains and two IFN-gammaR2 chains, which dimerise in an IFN-gamma-driven fashion.
  • The vaccinia virus interferon (IFN)-gamma receptor (IFN-gammaR), a 43 kDa soluble glycoprotein that is secreted from infected cells early during infection. IFN-gammaR from vaccinia virus, cowpox virus and camelpox virus exists naturally as a homodimer, whereas the cellular IFN-gammaR dimerises only upon binding the homodimeric IFN-gamma. The existence of the virus protein as a dimer in the absence of ligand may provide an advantage to the virus in efficient binding and inhibition of IFN-gamma in solution.
This is the D2 domain, which is involved in forming receptor-receptor contacts.
Protein Domain
Name: ADAMTS/ADAMTS-like, cysteine-rich domain 3
Type: Domain
Description: This cysteine-rich domain (CRD) is found in a variety of ADAMTS and ADAMTS-like endopeptidases that are widespread in animals. It is a well-conserved cysteine-rich sequence containing 10 cysteine residues.
Proteolysis of the extracellular matrix plays a critical role in establishing tissue architecture during development and in tissue degradation in diseases such as cancer, arthritis, Alzheimer's disease and a variety of inflammatory conditions. The proteolytic enzymes responsible for this process are members of diverse protease families, including the secreted zinc metalloproteases (MPs). ADAM-TS (A Disintegrin and Metalloproteinase with Thrombospondin Motifs) is closely related to the ADAM family (A Disintegrin and Metalloproteinase). It is a subfamily of the MP family and consists of at least 20 members sharing a high degree of sequence similarity and conserved domain organisation. The defining domains of the ADAM-TS family are (from N- to C-termini) a pre-pro metalloprotease domain of the reprolysin type, a snake venom disintegrin-like domain, a thrombospondin type-I (TS) module, a cysteine-rich region, and a cysteine-free (spacer) domain. Domain organisation following the spacer domain C terminus shows some variability in certain ADAM-TS members, principally in the number of additional TS domains. These enzymes have a wide-spectrum role in vascular biology and cardiovascular pathophysiology.
Members of the ADAM-TS family have been implicated in a range of diseases. For instance, members of this family have been found to participate directly in processes in the central nervous system (CNS) such as the regulation of brain plasticity. ADAM-TS1 is reported to be involved in inflammation and cancer cachexia, whilst recessively inherited ADAM-TS2 mutations cause Ehlers-Danlos syndrome type VIIC, a disorder characterised clinically by severe skin fragility. ADAM-TS4 is an aggrecanase involved in arthritic destruction of cartilage. ADAM-TS-like proteins lack a metalloprotease domain. They reside in the ECM and have regulatory roles. Examples of ADAM-TS-like proteins are papilin and punctin.
Protein Domain
Name: Transcription factor, T-box, conserved site
Type: Conserved_site
Description: Transcription factors of the T-box family are required for early cell-fate decisions, such as those necessary for formation of the basic vertebrate body plan, and for differentiation and organogenesis. They have also been associated with multiple aspects of development and with adult terminal cell-type differentiation in different animal lineages. The T-box is defined as the minimal region within the T-box protein that is both necessary and sufficient for sequence-specific DNA binding. All members of the family so far examined bind to the DNA consensus sequence TCACACCT and function as transcriptional repressors and/or activators. The T-box is a relatively large DNA-binding domain, generally comprising about a third of the entire protein (17-26 kDa).
These genes were uncovered on the basis of similarity to the DNA binding domain of the Mus musculus (Mouse) Brachyury (T) gene product; this similarity is the defining feature of the family. The Brachyury gene is named for its phenotype, which was identified 70 years ago as a mutant mouse strain with a short blunted tail. The gene, and its paralogues, have become a well-studied model for the family, and hence much of what is known about the T-box family is derived from the murine Brachyury gene.
Consistent with its nuclear location, Brachyury protein has a sequence-specific DNA-binding activity and can act as a transcriptional regulator. Homozygous mutants for the gene undergo extensive developmental anomalies, thus rendering the mutation lethal. The postulated role of Brachyury is as a transcription factor, regulating the specification and differentiation of posterior mesoderm during gastrulation in a dose-dependent manner.
T-box proteins tend to be expressed in specific organs or cell types, especially during development, and they are generally required for the development of those tissues. For example, Brachyury is expressed in posterior mesoderm and in the developing notochord, and it is required for the formation of these cells in mice. The T-box family is an ancient group that appears to play a critical role in development in all animal species.
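As a rough illustration of the consensus sequence mentioned in the T-box description, the Python sketch below scans a DNA string for exact copies of TCACACCT on both strands. The function names and the exact-match criterion are assumptions made for this example and are not part of the database entry; genuine binding sites may deviate from the consensus, so this should be read as a minimal sketch rather than a binding-site predictor.

```python
# Consensus sequence quoted in the T-box description.
CONSENSUS = "TCACACCT"

# Translation table used to complement a DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_tbox_sites(dna: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact consensus match.

    Hypothetical helper: exact matching is an assumption of this sketch.
    A "-" hit means the consensus lies on the reverse strand; the start
    is still reported in forward-strand coordinates.
    """
    dna = dna.upper()
    hits = []
    for strand, motif in (("+", CONSENSUS), ("-", reverse_complement(CONSENSUS))):
        start = dna.find(motif)
        while start != -1:
            hits.append((start, strand))
            # Step by one so overlapping occurrences are not missed.
            start = dna.find(motif, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Made-up test sequence with one site on each strand.
    print(find_tbox_sites("GGTCACACCTAAAGGTGTGATT"))
```

Reporting reverse-strand hits in forward coordinates keeps the output directly comparable with positions in the input string.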
Protein Domain
Name: Autoimmune regulator, AIRE
Type: Family
Description: AIRE (AutoImmune REgulator) is a transcription factor that plays an essential role in promoting self-tolerance in the thymus. It regulates the expression of a wide array of self-antigens whose expression in the periphery is tissue-restricted, called tissue restricted antigens (TRA). Mutations cause a rare autosomal recessively inherited disease termed APECED. APECED, also called Autoimmune Polyglandular Syndrome type I (APS 1), is the only described autoimmune disease with established monogenic background, being localised outside the major histocompatibility complex region. It is characterised by the presence of two of the three major clinical entities: chronic mucocutaneous candidiasis, hypoparathyroidism and Addison's disease. Other immunologically mediated phenotypes, including insulin-dependent diabetes mellitus (IDDM), gonadal failure, chronic gastritis, vitiligo, autoimmune thyroid disease, enamel hypoplasia, and alopecia may also be present. Immunologically, APECED patients have deficient T cell responses towards Candida antigens, and clinical symptoms both within and outside the endocrine system, mainly as a result of autoimmunity against organ-specific autoantigens.
AIRE has a HSR/CARD domain involved in promoting AIRE multimerisation, a SAND domain that appears to be involved in promoting a protein-protein interaction with a transcriptional repressive complex, and two zinc fingers of the plant homeodomain (PHD) type, PHD1 and PHD2, of which PHD1 functions as a histone code reader.
AIRE has a dual subcellular location. It is not only expressed in multiple immunologically relevant tissues, such as the thymus, spleen, lymph nodes and bone marrow, but it has also been detected in various other tissues, such as kidney, testis, adrenal glands, liver and ovary, suggesting that APECED proteins might also have a function outside the immune system. However, AIRE is not expressed in the target organs of autoimmune destruction. At the subcellular level, AIRE can be found in the cell nucleus in a speckled pattern in domains resembling promyelocytic leukaemia nuclear bodies, also known as ND10, nuclear dots or potential oncogenic domains, which are associated with the AIRE homologous nuclear proteins Sp100, Sp140, and Lysp100.
Protein Domain
Name: G2 nidogen/fibulin G2F
Type: Domain
Description: Basement membranes are sheet-like extracellular matrices found at the basal surfaces of epithelia and condensed mesenchyma. By preventing cell mixing and providing a cell-adhesive substrate, they play crucial roles in tissue development and function. Basement membranes are composed of an evolutionarily ancient set of large glycoproteins, which includes members of the laminin family, collagen IV, perlecan and nidogen/entactin. Nidogen/entactin is an important basement membrane component that promotes cell attachment, neutrophil chemotaxis, trophoblast outgrowth and angiogenesis. It interacts with many other basement membrane proteins, such as collagen, perlecan and laminin, and has a potential role in the assembly and connection of networks. It consists of three globular regions, G1-G3. G1 and G2 are connected by a thread-like structure, whereas that between G2 and G3 is rod-like.
The nidogen G2 region binds to collagen IV and perlecan. The nidogen G2 structure is composed of two domains, an N-terminal EGF-like domain and a much larger β-barrel domain of ~230 residues. The nidogen G2 β-barrel consists of an 11-stranded β-barrel of complex topology, the interior of which is traversed by the hydrophobic, predominantly α-helical segment connecting strands C and D. The N-terminal half of the barrel comprises two β-meanders (strands A-C and D-F) linked by the buried α-helical segment. The polypeptide chain then crosses the bottom of the barrel and forms a five-stranded Greek key motif in the C-terminal half of the domain. Helix α3 caps the top of the barrel and forms the interface to the EGF-like domain. The nidogen G2 β-barrel domain has unexpected structural similarity to green fluorescent proteins of Cnidaria, suggesting that they derive from a common ancestor. A large patch on the barrel surface is strikingly conserved in all metazoan nidogens. Site-directed mutagenesis demonstrates that the residues in the conserved patch are involved in the binding of perlecan, and possibly also of collagen IV.
A similar domain is also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions.
Protein Domain
Name: Voltage-dependent calcium channel, gamma subunit
Type: Family
Description: Ca2+ ions are unique in that they not only carry charge but they are also the most widely used of diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism, are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit, which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins. The activity of this pore is modulated by four tightly-coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein including gating voltage-dependence, G protein modulation and kinase susceptibility can be influenced by these subunits.
Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blocks, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium channel classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels.
The voltage-dependent calcium channel gamma (VDCCG) subunit family consists of at least 8 members, which share a number of common structural features. Each member is predicted to possess 4 transmembrane domains, with intracellular N- and C-termini. The first extracellular loop contains a highly conserved N-glycosylation site and a pair of conserved cysteine residues. The C-terminal 7 residues of VDCCG-2, -3, -4 and -8 are also conserved and contain a consensus site for phosphorylation by cAMP and cGMP-dependent protein kinases, and a target site for binding by PDZ domain proteins.
Protein Domain
Name: Septin
Type: Family
Description: This entry represents various septin proteins. These proteins were initially described in yeast, where a cross wall (septum) is produced during cytokinesis and then splits in certain organisms to allow the daughter cells to separate. However, the septin family is now recognised to extend to mammals and is associated with a variety of events. Septins are GTPases that form filaments used during cytokinesis in fungi and animals. Septins at the bud site serve as a structural scaffold that recruits different components involved in diverse processes at specific stages during the cell cycle. The septin assembly is regulated by protein kinases Gin4 and/or Cla4, and may act by recruiting Myo1 and Hof1 (involved in septation) to the site of cleavage. In addition to their original role in cell division, septins are also involved in cell morphogenesis, bud site selection, chitin deposition, cell cycle regulation, cell compartmentalisation, membrane trafficking, spore wall formation and organisation of the cytoskeleton. The localisation of a septin reflects its function:
  • Septins localising to projections help shape and compartmentalise emerging growth.
  • Septins localising to partitions help compartmentalise pre-existing cellular material.
  • Septins localising to the whole cell are involved in membrane trafficking and organising the cytoskeleton (usually in animals).
In yeast, septins encoded by Cdc3, Cdc10, Cdc11, Cdc12 and probably Shs1 form a septin complex localised at the cytoplasmic face of the plasma membrane in the mother-bud neck, where it rearranges to a cortical collar of highly ordered filaments. This complex can form long filaments through end-to-end polymerisation of Cdc3-Cdc12-Cdc11 complexes with Cdc10 serving as a bridge to bundle the polymers into paired filaments. In humans, 12 septin genes generate dozens of polypeptides, many of which comprise heterooligomeric complexes. Since septin mutants are commonly defective in cytokinesis and formation of the neck filaments/septin rings, septins have been considered to be the primary constituents of the neck filaments.
In Drosophila, protein peanut is involved in cytokinesis, and is an enhancer of the sina gene, which has a role in photoreceptor development.
Protein Domain
Name: Mediator of RNA polymerase II, subunit Med31 superfamily
Type: Homologous_superfamily
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits. They form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation.
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.
This entry includes subunit Med31 of the Mediator complex and the Saccharomyces cerevisiae homologue, Soh1. Soh1 is responsible for the repression of temperature sensitive growth of the Hpr1 mutant and has been found to be a component of the RNA polymerase II transcription complex. Soh1 interacts with factors involved not only in DNA repair but also in transcription. Thus, the Soh1 protein may serve to couple these two processes.
Med31 is organised as a four-helix bundle and, together with the N-terminal part of subunit Med7, forms a submodule of the middle module of the Mediator core which is unique in structure and function. In vivo, Med7N/31 has a predominantly positive function on the expression of a specific subset of genes, including genes involved in methionine metabolism and iron transport.
Protein Domain
Name: Carbohydrate binding module family 17/28
Type: Domain
Description: A carbohydrate-binding module (CBM) is defined as a contiguous amino acid sequence within a carbohydrate-active enzyme with a discrete fold having carbohydrate-binding activity. A few exceptions are CBMs in cellulosomal scaffolding proteins and rare instances of independent putative CBMs. The requirement of CBMs existing as modules within larger enzymes sets this class of carbohydrate-binding protein apart from other non-catalytic sugar binding proteins such as lectins and sugar transport proteins.
CBMs were previously classified as cellulose-binding domains (CBDs) based on the initial discovery of several modules that bound cellulose. However, additional modules in carbohydrate-active enzymes are continually being found that bind carbohydrates other than cellulose yet otherwise meet the CBM criteria, hence the need to reclassify these polypeptides using more inclusive terminology.
Previous classification of cellulose-binding domains was based on amino acid similarity. Groupings of CBDs were called "Types" and numbered with roman numerals (e.g. Type I or Type II CBDs). In keeping with the glycoside hydrolase classification, these groupings are now called families and numbered with Arabic numerals. Families 1 to 13 are the same as Types I to XIII.
CBM17 binds to amorphous cellulose and soluble beta-1,4-glucans, with a minimal binding requirement of cellotriose and optimal affinity for cellohexaose. Family 17 CBMs appear to have a very shallow binding cleft that may be more accessible to cellulose chains in non-crystalline cellulose than the deeper binding clefts of family 4 CBMs. CBM28 does not compete with CBM17 modules when binding to non-crystalline cellulose but does have a "β-jelly roll" topology, which is similar in structure to the CBM17 domains. Sequence and structural conservation in families 17 and 28 suggests that they have evolved through gene duplication and subsequent divergence.
This entry includes families 17 and 28, which show structural homology. The domain is found in a number of alkaline cellulases.
Protein Domain
Name: Peptidase A8, signal peptidase II
Type: Family
Description: This group of aspartic endopeptidases belong to the MEROPSpeptidase family A8 (signal peptidase II family). The type example is the Escherichia coli lipoprotein signal peptidase or SPase II ( , MEROPS identifier A08.001), which removes the signal peptide from the N terminus of the murein prolipoprotein, an essential step in production of the bacterial cell wall. This enzyme recognises a conserved sequence known as the "lipobox sequence"(Leu-Xaa-Yaa+Cys, in which Xaa is Ala or Ser and Yaa is Gly or Ala) and cleaves on the amino side of the cysteine residue to which a glyceride-fatty acid lipid is attached. SPase II is an integral membrane protein with four transmembrane regions, with the active site on the periplasmic side and close to the membrane surface. The active site aspartic acid residues have been identified by site-directed mutagenesis and occur in the motifs GNXXDRX and FNXAD, where X is a hydrophobic residue [ ]. The enzyme is inhibited by the cyclic pentapeptide antibiotic globomycin [] and also by pepstatin []. Although no tertiary structure has been solved, proteins in this family are unlikely to have similar folds to any other aspartic peptidase, and family A8 is assigned to is own clan, AC.Homologues are found only in bacteria. Most bacteria have one homologue, but a few bacteria, including Pseudomonas fluorescensand Staphylococcus epidermidis, have two family members. Predicted homologues in eukaryotes are probably derived from contaminants. Aspartic peptidases, also known as aspartyl proteases ([intenz:3.4.23.-]), are widely distributed proteolytic enzymes [, , ] known to exist in vertebrates, fungi, plants, protozoa, bacteria, archaea, retroviruses and some plant viruses. All known aspartic peptidases are endopeptidases. A water molecule, activated by two aspartic acid residues, acts as the nucleophile in catalysis. 
Aspartic peptidases can be grouped into five clans, each of which shows a unique structural fold. Peptidases in clan AA are either bilobed (family A1 or the pepsin family) or are a homodimer (all other families in the clan, including retropepsin from HIV-1/AIDS). Each lobe consists of a single domain with a closed β-barrel and each lobe contributes one Asp to form the active site. Most peptidases in the clan are inhibited by the naturally occurring small-molecule inhibitor pepstatin. Clan AC contains the single family A8: the signal peptidase 2 family. Members of the family are found in all bacteria. Signal peptidase 2 processes the premurein precursor, removing the signal peptide. The peptidase has four transmembrane domains and the active site is on the periplasmic side of the cell membrane. Cleavage occurs on the amino side of a cysteine where the thiol group has been substituted by a diacylglyceryl group. Site-directed mutagenesis has identified two essential aspartic acid residues which occur in the motifs GNXXDRX and FNXAD (where X is a hydrophobic residue). No tertiary structures have been solved for any member of the family, but because of the intramembrane location, the structure is assumed not to be pepsin-like. Clan AD contains two families of transmembrane endopeptidases: A22 and A24. These are also known as "GXGD peptidases" because of a common GXGD motif which includes one of the pair of catalytic aspartic acid residues. Structures are known for members of both families and show a unique, common fold with up to nine transmembrane regions. The active site aspartic acids are located within a large cavity in the membrane into which water can gain access. Clan AE contains two families, A25 and A31. Tertiary structures have been solved for members of both families and show a common fold consisting of an α-β-α sandwich, in which the β-sheet is five-stranded. Clan AF contains the single family A26. 
Members of the clan are membrane proteins with a unique fold. Homologues are known only from bacteria. The structure of omptin (also known as OmpT) shows a cylindrical barrel containing ten β-strands inserted in the membrane with the active site residues on the outer surface. There are two families of aspartic peptidases for which neither structure nor active site residues are known and these are not assigned to clans. Family A5 includes thermopsin, an endopeptidase found only in thermophilic archaea. Family A36 contains sporulation factor SpoIIGA, which is known to process and activate sigma factor E, one of the transcription factors that controls sporulation in bacteria.
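The lipobox rule quoted in this entry (Leu-Xaa-Yaa+Cys, with Xaa = Ala or Ser and Yaa = Gly or Ala, cleavage on the amino side of the Cys) can be written as a simple pattern. The following is a minimal sketch only: the function name and the toy sequence are hypothetical, and real lipoprotein predictors apply additional constraints (such as the position of the lipobox within the signal peptide) that are not modelled here.

```python
import re

# Lipobox as stated in the entry: Leu-Xaa-Yaa+Cys,
# where Xaa is Ala or Ser and Yaa is Gly or Ala.
LIPOBOX = re.compile(r"L[AS][GA]C")

def find_lipobox_cleavage(seq):
    """Return the 0-based index of the lipobox Cys, or None if no lipobox.

    Cleavage is on the amino side of the Cys, so seq[:i] is the
    removed signal peptide and seq[i:] begins with the Cys.
    """
    match = LIPOBOX.search(seq.upper())
    return None if match is None else match.end() - 1

# Hypothetical toy sequence, not taken from any database entry.
toy = "MKKTLLAGCSSN"
i = find_lipobox_cleavage(toy)
print(toy[:i], toy[i:])
```

Only the first match is reported, which is a simplification chosen for this sketch.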
Protein Domain
Name: 2-aminoethylphosphonate ABC transport system, ATP-binding component PhnT
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. The enzyme phosphonatase catalyses the degradation of 2-aminoethylphosphonate (AEP) in bacteria. 
This allows them to metabolise a range of organophosphonate compounds, including 2-aminoethylphosphonate, as a sole source of carbon, energy and phosphorus for growth. The C-P bond in phosphonoacetaldehyde (Pald) is hydrolysed and a bi-covalent Lys53 ethylenamine/Asp12 aspartylphosphate intermediate is formed. This step can also be catalysed by C-P lyase, with some bacteria having the genes for both pathways and some only for one of them. The 2-aminoethylphosphonate ABC transport system functions in the transport of 2-aminoethylphosphonate across the membrane for utilisation in the bacterial cell. This ATP-binding component of an ABC transport system is found in Salmonella and Burkholderia lineages in the vicinity of enzymes for the breakdown of 2-aminoethylphosphonate.
Protein Domain
Name: 2-aminoethylphosphonate ABC transport system, membrane component PhnV
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. The enzyme phosphonatase catalyses the degradation of 2-aminoethylphosphonate (AEP) in bacteria. 
This allows them to metabolise a range of organophosphonate compounds, including 2-aminoethylphosphonate, as a sole source of carbon, energy and phosphorus for growth. The C-P bond in phosphonoacetaldehyde (Pald) is hydrolysed and a bi-covalent Lys53 ethylenamine/Asp12 aspartylphosphate intermediate is formed. This step can also be catalysed by C-P lyase, with some bacteria having the genes for both pathways and some only for one of them. The 2-aminoethylphosphonate ABC transport system functions in the transport of 2-aminoethylphosphonate across the membrane for utilisation in the bacterial cell. This membrane component of an ABC transport system is found in Salmonella and Burkholderia lineages in the vicinity of enzymes for the breakdown of 2-aminoethylphosphonate.
Protein Domain
Name: Peptidase A22A, presenilin, nematode type
Type: Family
Description: Aspartic peptidases, also known as aspartyl proteases (EC 3.4.23.-), are widely distributed proteolytic enzymes known to exist in vertebrates, fungi, plants, protozoa, bacteria, archaea, retroviruses and some plant viruses. All known aspartic peptidases are endopeptidases. A water molecule, activated by two aspartic acid residues, acts as the nucleophile in catalysis. Aspartic peptidases can be grouped into five clans, each of which shows a unique structural fold. Peptidases in clan AA are either bilobed (family A1 or the pepsin family) or are a homodimer (all other families in the clan, including retropepsin from HIV-1/AIDS). Each lobe consists of a single domain with a closed β-barrel and each lobe contributes one Asp to form the active site. Most peptidases in the clan are inhibited by the naturally occurring small-molecule inhibitor pepstatin. Clan AC contains the single family A8: the signal peptidase 2 family. Members of the family are found in all bacteria. Signal peptidase 2 processes the premurein precursor, removing the signal peptide. The peptidase has four transmembrane domains and the active site is on the periplasmic side of the cell membrane. Cleavage occurs on the amino side of a cysteine where the thiol group has been substituted by a diacylglyceryl group. Site-directed mutagenesis has identified two essential aspartic acid residues which occur in the motifs GNXXDRX and FNXAD (where X is a hydrophobic residue). No tertiary structures have been solved for any member of the family, but because of the intramembrane location, the structure is assumed not to be pepsin-like. Clan AD contains two families of transmembrane endopeptidases: A22 and A24. These are also known as "GXGD peptidases" because of a common GXGD motif which includes one of the pair of catalytic aspartic acid residues. Structures are known for members of both families and show a unique, common fold with up to nine transmembrane regions. 
The active site aspartic acids are located within a large cavity in the membrane into which water can gain access. Clan AE contains two families, A25 and A31. Tertiary structures have been solved for members of both families and show a common fold consisting of an α-β-α sandwich, in which the β-sheet is five-stranded. Clan AF contains the single family A26. Members of the clan are membrane proteins with a unique fold. Homologues are known only from bacteria. The structure of omptin (also known as OmpT) shows a cylindrical barrel containing ten β-strands inserted in the membrane with the active site residues on the outer surface. There are two families of aspartic peptidases for which neither structure nor active site residues are known and these are not assigned to clans. Family A5 includes thermopsin, an endopeptidase found only in thermophilic archaea. Family A36 contains sporulation factor SpoIIGA, which is known to process and activate sigma factor E, one of the transcription factors that controls sporulation in bacteria. This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22A, the type example being presenilin 1 from Homo sapiens (Human). Presenilins are polytopic transmembrane (TM) proteins, mutations in which are associated with the occurrence of early-onset familial Alzheimer's disease, a rare form of the disease that results from a single-gene mutation. They are also thought to be involved in control of Notch signalling, apoptotic signal transduction, or processing of selected proteins, such as the beta-amyloid precursor protein (beta-APP). There are a number of subtypes which belong to this presenilin family. 
Presenilin homologues have been identified in species that do not have an Alzheimer's disease correlate, including mouse, Drosophila melanogaster, Caenorhabditis elegans and other eukaryotes such as plants, which suggests that they may have functions unrelated to the disease. In worms, presenilins are involved in nervous system development and ovipositioning. Sel-12, a worm homologue of the mammalian presenilins (with which it shares ~50% amino acid identity), has been shown to facilitate the function of the Notch receptor (LIN-12 protein), which plays a role in cell-cell signalling during cell differentiation in development. Intriguingly, presenilin 1 (one of the two presenilin genes present in human) is able to restore function in a C. elegans mutant lacking sel-12, suggesting presenilin may also be involved in cell-cell signalling in higher species.
Protein Domain
Name: Potassium channel, voltage dependent, Kv3, inactivation domain
Type: Domain
Description: Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. 
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity. 
When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions. A voltage-dependent potassium channel gene designated Shaw was initially isolated from Drosophila melanogaster (Fruit fly). Subsequently, several vertebrate potassium channels with similar amino acid sequences were found and, together with the D. melanogaster channel, now constitute the Kv3 family. These channels are thought to play a role in shortening of action potential durations and modulating pre-synaptic neurotransmitter release. In mammals, the family consists of 4 genes (Kv3.1, Kv3.2, Kv3.3 and Kv3.4). Each gene product has its own subcellular location and function. Fast inactivation of voltage-dependent potassium channels controls membrane excitability and signal propagation in central neurons. This occurs by a 'ball-and-chain'-type mechanism where an N-terminal protein inactivation domain occludes the pore from the cytoplasmic side. In Kv3 channels this process is regulated by protein phosphorylation, where phosphorylation of serine residues leads to a reduction or removal of the fast inactivation.
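The K+ selectivity sequence given in this entry, T/SxxTxGxG, reads naturally as a pattern in which x is any residue. As a rough sketch, assuming one-letter amino acid codes and using hypothetical names and toy sequences, counting matches gives a crude guess at whether a subunit carries one or two P-domains. A pattern hit is only suggestive and is no substitute for a curated domain model.

```python
import re

# K+ selectivity sequence from the entry: T/SxxTxGxG (x = any residue).
# The lookahead lets overlapping candidate sites be reported as well.
K_SELECTIVITY = re.compile(r"(?=([TS]..T.G.G))")

def find_p_domain_motifs(seq):
    """Return (start_index, matched_text) for every selectivity-sequence hit."""
    return [(m.start(), m.group(1)) for m in K_SELECTIVITY.finditer(seq.upper())]

# Hypothetical toy sequences, not taken from any database entry.
one_pore = "AAATMTTVGYGDAA"
two_pore = "TMTTVGYGAAATITTIGFGAA"

print(find_p_domain_motifs(one_pore))
print(len(find_p_domain_motifs(two_pore)))
```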
Protein Domain
Name: NifC-like ABC-type porter
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium binding site (Walker B). 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This entry represents a clade of ABC porter genes with relatively weak homology compared to its neighbour clades, the molybdate and sulphate porters. 
Neighbour-joining, PAM-distance phylogenetic trees support the separation of these clades in this way. Included in this group are the NifC genes of Clostridium pasteurianum and Pasteurella multocida, which are involved in the biosynthesis and/or control of nitrogenase. It would be reasonable to presume that NifC acts as a molybdate porter since the most common form of nitrogenase is a molybdoenzyme. Several other sequences falling within this group are annotated as molybdate porters and one, from Halobacterium, is annotated as a sulphate porter. There is presently no experimental evidence to support annotations with this degree of specificity.
Protein Domain
Name: Potassium channel, voltage dependent, Kv2.1
Type: Family
Description: Potassium channels are the most diverse group of the ion channel family. They are important in shaping the action potential, and in neuronal excitability and plasticity. The potassium channel family is composed of several functionally distinct isoforms, which can be broadly separated into 2 groups: the practically non-inactivating 'delayed' group and the rapidly inactivating 'transient' group. These are all highly similar proteins, with only small amino acid changes causing the diversity of the voltage-dependent gating mechanism, channel conductance and toxin binding properties. Each type of K+ channel is activated by different signals and conditions depending on its type of regulation: some open in response to depolarisation of the plasma membrane; others in response to hyperpolarisation or an increase in intracellular calcium concentration; some can be regulated by binding of a transmitter, together with intracellular kinases; while others are regulated by GTP-binding proteins or other second messengers. In eukaryotic cells, K+ channels are involved in neural signalling and generation of the cardiac rhythm, act as effectors in signal transduction pathways involving G protein-coupled receptors (GPCRs) and may have a role in target cell lysis by cytotoxic T-lymphocytes. In prokaryotic cells, they play a role in the maintenance of ionic homeostasis. All K+ channels discovered so far possess a core of alpha subunits, each comprising either one or two copies of a highly conserved pore loop domain (P-domain). The P-domain contains the sequence (T/SxxTxGxG), which has been termed the K+ selectivity sequence. In families that contain one P-domain, four subunits assemble to form a selective pathway for K+ across the membrane. However, it remains unclear how the 2 P-domain subunits assemble to form a selective pore. 
The functional diversity of these families can arise through homo- or hetero-associations of alpha subunits or association with auxiliary cytoplasmic beta subunits. K+ channel subunits containing one pore domain can be assigned into one of two superfamilies: those that possess six transmembrane (TM) domains and those that possess only two TM domains. The six TM domain superfamily can be further subdivided into conserved gene families: the voltage-gated (Kv) channels; the KCNQ channels (originally known as KvLQT channels); the EAG-like K+ channels; and three types of calcium (Ca)-activated K+ channels (BK, IK and SK). The 2TM domain family comprises inward-rectifying K+ channels. In addition, there are K+ channel alpha-subunits that possess two P-domains. These are usually highly regulated K+ selective leak channels. The Kv family can be divided into several subfamilies on the basis of sequence similarity and function. Four of these subfamilies, Kv1 (Shaker), Kv2 (Shab), Kv3 (Shaw) and Kv4 (Shal), consist of pore-forming alpha subunits that associate with different types of beta subunit. Each alpha subunit comprises six hydrophobic TM domains with a P-domain between the fifth and sixth, which partially resides in the membrane. The fourth TM domain has positively charged residues at every third residue and acts as a voltage sensor, which triggers the conformational change that opens the channel pore in response to a displacement in membrane potential. More recently, 4 new electrically-silent alpha subunits have been cloned: Kv5 (KCNF), Kv6 (KCNG), Kv8 and Kv9 (KCNS). These subunits do not themselves possess any functional activity, but appear to form heteromeric channels with Kv2 subunits, and thus modulate Shab channel activity. 
When highly expressed, they inhibit channel activity, but at lower levels show more specific modulatory actions. The Kv2 voltage-dependent potassium channels (also known as the Shab family) are responsible for much of the delayed rectifier current in Drosophila melanogaster (Fruit fly) nervous system and muscle. In vertebrates, Kv2 channels have been shown to be involved in the delayed rectifier currents of the heart and skeletal muscles. They are also thought to be important in determining intrinsic neuronal excitability in both mammals and non-mammals. Kv2 channels can be further divided into 2 subtypes, designated Kv2.1 and Kv2.2. Kv2.1 channels are expressed in neurons. Protein phosphorylation dependent on protein kinase A is essential for their function. Three isoforms exhibiting temporal patterning during neuronal development have also been discovered, implying distinct roles for these channels in development.
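The description of the voltage sensor above, with positively charged residues at every third position of the fourth TM domain, suggests a simple periodicity check. The sketch below is illustrative only: it assumes Lys (K) and Arg (R) are the positively charged residues of interest, and the function name and toy segment are hypothetical. It reports the longest chain of K/R residues spaced exactly three positions apart.

```python
def longest_charge_ladder(seq, step=3, charged="KR"):
    """Length of the longest run of charged residues spaced `step` apart.

    run[i] holds the ladder length ending at position i, built from
    the ladder ending at position i - step (dynamic programming).
    """
    seq = seq.upper()
    run = [0] * len(seq)
    best = 0
    for i, residue in enumerate(seq):
        if residue in charged:
            run[i] = 1 + (run[i - step] if i >= step else 0)
            best = max(best, run[i])
    return best

# Hypothetical S4-like toy segment with K/R at every third position.
toy_s4 = "RVIRLVRVFRIFKLSRHSK"
print(longest_charge_ladder(toy_s4))
```

A high value only indicates an S4-like charge pattern; it does not by itself establish voltage-sensing function.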
Protein Domain
Name: Photosystem II cytochrome b559, N-terminal
Type: Domain
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection. Cytochrome b559, which forms part of the reaction centre core of PSII, is a heterodimer composed of one alpha subunit (PsbE), one beta subunit (PsbF), and a haem cofactor. Two histidine residues from each subunit coordinate the haem. Although cytochrome b559 is a redox-active protein, it is unlikely to be involved in the primary electron transport in PSII due to its very slow photo-oxidation and photo-reduction kinetics. 
Instead, cytochrome b559 could participate in a secondary electron transport pathway that helps protect PSII from photo-damage. Cytochrome b559 is essential for PSII assembly [ ].This domain occurs in both the alpha and beta subunits of cytochrome B559. In the alpha subunit it occurs together with a lumenal domain ( ), while in the beta subunit it occurs on its own.
Protein Domain
Name: NADH:cytochrome b5 reductase-like
Type: Family
Description: Flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+ reductases (FNR); plant and fungal NAD(P)H:nitrate reductases; NADH:cytochrome b5 reductases; NADPH:P450 reductases; NADPH:sulphite reductases; nitric oxide synthases; phthalate dioxygenase reductase; and various other flavoproteins. NADH:cytochrome b5 reductase (CBR) serves as electron donor for cytochrome b5, a ubiquitous electron carrier, thus participating in a variety of metabolic pathways (including steroid biosynthesis, desaturation and elongation of fatty acids, P450-dependent reactions, methaemoglobin reduction, etc.). A membrane-bound form of CBR is located on the cytosolic side of the endoplasmic reticulum, while a soluble form is found in erythrocytes. In the membrane-bound form, the N-terminal residue is myristoylated. Deficiency of the erythrocyte form causes hereditary methaemoglobinemia. In biological nitrate assimilation, reduction of nitrate to nitrite is catalysed by the multidomain redox enzyme NAD(P)H:nitrate reductase (NR). Three forms of NR are known: an NADH-specific enzyme found in higher plants and algae; an NAD(P)H-bispecific enzyme found in higher plants, algae and fungi; and an NADPH-specific enzyme found only in fungi. NR can be divided into 3 structure/function domains: the molybdopterin cofactor binds in the N-terminal domain; the central region is the cytochrome b domain, which is similar to animal cytochrome b5; and the C-terminal portion of the protein is occupied by the FAD/NAD(P)H binding domain, which is similar to CBR. The catalytic reduction of nitrate to nitrite can be viewed as a single polypeptide electron transport chain with electron flow from NAD(P)H -> FAD -> cytochrome b5 -> molybdopterin -> NO3-. 
Thus, the flavin domain of NR is functionally identical to CBR. To date, the 3D-structures of the flavoprotein domain of Zea mays (Maize) nitrate reductase and of Sus scrofa (Pig) NADH:cytochrome b5 reductase have been solved. The overall fold is similar to that of ferredoxin:NADP+ reductase: the FAD-binding domain (N-terminal) has the topology of an anti-parallel β-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel β-sheet flanked by 2 helices on each side).
Protein Domain
Name: Chaperone DnaJ
Type: Family
Description: Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle. Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate. DnaJ comprises a 70-residue N-terminal domain (the J-domain); a 30-residue glycine-rich region (the G-domain); a central domain containing 4 repeats of a CxxCxGxG motif (the CRR-domain); and a 120-170 residue C-terminal region. 
The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. The three components of the DnaK-DnaJ-GrpE system are typically encoded by consecutive genes. DnaJ homologues occur in many genomes, typically not encoded near DnaK and GrpE-like genes. Only some homologues are included in this family.
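
As an illustration of the CxxCxGxG motif described above, the following is a minimal sketch that counts non-overlapping occurrences of the motif in a protein sequence with a regular expression. It assumes that "x" stands for any residue; the function name `count_crr_repeats` and the toy sequence are hypothetical and are not taken from any real DnaJ protein or from this database.

```python
import re

# CxxCxGxG, with "x" read as "any residue" (an assumption for this sketch).
CRR_REPEAT = re.compile(r"C..C.G.G")


def count_crr_repeats(sequence):
    """Count non-overlapping CxxCxGxG matches in a one-letter protein sequence."""
    return len(CRR_REPEAT.findall(sequence.upper()))


# Invented toy sequence containing four copies of the motif (not a real protein).
toy_sequence = "AACPTCHGSGAAACEKCNGRGAACQTCGGTGAAACPHCRGEGAA"
n_repeats = count_crr_repeats(toy_sequence)
print(n_repeats)
```

A simple regular expression like this only flags candidate repeats; curated family membership relies on the database's own signatures rather than on a bare motif match.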
Protein Domain
Name: XPG conserved site
Type: Conserved_site
Description: Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region. XPG (ERCC-5) belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1; and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). 
It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. The first pattern corresponds to the central part of the N-region; the second pattern is part of the I-region and includes the putative catalytic core pentapeptide.
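
The conserved pentapeptide E-A-[DE]-A-[QS] quoted above is written in a PROSITE-style notation, where hyphens separate positions and square brackets list the residues allowed at one position. The sketch below shows one way such a simple pattern could be translated into a Python regular expression and used to locate matches. The helper names and the toy sequence are hypothetical, the translation handles only the elements that appear in this pentapeptide (plus "x" for any residue), and the full signature patterns for this entry are not reproduced here.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a regex string.

    Handles only single residues, [..] residue classes and "x" (any residue);
    repeats and anchors are deliberately left out of this sketch.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element == "x" else element)
    return "".join(parts)


PENTAPEPTIDE = "E-A-[DE]-A-[QS]"
PENTAPEPTIDE_RE = re.compile(prosite_to_regex(PENTAPEPTIDE))


def find_pentapeptide(sequence):
    """Return 0-based start positions of non-overlapping pentapeptide matches."""
    return [m.start() for m in PENTAPEPTIDE_RE.finditer(sequence.upper())]


# Invented toy sequence (not a real XPG/RAD2 family member).
toy_sequence = "MKLEAEAQGGTEADASPL"
positions = find_pentapeptide(toy_sequence)
print(prosite_to_regex(PENTAPEPTIDE), positions)
```

A match to the pentapeptide alone is far less specific than the full 27-residue core, so hits found this way would only be starting points for closer inspection.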
Protein Domain
Name: AsnC-type HTH domain
Type: Domain
Description: The asnC-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of about 60 amino acids present in transcription regulators of the asnC/lrp family. This family of prokaryotic regulators is named after Escherichia coli asnC and Leucine-responsive Regulatory Protein (lrp), which are a regulator of asparagine synthesis and a global regulator of various operons, respectively. AsnC/lrp-like proteins are present in bacteria and archaea. The DNA-binding asnC-type HTH domain usually occurs in the N-terminal part. The C-terminal part can contain an effector-binding domain and/or an oligomerisation domain. The crystal structure of hyperthermophilic archaeal lrpA shows that the N-terminal, DNA-binding domain contains a core of three α-helices, followed by a single β-strand, which connects as a flexible hinge to the effector-binding domain. The second and third helices, connected via a turn, comprise the helix-turn-helix motif. Helix 3 is termed the recognition helix as it binds the DNA major groove, as in other HTH domains. Most E. coli lrp DNA-binding mutations map onto the HTH in the lrpA structure, and three map onto the hinge. Proteins known to contain an asnC-type HTH domain include:
  • Escherichia coli Leucine-responsive Regulatory Protein (lrp), a global transcriptional regulator of 35-75 different genes involved in amino acid biosynthesis, amino acid degradation, transport or pili formation. Binding of leucine by lrp can stimulate or reduce the regulatory effect of activation for some operons or repression for others. Lrp negatively autoregulates the lrp gene, independently of leucine.
  • Salmonella typhimurium lrp, a global leucine-responsive regulator involved in branched-chain amino acid biosynthesis, pili formation and plasmid virulence.
  • Escherichia coli asnC, a specific asparagine-dependent transcriptional activator of asparagine biosynthesis. AsnC is also an asparagine-independent repressor of its own transcription.
  • Pseudomonas putida bkdR, a specific autoregulatory transcriptional regulator, involved in catabolism of branched-chain amino acids.
  • Agrobacterium tumefaciens putR, a specific proline-responsive regulator of proline catabolism.
  • Bacillus subtilis lrpA/lrpB and lrpC, transcriptional regulators involved in serine-glycine interconversion, sporulation and amino acid metabolism. LrpC binds to a specific DNA structure and wraps and overwinds the DNA.
  • Bacillus subtilis azlB, a specific transcriptional repressor of branched-chain amino acid transport.
  • Pyrococcus furiosus lrpA, a putative lrp with negative autoregulation.
  • Zymomonas mobilis grp, a repressor of the glutamate uptake operon.
Protein Domain
Name: 7TM GPCR, serpentine receptor class a (Sra)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. 
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class a (Sra) from the Sra superfamily. Sra receptors contain 6-7 hydrophobic, putative transmembrane regions and can be distinguished from other 7TM GPCRs by their own characteristic TM signatures.
Protein Domain
Name: 7TM GPCR, serpentine receptor class b (Srb)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. 
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class b (Srb) from the Sra superfamily. Srb receptors contain 6-8 hydrophobic, putative transmembrane regions and can be distinguished from other 7TM GPCRs by their own characteristic TM signatures.
Protein Domain
Name: Protease-activated receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Thrombin is a coagulation protease that activates platelets, leukocytes, endothelial and mesenchymal cells at sites of vascular injury, acting partly through an unusual proteolytically activated GPCR. 
Gene knockout experiments have provided definitive evidence for a second thrombin receptor in mouse platelets and have suggested tissue-specific roles for different thrombin receptors. Because the physiological agonist at the receptor was originally unknown, it was provisionally named protease-activated receptor (PAR). At least 4 PAR subtypes have now been characterised. Thus, the thrombin and PAR receptors constitute a fledgling receptor family that shares a novel proteolytic activation mechanism.
Protein Domain
Name: Cytochrome P450, CYP2 family
Type: Family
Description: P450 enzymes constitute a superfamily of haem-thiolate proteins, widely distributed in bacteria, fungi, plants and animals. The enzymes are involved in metabolism of a plethora of both exogenous and endogenous compounds. Usually, they act as terminal oxidases in multi-component electron-transfer chains, called P450-containing monooxygenase systems. On the basis of sequence similarity, all P450s can be categorised into 2 main classes, the so-called B- and E-classes: P450 proteins of prokaryotic 3-component systems and fungal P450nor (CYP55) belong to the B-class; all other known P450s from distinct systems are of the E-class. E-class P450s may be further divided into 5 subclasses (groups) according to protein sequence similarities. The data suggest that divergence of the P450 superfamily into B- and E-classes, and further divergence into stable P450 groups within the E-class, must be very ancient and must have occurred before the appearance of eukaryotes. Given the rapid increase in numbers of P450s, Nelson introduced the concept of a higher-order classification of P450 families into clans based on sequence similarity. This is similar to the previous grouping into B- and E-classes; both classifications are still used. According to Nelson's system, clan 2 contains the CYP2 plus CYP1, 17, 18, 21 and 71 families, and corresponds to the E-class group I proteins. Members of the first 4 families are of vertebrate origin, while those from CYP71 derive from plants. CYP1 and CYP2 enzymes mainly metabolise exogenous substrates, whereas CYP17 and CYP21 are involved in metabolism of endogenous physiologically-active compounds. This entry represents the CYP2 family, which comprises 15 subfamilies (A-H, J-N, P and Q) and is the most dominant in clan 2. Several of these subfamilies are non-mammalian: 2H derives from chicken; 2K, 2M, 2N and 2P are from fish; 2L is from lobster; and 2Q is from Xenopus. 
The first five (A-E) are present in mammalian liver, but in differing amounts and with different inducibilities. Members of the CYP2F gene subfamily, meanwhile, are selectively expressed in lung tissues, and have been implicated as important catalysts in the formation of reactive intermediates from several pneumotoxic chemicals. Human CYP2F1 bioactivates 3-methylindole (3MI), while mouse CYP2F2 bioactivates naphthalene.
Protein Domain
Name: Mu opioid receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. 
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. The term opioid refers to a class of substance that produces its effects via the major classes of opioid receptor, termed mu, delta and kappa. In the CNS, the mu opioid receptor is found in the cerebral cortex, thalamus, hypothalamus, periaqueductal grey, interpeduncular nucleus and median raphe. In the periphery, it is found in the myenteric plexus, and in certain smooth muscles, e.g. mouse vas deferens. Mu opioid receptors are believed to mediate analgesia, hypothermia, respiratory depression, miosis, bradycardia, nausea, euphoria and physical dependence. Beta-endorphin is the most potent endogenous ligand.
Protein Domain
Name: Prostanoid receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems. 
PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists.
Protein Domain
Name: Neurokinin receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Neuropeptide receptors are present in very small quantities in the cell and are embedded tightly in the plasma membrane. 
The neuropeptides exhibit a high degree of functional diversity through both regulation of peptide production and peptide-receptor interaction. The mammalian tachykinin system consists of 3 distinct peptides: substance P, substance K and neuromedin K. All possess a common spectrum of biological activities, including sensory transmission in the nervous system and contraction/relaxation of peripheral smooth muscles, and each interacts with a specific receptor type.
Protein Domain
Name: Gastrin receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Gastrins and cholecystokinins (CCKs) are naturally occurring peptides that share a common C-terminal sequence, GWMDF; full biological activity resides in this region. 
The principal physiological role of gastrin is to stimulate acid secretion in the stomach; it also has trophic effects on gastric mucosa. Gastrin is produced from a single gene transcript, and is found predominantly in the stomach and intestine, but also in vagal nerves. The CCKB receptor has a widespread distribution in the CNS and has been implicated in the pathogenesis of panic-anxiety attacks caused by CCK-related peptides. It has a more limited distribution in the periphery, where it is found in smooth muscle and secretory glands.
Protein Domain
Name: Peptidase T1A, proteasome beta-subunit, archaeal
Type: Family
Description: The proteasome (or macropain) ( ) [ , , , , ] is a multicatalytic proteinase complex in eukaryotes and archaea, and in some bacteria, that seems to be involved in an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is composed of 28 distinct subunits which form a highly ordered ring-shaped structure (20S ring) of about 700kDa. Most proteasome subunits can be classified, on the basis on sequence similarities into two groups, alpha (A) and beta (B). These are arranged in four rings of seven proteins, consisting of a ring of alpha subunits, two rings of beta subunits, and a ring of alpha subunits. In eukaryotes, each alpha and each beta ring consists of different proteins. Three of the beta subunits are peptidases in subfamily T1A, and each has a distinctive specificity (trypsin-like, chymotrypsin-like and glutamyl peptidase-like). The peptidases are N-terminal nucleophile hydrolases in which the N-terminal threonine is the nucleophile in the hydrolytic reaction []. In the immunoproteasome, the catalytic components are replaced by three specialist, catalytic beta subunits []. In bacteria and archaea there is only one alpha subunit and one beta subunit, and each ring is a homoseptamer.This entry includes the beta subunit of the archaean proteasome (MEROPS identifier T01.002). The archaean proteasome consists of four stacked rings each of which contains a homoheptamer of either alpha or beta components, so that the rings are stacked in the order alpha, beta, beta, alpha. Alpha and beta subunits are homologous to one another, but only beta subunits are proteolytically active. The beta subunits are arranged so that the active sites are directed towards the centre of each ring. The proteasome is therefore a torus structure with a large cavity, and entrance and exit pores at the top and bottom. 
A denatured protein enters through the top pore and is degraded by the beta subunits into short peptides, which exit from the bottom pore. The archaean proteasome is therefore similar to, but a simplified version of, the eukaryote proteasome. The crystal structure of the proteasome from Thermoplasma acidophilum was the first to be solved, showing a structure similar to that of N-terminal nucleophile hydrolases, and the beta subunit was found to be the first threonine peptidase.
Protein Domain
Name: Prostaglandin DP receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems.
PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium, while PGE2, PGI2 and PGD2 are powerful vasodilators and potentiate the action of various autacoids to induce plasma extravasation and pain sensation. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists.
Protein Domain
Name: Kappa opioid receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins.
Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. The term opioid refers to a class of substance that produces its effects via the major classes of opioid receptor, termed mu, delta and kappa. In the CNS, the kappa opioid receptor is found in the cerebral cortex, substantia nigra, interpeduncular nucleus, striatum and hippocampus. In the periphery, it is found in the myenteric plexus of the guinea pig ileum, and it is also in certain smooth muscles, e.g. rabbit vas deferens. Kappa opioid receptors are believed to mediate analgesia, sedation, miosis and diuresis. Dynorphin is the most potent endogenous ligand.
Protein Domain
Name: TonB-dependent siderophore receptor
Type: Family
Description: The outer membrane is an essential component of Gram-negative bacteria, providing them with increased resistance to antibiotics, digestive enzymes, detergents and immune surveillance. The outer membrane is permeable to small hydrophilic molecules because of the presence of aqueous diffusion channels (e.g. porins). Small molecules present at high concentration diffuse down their concentration gradients into the periplasmic space. Porins are inadequate for the efficient acquisition of iron siderophores, cobalamins (Cbl), and other molecules that are present at very low concentrations and that are too bulky to pass through the lumen of the porin. Another class of outer membrane proteins binds these substrates with high specificity and carries out active transport across the outer membrane. This active transport process requires an energy source and a second protein, TonB. Outer membrane active-transport proteins interact with the transperiplasmic protein TonB through a conserved sequence, the "Ton-box". Interaction with TonB couples the energy of the proton motive force of the inner membrane to drive an outer membrane transport cycle. To date, crystal structures of four TonB-dependent transporters have been solved. Three of these are iron-siderophore transporters: ferrichrome transporter FhuA; ferric enterobactin transporter FepA; and ferric dicitrate transporter FecA. The fourth structure is the cobalamin transporter BtuB. All of these structures are composed of two domains, a conserved N-terminal globular domain (hatch) and a 22-stranded β-barrel (barrel). The hatch domain resides within the barrel and occludes the large pore of the large barrel domain. The hatch domains are composed of a central core of four β-strands connected by loops. The conserved Ton-box is located near the periplasmic opening of the barrel and precedes the conserved hatch core.
The barrels have large extracellular loops and short periplasmic turns connecting the β-strands. TonB-dependent transporters bind their cognate substrates using residues from hatch loops, from the interior surfaces of β-strands in the barrel wall, and from extracellular loops of the barrel. This group of sequences includes a variety of TonB-dependent outer membrane siderophore receptors. It has no overlap with TonB receptors known to transport other substances.
Protein Domain
Name: TonB-dependent vitamin B12 transporter BtuB
Type: Family
Description: The outer membrane is an essential component of Gram-negative bacteria, providing them with increased resistance to antibiotics, digestive enzymes, detergents and immune surveillance. The outer membrane is permeable to small hydrophilic molecules because of the presence of aqueous diffusion channels (e.g. porins). Small molecules present at high concentration diffuse down their concentration gradients into the periplasmic space. Porins are inadequate for the efficient acquisition of iron siderophores, cobalamins (Cbl), and other molecules that are present at very low concentrations and that are too bulky to pass through the lumen of the porin. Another class of outer membrane proteins binds these substrates with high specificity and carries out active transport across the outer membrane. This active transport process requires an energy source and a second protein, TonB. Outer membrane active-transport proteins interact with the transperiplasmic protein TonB through a conserved sequence, the "Ton-box". Interaction with TonB couples the energy of the proton motive force of the inner membrane to drive an outer membrane transport cycle. To date, crystal structures of four TonB-dependent transporters have been solved. Three of these are iron-siderophore transporters: ferrichrome transporter FhuA; ferric enterobactin transporter FepA; and ferric dicitrate transporter FecA. The fourth structure is the cobalamin transporter BtuB. All of these structures are composed of two domains, a conserved N-terminal globular domain (hatch) and a 22-stranded β-barrel (barrel). The hatch domain resides within the barrel and occludes the large pore of the large barrel domain. The hatch domains are composed of a central core of four β-strands connected by loops. The conserved Ton-box is located near the periplasmic opening of the barrel and precedes the conserved hatch core.
The barrels have large extracellular loops and short periplasmic turns connecting the β-strands. TonB-dependent transporters bind their cognate substrates using residues from hatch loops, from the interior surfaces of β-strands in the barrel wall, and from extracellular loops of the barrel. This entry represents the TonB-dependent outer membrane receptor found in gamma-proteobacteria responsible for translocating the cobalt-containing vitamin B12 (cobalamin). In addition to binding and transport of cobalamins, BtuB serves as a receptor for the E and A colicins and for the bacteriophage BF23.
Protein Domain
Name: DNA-binding transcriptional regulator NtrC
Type: Family
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more. Two-component systems consist of a sensor histidine kinase (HK) and its cognate response regulator (RR). The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response. This entry represents Nitrogen regulatory protein C (NtrC), which is a bacterial enhancer-binding protein that activates the transcription of genes encoding enzymes required for nitrogen metabolism. It is phosphorylated by NtrB and interacts with sigma-54. One of the best studied examples is its activation of the gene glnA, which encodes the enzyme glutamine synthetase. NtrC is composed of three domains.
The 124-residue N-terminal domain is homologous to receiver domains of other response regulator proteins in two-component signal transduction systems. The 240-residue central domain of NtrC is homologous to a domain found in all activators of the sigma-54 RNA polymerase holoenzyme. The C-terminal domain has been indicated to contain the determinants necessary for both DNA-binding and dimerization of full-length NtrC.
Protein Domain
Name: 7TM GPCR, serpentine receptor class g (Srg)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor.
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class g (Srg) from the Srg superfamily. Srg receptors contain seven hydrophobic, putative transmembrane regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.
Protein Domain
Name: Methyltransferase, NNMT/PNMT/TEMT
Type: Family
Description: Methyl transfer from the ubiquitous S-adenosyl-L-methionine (AdoMet) to either nitrogen, oxygen or carbon atoms is frequently employed in diverse organisms ranging from bacteria to plants and mammals. The reaction is catalysed by methyltransferases (Mtases) and modifies DNA, RNA, proteins and small molecules, such as catechol, for regulatory purposes. The various aspects of the role of DNA methylation in prokaryotic restriction-modification systems and in a number of cellular processes in eukaryotes, including gene regulation and differentiation, are well documented. Three classes of DNA Mtases transfer the methyl group from AdoMet to the target base to form either N-6-methyladenine, or N-4-methylcytosine, or C-5-methylcytosine. In C-5-cytosine Mtases, ten conserved motifs are arranged in the same order. Motif I (a glycine-rich or closely related consensus sequence; FAGxGG in M.HhaI), shared by other AdoMet-Mtases, is part of the cofactor binding site and motif IV (PCQ) is part of the catalytic site. In contrast, sequence comparison among N-6-adenine and N-4-cytosine Mtases indicated two conserved segments, although more conserved segments may be present. One of them corresponds to motif I in C-5-cytosine Mtases, and the other is named (D/N/S)PP(Y/F). Crystal structures are known for a number of Mtases. The cofactor binding sites are almost identical and the essential catalytic amino acids coincide. The comparable protein folding and the existence of equivalent amino acids in similar secondary and tertiary positions indicate that many (if not all) AdoMet-Mtases have a common catalytic domain structure.
This permits tertiary structure prediction of other DNA, RNA, protein, and small-molecule AdoMet-Mtases from their amino acid sequences. Several cytoplasmic vertebrate methyltransferases are evolutionarily related, including nicotinamide N-methyltransferase (NNMT); phenylethanolamine N-methyltransferase (PNMT); and thioether S-methyltransferase (TEMT). NNMT catalyses the N-methylation of nicotinamide and other pyridines to form pyridinium ions. This activity is important for the biotransformation of many drugs and xenobiotic compounds. PNMT catalyses the last step in catecholamine biosynthesis, the conversion of noradrenalin to adrenalin; and TEMT catalyses the methylation of dimethyl sulphide into trimethylsulphonium. These three enzymes use S-adenosyl-L-methionine as the methyl donor. They are proteins of 30 to 32kDa.
Protein Domain
Name: CSK-like, SH2 domain
Type: Domain
Description: This entry represents the SH2 domain found in CSK and CHK. Both the C-terminal Src kinase (CSK) and CSK-homologous kinase (CHK) are members of the CSK-family of protein tyrosine kinases. These proteins suppress activity of Src-family kinases (SFK) by selectively phosphorylating the conserved C-terminal tail regulatory tyrosine by a similar mechanism. CHK is also capable of inhibiting SFKs by a non-catalytic mechanism that involves binding of CHK to SFKs to form stable protein complexes. The unphosphorylated form of SFKs is inhibited by CSK and CHK by a two-step mechanism. The first step involves the formation of a complex of SFKs with CSK/CHK, in which the SFKs are inactive. The second step involves the phosphorylation of the C-terminal tail tyrosine of SFKs, which then dissociate and adopt an inactive conformation. The structural basis of how the phosphorylated SFKs dissociate from CSK/CHK to adopt the inactive conformation is not known. The inactive conformation of SFKs is stabilized by two intramolecular inhibitory interactions: (a) the pYT:SH2 interaction, in which the phosphorylated C-terminal tail tyrosine (YT) binds to the SH2 domain, and (b) the linker:SH3 interaction, in which the SH2-kinase domain linker binds to the SH3 domain. SFKs are activated by multiple mechanisms including binding of the ligands to the SH2 and SH3 domains to displace the two inhibitory intramolecular interactions, autophosphorylation, and dephosphorylation of YT. By selective phosphorylation and the non-catalytic inhibitory mechanism CSK and CHK are able to inhibit the active forms of SFKs. CSK and CHK are regulated by phosphorylation and inter-domain interactions. They both contain SH3, SH2, and kinase domains separated by the SH3-SH2 connector and SH2-kinase linker, intervening segments separating the three domains.
They lack a conserved tyrosine phosphorylation site in the kinase domain and the C-terminal tail regulatory tyrosine phosphorylation site. The CSK SH2 domain is crucial for stabilizing the kinase domain in the active conformation. A disulfide bond here regulates CSK kinase activity. The subcellular localization and activity of CSK are regulated by its SH2 domain. In general SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites.
Protein Domain
Name: Cholinesterase
Type: Family
Description: Cholinesterase enzymes are members of the broader alpha/beta hydrolase family and can be divided into two distinct groups: those that catalyse the hydrolysis of acetylcholine to choline and acetate (acetylcholinesterases; acetylcholine + H2O -> choline + acetate), and those that catalyse the conversion of other acylcholines to a choline and a weak acid (cholinesterases; an acylcholine + H2O -> choline + a carboxylate). Acetylcholinesterase also acts on a variety of acetic esters and catalyses transacetylations. It is the most intensively studied of the cholinesterase enzymes due to its key physiological role in the turnover of the neurotransmitter acetylcholine. This enzyme is found in, or attached to, cellular or basement membranes of presynaptic cholinergic neurons and postsynaptic cholinoceptive cells within the neuromuscular junction. Signal transmission at the neuromuscular junction involves the release of acetylcholine, its interaction with the acetylcholine receptor and hydrolysis, all occurring in a period of a few milliseconds. Rapid hydrolysis of the newly released acetylcholine is vital in order to prevent continuous firing of the nerve impulses. Consistent with its role in this process, acetylcholinesterase has an unusually high turnover number, ensuring that acetylcholine is broken down quickly. There is evidence to suggest that acetylcholinesterase has additional important roles including involvement in neuronal adhesion, the formation of Alzheimer fibrils, and neurite growth. The 3D structures of acetylcholinesterase and a cholinesterase have been determined. These proteins share the 3-layer alpha-beta-alpha sandwich fold common to members of the alpha/beta hydrolase family. Surprisingly, given the high turnover number of acetylcholinesterase, the active site of these enzymes is located at the bottom of a deep and narrow cleft, named the active-site gorge.
Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook. Academic Press, London / San Diego (1997)]. Acetylcholinesterase belongs to the Yt blood group system and is associated with Yt(a/b) antigen.
Protein Domain
Name: T-box transcription factor, DNA-binding domain
Type: Domain
Description: This is the highly conserved DNA-binding domain of T-box transcription factors. Transcription factors of the T-box family are required both for early cell-fate decisions, such as those necessary for formation of the basic vertebrate body plan, and for differentiation and organogenesis; they have also been associated with multiple aspects of development and with adult terminal cell-type differentiation in different animal lineages. The T-box is defined as the minimal region within the T-box protein that is both necessary and sufficient for sequence-specific DNA binding; all members of the family so far examined bind to the DNA consensus sequence TCACACCT and function as transcriptional repressors and/or activators. The T-box is a relatively large DNA-binding domain, generally comprising about a third of the entire protein (17-26kDa). These genes were uncovered on the basis of similarity to the DNA binding domain of the Mus musculus (Mouse) Brachyury (T) gene product; this similarity is the defining feature of the family. The Brachyury gene is named for its phenotype, which was identified 70 years ago as a mutant mouse strain with a short blunted tail. The gene, and its paralogues, have become a well-studied model for the family, and hence much of what is known about the T-box family is derived from the murine Brachyury gene. Consistent with its nuclear location, Brachyury protein has a sequence-specific DNA-binding activity and can act as a transcriptional regulator. Homozygous mutants for the gene undergo extensive developmental anomalies, thus rendering the mutation lethal.
The postulated role of Brachyury is as a transcription factor, regulating the specification and differentiation of posterior mesoderm during gastrulation in a dose-dependent manner. T-box proteins tend to be expressed in specific organs or cell types, especially during development, and they are generally required for the development of those tissues. For example, Brachyury is expressed in posterior mesoderm and in the developing notochord, and it is required for the formation of these cells in mice. The T-box family is an ancient group that appears to play a critical role in development in all animal species.
Protein Domain
Name: Peptidase S8/S53 domain superfamily
Type: Homologous_superfamily
Description: These proteins contain a domain superfamily found in serine peptidases belonging to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB. The subtilisin family is one of the largest serine peptidase families characterised to date. Over 200 subtilases are presently known, and the complete amino acid sequence is known for more than 170 of them. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin subfamily, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of a cysteine near to the active site histidine. Only one viral member of the subtilisin family is known, a 56kDa protease from herpes virus 1, which infects the channel catfish. Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids.
The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.
Protein Domain
Name: Peptidase S54, GlpG peptidase, N-terminal
Type: Domain
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry represents the N-terminal domain of membrane-bound serine endopeptidases belonging to MEROPS peptidase family S54 (rhomboid-1, clan ST). This domain contains a conserved ASW sequence motif and a single completely conserved residue F that may be functionally important. The tertiary structure of the GlpG protein from Escherichia coli has been determined. The GlpG protein has six transmembrane domains (other members of the family are predicted to have seven), with the N- and C-terminal ends anchored in the cytoplasm.
One transmembrane domain is shorter than the rest, creating an internal, aqueous cavity just below the membrane surface, and it is here where proteolysis occurs. There is also a membrane-embedded loop between the first and second transmembrane domains which is postulated to act as a gate controlling substrate access to the active site. No other family of serine peptidases is known to have active site residues within transmembrane domains (although transmembrane active sites are known for aspartic peptidases and metallopeptidases), and the GlpG protein has the type structure for clan ST.
Protein Domain
Name: 7TM GPCR, serpentine chemoreceptor class i (Sri)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor.
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr [, , ]. Many of these proteins have homologues in Caenorhabditis briggsae.This entry represents Sri, which is part of the Str superfamily of chemoreceptors.
Protein Domain
Name: 7TM GPCR, serpentine receptor class v (Srv)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor.
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class v (Srv) from the Srg superfamily. Srg receptors contain seven hydrophobic, putative transmembrane regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.
Protein Domain
Name: 7TM GPCR, serpentine receptor class j (Srj)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor.
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class j (Srj) from the Str superfamily. The Srj family is designated as the out-group based on its location in preliminary phylogenetic analyses of the entire superfamily.
Protein Domain
Name: Mediator of RNA polymerase II transcription subunit 9
Type: Family
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. This entry represents subunit Med9 of the Mediator complex. Subunit Med9 is part of the middle module of the Mediator complex; this associates with the core polymerase subunits to form the RNA polymerase II holoenzyme. Med9, alternatively known as the chromosome segregation protein CSE2, is required, along with CSE1, for accurate mitotic chromosome segregation in Saccharomyces cerevisiae (Baker's yeast).
Protein Domain
Name: Adenosylcobinamide-GDP ribazoletransferase
Type: Family
Description: Cobalamin (vitamin B12) is a structurally complex cofactor, consisting of a modified tetrapyrrole with a centrally chelated cobalt. Cobalamin is usually found in one of two biologically active forms: methylcobalamin and adocobalamin. Most prokaryotes, as well as animals, have cobalamin-dependent enzymes, whereas plants and fungi do not appear to use it. In bacteria and archaea, these include methionine synthase, ribonucleotide reductase, glutamate and methylmalonyl-CoA mutases, ethanolamine ammonia lyase, and diol dehydratase. In mammals, cobalamin is obtained through the diet, and is required for methionine synthase and methylmalonyl-CoA mutase. There are at least two distinct cobalamin biosynthetic pathways in bacteria: an aerobic pathway that requires oxygen and in which cobalt is inserted late in the pathway, found in Pseudomonas denitrificans and Rhodobacter capsulatus; and an anaerobic pathway in which cobalt insertion is the first committed step towards cobalamin synthesis, found in Salmonella typhimurium, Bacillus megaterium, and Propionibacterium freudenreichii subsp. shermanii. Either pathway can be divided into two parts: (1) corrin ring synthesis (differs in aerobic and anaerobic pathways) and (2) adenosylation of corrin ring, attachment of aminopropanol arm, and assembly of the nucleotide loop (common to both pathways). There are about 30 enzymes involved in either pathway, where those involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Several of these enzymes are pathway-specific: CbiD, CbiG, and CbiK are specific to the anaerobic route of S. typhimurium, whereas CobE, CobF, CobG, CobN, CobS, CobT, and CobW are unique to the aerobic pathway of P. denitrificans.
This entry represents the CobS protein, which joins adenosylcobinamide-GDP and alpha-ribazole to generate adenosylcobalamin (Ado-cobalamin) and also synthesizes adenosylcobalamin 5'-phosphate from adenosylcobinamide-GDP and alpha-ribazole 5'-phosphate. It catalyses the reactions: (1) adenosylcobinamide-GDP + alpha-ribazole = GMP + adenosylcobalamin; (2) adenosylcobinamide-GDP + alpha-ribazole 5'-phosphate = GMP + adenosylcobalamin 5'-phosphate. The protein product from these catalyses is associated with a large complex of proteins and is induced by cobinamide. CobS is involved in part III of cobalamin biosynthesis, one of the late steps in adenosylcobalamin synthesis that, together with CobU, CobT, and CobC proteins, defines the nucleotide loop assembly pathway.
Protein Domain
Name: 7TM GPCR, serpentine receptor class u (Sru)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor.
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class u (Sru) from the Srg superfamily.
Protein Domain
Name: Thromboxane receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Prostanoids (prostaglandins (PG) and thromboxanes (TX)) mediate a wide variety of actions and play important physiological roles in the cardiovascular and immune systems, and in pain sensation in peripheral systems.
PGI2 and TXA2 have opposing actions, involving regulation of the interaction of platelets with the vascular endothelium. To date, evidence for at least 5 classes of prostanoid receptor has been obtained. However, identification of subtypes and their distribution is hampered by expression of more than one receptor within a tissue, coupled with poor selectivity of available agonists and antagonists. Moreover, many endogenous prostanoids undergo rapid metabolism, especially TXA2.
Protein Domain
Name: Adrenocorticotrophin (ACTH) receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Adrenocorticotrophin (ACTH), melanocyte-stimulating hormones and beta-endorphin are peptide products of pituitary pro-opiomelanocortin.
ACTH regulates synthesis and release of glucocorticoids and aldosterone in the adrenal cortex; it also has a trophic action on these cells. ACTH and beta-endorphin are synthesised and released in response to corticotrophin-releasing factor at times of stress (heat, cold, infections, etc.); their release leads to increased metabolism and analgesia. The ACTH receptor is found in high levels in the adrenal cortex; binding sites are present in lower levels in the CNS.
Protein Domain
Name: FGD1, N-terminal PH domain
Type: Domain
Description: This entry represents the N-terminal PH domain of FGD1. In general, FGDs (including FGD1, FGD2, FGD3 and FGD4/Frabin) have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in the DSGIDS motif and are subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles in regulating cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity.
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.
Protein Domain
Name: FGD1-4, C-terminal PH domain
Type: Domain
Description: This entry represents the C-terminal PH domain of FGD1-4. In general, FGDs (including FGD1, FGD2, FGD3 and FGD4/Frabin) have a RhoGEF (DH) domain, followed by an N-terminal PH domain, a FYVE domain and a C-terminal PH domain. All FGDs are guanine nucleotide exchange factors that activate the Rho GTPase Cdc42, an important regulator of membrane trafficking. The RhoGEF domain is responsible for GEF catalytic activity, while the N-terminal PH domain is involved in intracellular targeting of the DH domain. Mutations in the FGD1 gene are responsible for the X-linked disorder known as faciogenital dysplasia (FGDY). Both FGD1 and FGD3 are targeted by the ubiquitin ligase SCF(FWD1/beta-TrCP) upon phosphorylation of two serine residues in the DSGIDS motif and are subsequently degraded by the proteasome. However, FGD1 and FGD3 induced significantly different morphological changes in HeLa Tet-Off cells: while FGD1 induced long finger-like protrusions, FGD3 induced broad sheet-like protrusions when the level of GTP-bound Cdc42 was significantly increased by the inducible expression of FGD3. They also reciprocally regulated cell motility when inducibly expressed in HeLa Tet-Off cells: FGD1 stimulated cell migration while FGD3 inhibited it. FGD1 and FGD3 therefore play different roles in regulating cellular functions, even though their intracellular levels are tightly controlled by the same destruction pathway through SCF(FWD1/beta-TrCP). PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Less than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity.
PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.
Protein Domain
Name: Proteinase inhibitor I25A, stefin
Type: Family
Description: The cystatins are a superfamily of similar proteins present in mammals, birds, fish, insects, plants and protozoa. In general they are potent peptidase inhibitors belonging to MEROPS inhibitor family I25, clan IH. The type 1 cystatins or stefins (A and B) are mainly intracellular, the type 2 cystatins (C, D, E/M, F, G, S, SN and SA) are extracellular, and the type 3 cystatins (L- and H-kininogens) are intravascular proteins. All true cystatins inhibit cysteine peptidases of the papain family (MEROPS peptidase family C1), and some also inhibit legumain family enzymes (MEROPS peptidase family C13). These peptidases play key roles in physiological processes, such as intracellular protein degradation (cathepsins B, H and L), are pivotal in the remodelling of bone (cathepsin K), and may be important in the control of antigen presentation (cathepsin S, mammalian legumain). Moreover, the activities of such peptidases are increased in pathophysiological conditions, such as cancer metastasis and inflammation. Additionally, such peptidases are essential for several pathogenic parasites and bacteria. Thus in animals cystatins not only have the capacity to regulate normal body processes and perhaps cause disease when down-regulated, but in other organisms may also participate in defence against biotic and abiotic stress. The stefin family (MEROPS inhibitor family I25 clan IH, subfamily I25A) includes proteins that lack disulphide bonds and carbohydrates. The most abundant source of stefin A is polymorphonuclear leucocytes from the liver, but it is also found in extracts of squamous epithelia from the mouth and oesophagus, and has been localised to the cytoplasm of the strata corneum and granulosum of the epidermis. The selective distribution of the inhibitor correlates with tissues that constitute a 'first line of defence' against pathogenic organisms.
Stefin A may thus provide a protective function as an inhibitor of cysteine proteases utilised as invasive tools by many infectious agents. The structure of stefin A contains a 5-stranded anti-parallel β-sheet, wrapped around a central helix. The loops formed between the strands are involved in inhibitor binding, one of these containing a QVVAG sequence, which is highly conserved in most members of the cystatin superfamily.
Protein Domain
Name: Tapasin
Type: Family
Description: Major histocompatibility complex (MHC) class I molecules present antigenic peptides to CD8 T cells. The majority of peptides found associated with class I molecules are derived from nuclear and cytosolic proteins, and they are generated largely by the proteasome complex. These peptides are transported from cytosol into the lumen of the endoplasmic reticulum (ER) by a peptide transporter, which is known as the transporter associated with antigen processing (TAP). TAP is a trimeric complex consisting of TAP1, TAP2 and tapasin (TAP-A). TAP1 and TAP2 are required for peptide transport. Tapasin, which actually serves as a docking site on the TAP complex specific for interaction with class I MHC molecules, is essential for peptide loading (up to four MHC class I-tapasin complexes have been found to bind to each TAP molecule). However, since the exact mechanisms of tapasin functions are still unknown, it has also been speculated that tapasin may regulate the MHC class I release from the ER rather than directly loading peptides onto MHC class I molecules. In studies of the interaction between MHC class I and TAP, it was found that TAP1, but not TAP2, is required for the association of TAP with class I molecules. Because tapasin is essential for the association of MHC class I to TAP, tapasin may directly interact with TAP1. Thus the predicted order of interaction between different molecules in the TAP complex is TAP2 to TAP1, TAP1 to tapasin, and tapasin to MHC class I molecules. Thus, by these linked events, the translocation and loading of peptides rapidly and efficiently proceed in the same microenvironment. Tapasin is a type I transmembrane (TM) glycoprotein with a double lysine motif that is thought to be involved with mediating the retrieval of proteins back from the cis-Golgi, thus maintaining membrane proteins in the ER. It is encoded by an MHC-linked gene and is a member of the immunoglobulin superfamily.
Binding to TAP is mediated by the C-terminal region, whereas its N-terminal 50 residues constitute the key element that converts the weak interactions between MHC class I molecules and TAP into a stable complex.
Protein Domain
Name: Mediator complex, subunit Med29, metazoa
Type: Family
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP. Med29, along with Med11 and Med28, in mammals, is part of the core head-region of the complex. Med29 is the apparent orthologue of the Drosophila melanogaster Intersex protein, which interacts directly with, and functions as a transcriptional coactivator for, the DNA-binding transcription factor Doublesex, so it is likely that mammalian Med29 serves as a target for one or more DNA-binding transcriptional activators.
Protein Domain
Name: Voltage-dependent calcium channel, gamma-5 subunit
Type: Family
Description: Ca2+ ions are unique in that they not only carry charge but they are also the most widely used of diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism, are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins. The activity of this pore is modulated by four tightly-coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein including gating voltage-dependence, G protein modulation and kinase susceptibility can be influenced by these subunits. Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blocks, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium channel classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels. The voltage-dependent calcium channel gamma (VDCCG) subunit family consists of at least 8 members, which share a number of common structural features. Each member is predicted to possess 4 transmembrane domains, with intracellular N- and C-termini.
The first extracellular loop contains a highly conserved N-glycosylation site and a pair of conserved cysteine residues. The C-terminal 7 residues of VDCCG-2, -3, -4 and -8 are also conserved and contain a consensus site for phosphorylation by cAMP and cGMP-dependent protein kinases, and a target site for binding by PDZ domain proteins. The VDCCG-5 subunit was identified by genomic database searching, pursuing sequences similar to VDCCG-1 and -2. Mouse, human and rat isoforms have been cloned. VDCCG-5 is expressed in a range of tissues, including brain, kidney and testis.
Protein Domain
Name: Voltage-dependent calcium channel, gamma-8 subunit
Type: Family
Description: Ca2+ ions are unique in that they not only carry charge but they are also the most widely used of diffusible second messengers. Voltage-dependent Ca2+ channels (VDCC) are a family of molecules that allow cells to couple electrical activity to intracellular Ca2+ signalling. The opening and closing of these channels by depolarizing stimuli, such as action potentials, allows Ca2+ ions to enter neurons down a steep electrochemical gradient, producing transient intracellular Ca2+ signals. Many of the processes that occur in neurons, including transmitter release, gene transcription and metabolism, are controlled by Ca2+ influx occurring simultaneously at different cellular locales. The pore is formed by the alpha-1 subunit, which incorporates the conduction pore, the voltage sensor and gating apparatus, and the known sites of channel regulation by second messengers, drugs, and toxins. The activity of this pore is modulated by four tightly-coupled subunits: an intracellular beta subunit; a transmembrane gamma subunit; and a disulphide-linked complex of alpha-2 and delta subunits, which are proteolytically cleaved from the same gene product. Properties of the protein, including gating voltage-dependence, G protein modulation and kinase susceptibility, can be influenced by these subunits. Voltage-gated calcium channels are classified as T, L, N, P, Q and R, and are distinguished by their sensitivity to pharmacological blocks, single-channel conductance kinetics, and voltage-dependence. On the basis of their voltage activation properties, the voltage-gated calcium channel classes can be further divided into two broad groups: the low (T-type) and high (L, N, P, Q and R-type) threshold-activated channels. The voltage-dependent calcium channel gamma (VDCCG) subunit family consists of at least 8 members, which share a number of common structural features. Each member is predicted to possess 4 transmembrane domains, with intracellular N- and C-termini. 
The first extracellular loop contains a highly conserved N-glycosylation site and a pair of conserved cysteine residues. The C-terminal 7 residues of VDCCG-2, -3, -4 and -8 are also conserved and contain a consensus site for phosphorylation by cAMP- and cGMP-dependent protein kinases, and a target site for binding by PDZ domain proteins. The VDCCG-8 subunit was identified by high throughput genomic sequence database searching, pursuing sequences similar to VDCCG-1 to -5. Mouse and rat isoforms have been cloned. VDCCG-8 mRNA is expressed in the brain and testis.
Protein Domain
Name: Phosphotransferase system, glucitol/sorbitol-specific IIA component, subgroup
Type: Family
Description: The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates; coupled with translocation across the cell membrane, this makes the PTS a link between the uptake and metabolism of sugars. The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred via a signal transduction pathway to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme II (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains: IIA, IIB and IIC. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII). The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site. An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose. 
The Man family is unique in several respects among PTS permease families: it is the only PTS family in which members possess a IID protein; it is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteinyl residue; and its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. This family consists only of glucitol-specific transporters, and occurs in both Gram-negative and Gram-positive bacteria. The system in Escherichia coli consists of a IIA protein and a IIBC protein. This family is specific for the IIA component.
Protein Domain
Name: Phosphotransferase system, glucitol/sorbitol-specific IIA component
Type: Family
Description: The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates; coupled with translocation across the cell membrane, this makes the PTS a link between the uptake and metabolism of sugars. The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred via a signal transduction pathway to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme II (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains: IIA, IIB and IIC. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII). The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site. An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose. 
The Man family is unique in several respects among PTS permease families: it is the only PTS family in which members possess a IID protein; it is the only PTS family in which the IIB constituent is phosphorylated on a histidyl rather than a cysteinyl residue; and its permease members exhibit broad specificity for a range of sugars, rather than being specific for just one or a few sugars. This family consists only of glucitol-specific transporters, and occurs in both Gram-negative and Gram-positive bacteria. The system in Escherichia coli consists of a IIA protein and a IIBC protein. This family is specific for the IIA component.
Protein Domain
Name: 7TM GPCR, serpentine receptor class e (Sre)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. 
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class e (Sre) from the Sra superfamily.
Protein Domain
Name: 7TM GPCR, serpentine receptor class z (Srz)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. 
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class z (Srz), a solo family amongst the superfamilies of chemoreceptors. The genes encoding Srz appear to be under strong adaptive evolutionary pressure.
Protein Domain
Name: 7TM GPCR, serpentine receptor class x (Srx)
Type: Domain
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. 
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents a domain found in serpentine receptor class x (Srx) from the Srg superfamily. Srg receptors contain seven hydrophobic, putative transmembrane regions and can be distinguished from other 7TM GPCR receptors by their own characteristic TM signatures.
Protein Domain
Name: 7TM GPCR, serpentine receptor class xa (Srxa)
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The nematode Caenorhabditis elegans has only 14 types of chemosensory neuron, yet is able to sense and respond to several hundred different chemicals because each neuron detects several stimuli. Chemoperception is one of the central senses of soil nematodes like C. elegans, which are otherwise 'blind' and 'deaf'. Chemoreception in C. elegans is mediated by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs). More than 1300 potential chemoreceptor genes have been identified in C. elegans, which are generally prefixed sr for serpentine receptor. 
The receptor superfamilies include Sra (Sra, Srb, Srab, Sre), Str (Srh, Str, Sri, Srd, Srj, Srm, Srn) and Srg (Srx, Srt, Srg, Sru, Srv, Srxa), as well as the families Srw, Srz, Srbc, Srsx and Srr. Many of these proteins have homologues in Caenorhabditis briggsae. This entry represents serpentine receptor class xa (Srxa), from the Srg superfamily.
Protein Domain
Name: Domain of unknown function DUF5740
Type: Domain
Description: This domain of unknown function is found in proteins from Platyhelminthes, including proteins annotated as death-domain containing proteins.
Protein Domain
Name: ZNF706/At2g23090 superfamily
Type: Homologous_superfamily
Description: Proteins with this domain include zinc finger protein 706 from animals and uncharacterised protein At2g23090 from Arabidopsis.
Protein Domain
Name: Outer membrane protein, MIM1/TOM13, mitochondrial
Type: Family
Description: The TOM13 family of proteins are mitochondrial outer membrane proteins that mediate the assembly of β-barrel proteins.
Protein Domain
Name: SH3D21, SH3 domain
Type: Domain
Description: This entry represents the N-terminal SH3 domain of the uncharacterized protein SH3 domain-containing protein 21, and similar uncharacterized proteins.
Protein Domain
Name: Synergin gamma
Type: Family
Description: Synergin gamma is an EH domain-containing protein that may link the adapter protein complex AP-1 to other proteins.
Protein Domain
Name: BH2638-like superfamily
Type: Homologous_superfamily
Description: This superfamily represents an orthogonal bundle domain found in proteins belonging to the uncharacterised protein family UPF0223, including protein BH2638.
Protein Domain
Name: Poxvirus serine/threonine kinase
Type: Family
Description: This family of proteins contains poxvirus serine/threonine protein kinases, which are essential for phosphorylation of virion proteins during virion assembly.
Protein Domain
Name: Herpesvirus UL41A
Type: Family
Description: This entry represents proteins from Herpesvirus. The protein UL41A is thought to be a membrane associated glycoprotein but is currently uncharacterised.
Protein Domain
Name: Calmodulin-regulated spectrin-associated protein-like, Calponin-homology domain
Type: Domain
Description: This entry represents the Calponin-homology (CH) domain found at the N-terminus of Calmodulin-regulated spectrin-associated proteins (CAMSAP proteins) and related proteins from animals.
Protein Domain
Name: S-adenosyl-L-methionine binding protein, YMR209C, predicted
Type: Family
Description: This entry represents proteins predicted to function as S-adenosyl-L-methionine binding proteins, such as the uncharacterised protein YMR209C from Saccharomyces cerevisiae (Baker's yeast).
Protein Domain
Name: Orthopoxvirus A36R
Type: Family
Description: This family consists of several Orthopoxvirus A36R proteins. The A36R protein is predicted to be a type Ib membrane protein.
Protein Domain
Name: Herpesvirus UL96 family
Type: Family
Description: The proteins in this entry belong to the Herpesviridae UL96 family. They include protein UL96 and protein U68. Currently no function is known.
Protein Domain
Name: Domain of unknown function DUF5817
Type: Domain
Description: This domain is present in a family of proteins predominantly found in Halobacteria. These proteins are thought to be the replication protein H.
Protein Domain
Name: Phosphoribosyltransferase-like
Type: Homologous_superfamily
Description: This superfamily represents a phosphoribosyltransferase-like domain. Proteins containing this domain include phosphoribosyltransferases (PRTases), phosphoribosylpyrophosphate synthetase-like proteins and some uncharacterised proteins.
Protein Domain
Name: Nonaspanin (TM9SF)
Type: Family
Description: The transmembrane 9 superfamily protein (TM9SF) may function as a channel or small molecule transporter. Proteins in this group are endosomal integral membrane proteins.
Protein Domain
Name: Helix-turn-helix, conjugative transposon-like
Type: Domain
Description: This domain appears to be a helix-turn-helix domain, suggesting a transcriptional regulatory protein. Some proteins with this domain are annotated as conjugative transposon proteins.
Protein Domain
Name: Selenium-dependent molybdenum hydroxylase system protein, YqeB family
Type: Domain
Description: Members of this protein family are probable accessory proteins for the biosynthesis of enzymes with labile selenium-containing centres, different from selenocysteine-containing proteins.
Protein Domain
Name: Envelope glycoprotein N domain
Type: Domain
Description: Glycoprotein N (gN) is a Herpesvirus envelope glycoprotein necessary for proper maturation of glycoprotein M (gM). This entry represents the C-terminal domain.
Protein Domain
Name: HnRNP-L/PTB
Type: Family
Description: Included in this family of heterogeneous ribonucleoproteins are PTB (polypyrimidine tract binding protein) and hnRNP-L. These proteins contain four RNA recognition motifs.
Protein Domain
Name: Domain of unknown function DUF5671
Type: Domain
Description: This domain is functionally uncharacterised. Proteins containing this domain can be found in bacteria and archaea. These proteins are likely to be integral membrane proteins.
Protein Domain
Name: C-protein
Type: Family
Description: This entry represents C protein. C protein is one of the proteins involved in the production and packaging of viral single-stranded DNA.
Protein Domain
Name: DM16 repeat
Type: Repeat
Description: This repeat of unknown function has been found in Ciona intestinalis (sea squirt) COS41.4 protein, Caenorhabditis elegans R01H10.6 protein and Drosophila melanogaster (Fruit fly) CG1126 protein.
Protein Domain
Name: Herpesvirus Glycoprotein B
Type: Family
Description: This family of proteins are the surface glycoproteins of various herpesviruses. The glycoprotein is anchored to the lipid envelope of the virus by a transmembrane region.
Protein Domain
Name: MucBP domain
Type: Domain
Description: The MucBP (MUCin-Binding Protein) domain is found in a wide variety of bacterial proteins, in several repeats. The domain is found in bacterial peptidoglycan bound proteins.
Protein Domain
Name: Calcium homeostasis modulator family
Type: Family
Description: This entry represents a group of voltage-gated ion channel proteins, including calcium homeostasis modulator protein 1/2/3/4/5/6. This family is also known as FAM26 proteins.
Protein Domain
Name: CNP1-like, uncharacterised domain
Type: Domain
Description: This group of proteins are likely to be lipoproteins. CNP1 (cryptic neisserial protein) has been expressed in Escherichia coli and shown to be localised to the periplasm.
Protein Domain
Name: HipA-like kinase
Type: Domain
Description: This entry represents a domain found in proteins that are distantly related to the HipA protein. Members of this group of proteins are found in bacteria.
Protein Domain
Name: SipA, vinculin binding site
Type: Conserved_site
Description: This motif includes the three vinculin binding sites found in the Shigella SipA/IpaA protein. Proteins containing this site also include some proteins from Chlamydia species.
Protein Domain
Name: Dehydrase, ECs4332, predicted
Type: Family
Description: This group represents a predicted dehydrase, ECs4332 type. Some proteins in this entry share protein sequence similarity with 3-hydroxymyristoyl-(acyl carrier protein) dehydratase FabZ from E. coli.
Protein Domain
Name: E3 ubiquitin-protein ligase RNF126-like, zinc-ribbon
Type: Domain
Description: This entry represents the zinc-ribbon domain found in RNF126 and related proteins. RNF126 is an E3 ubiquitin-protein ligase that mediates ubiquitination of target proteins.
Protein Domain
Name: Envelope glycoprotein precursor, Hantavirus
Type: Family
Description: The medium (M) genome segment of Hantaviruses encodes the two virion glycoproteins, G1 and G2, as a polyprotein precursor. This entry represents the M polyprotein precursor.
Protein Domain
Name: Domain of unknown function DUF4352
Type: Domain
Description: This immunoglobulin-like domain can be found in a group of proteins that fall into the antigen MPT63/MPB63 (immunoprotective extracellular protein) superfamily, such as uncharacterised lipoprotein YjhA.
Protein Domain
Name: CppA, C-terminal
Type: Domain
Description: This is the C-terminal domain of the CppA protein found mainly in species of Streptococcus. CppA is a putative C3-glycoprotein degrading proteinase, involved in pathogenicity.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom