Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported (e.g. dros* for partial matches, or fly AND NOT embryo to exclude a term).
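The query forms listed above can be illustrated with a minimal sketch. It only demonstrates the intended semantics of OR, quoted phrases, trailing wildcards and AND NOT on a toy set of records; the record names and texts are invented for illustration, and this is not the site's actual search engine.

```python
from fnmatch import fnmatchcase

# Hypothetical toy records; names and contents are made up for this sketch.
docs = {
    "doc1": "drosophila embryo development",
    "doc2": "fly wing allele",
    "doc3": "zebrafish embryo",
    "doc4": "dna binding protein",
}
tokens = {name: text.lower().split() for name, text in docs.items()}


def matches(pattern, words):
    """Return True if any word matches the pattern ('*' acts as a wildcard)."""
    return any(fnmatchcase(word, pattern.lower()) for word in words)


# fly OR drosophila: either term may be present
either = {n for n, w in tokens.items() if matches("fly", w) or matches("drosophila", w)}

# "dna binding": the words must appear together, in order
phrase = {n for n, text in docs.items() if "dna binding" in text.lower()}

# dros*: partial match on the start of a word
partial = {n for n, w in tokens.items() if matches("dros*", w)}

# fly AND NOT embryo: first term present, second term absent
excluded = {n for n, w in tokens.items() if matches("fly", w) and not matches("embryo", w)}
```

Set comprehensions keep each query form visible as a single expression, which makes the difference between OR, phrase, wildcard and exclusion easy to compare side by side.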

Search results 14701 to 14800 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: KiSS-1 peptide receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
The metastasis suppressor gene KiSS-1 encodes a number of RFamide-related peptides, the largest of which, metastin, contains 54 amino acids. An orphan G protein-coupled receptor, GPR54, has been identified as a receptor for these peptides. GPR54 is highly expressed in placenta, pituitary, pancreas and spinal cord. Binding of KiSS-1-encoded peptides to the receptor results in coupling to the Gq pathway, stimulating calcium mobilisation, phosphatidylinositol hydrolysis, arachidonic acid release, ERK and p38 MAP kinase phosphorylation and stress fibre formation. It also inhibits cell proliferation. The distribution of GPR54, together with the finding that administration of KiSS-1 peptides in rat stimulates oxytocin secretion, suggests a role in regulation of endocrine function.
Protein Domain
Name: G protein-coupled receptor 162
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Computational methods, including percent identity plots, hydropathy profiles and BLAST, have been used to analyse a gene-rich cluster at human chromosome 12p13 and to compare it with its syntenic region in mouse chromosome 6. Of 6 genes identified, a number were novel receptors, including GPR153 (also known as PGR1) and GPR162 (also known as GRCA). GPR153 is a cerebellar target of the Gli1 transcription factor, which is involved in the maintenance and proliferation of granule neuron precursor cells in the cerebellum, and like GPR162 has a noted role in food uptake and decision making processes.
This entry represents G-protein coupled receptor 162.
Protein Domain
Name: Neurokinin NK2 receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs.
The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices.
Neuropeptide receptors are present in very small quantities in the cell and are embedded tightly in the plasma membrane. The neuropeptides exhibit a high degree of functional diversity through both regulation of peptide production and through peptide-receptor interaction. The mammalian tachykinin system consists of 3 distinct peptides: substance P, substance K and neuromedin K. All possess a common spectrum of biological activities, including sensory transmission in the nervous system and contraction/relaxation of peripheral smooth muscles, and each interacts with a specific receptor type. In the periphery, NK2 receptors are found in smooth muscle of the respiratory, gastrointestinal and urogenital systems. NK2 receptors activate the phosphoinositide pathway through a pertussis-toxin-insensitive G-protein, probably of the Gq/G11 class.
Protein Domain
Name: Spike glycoprotein S1, N-terminal domain, betacoronavirus-like
Type: Domain
Description: This entry represents the N-terminal domain of the betacoronavirus-like trimeric spike glycoprotein. The distal S1 subunit of the coronavirus spike protein is responsible for receptor binding. S1 contains two domains: an N-terminal galectin-like domain (NTD) and a receptor-binding domain (S1 RBD), also referred to as the S1 CTD or domain B. Either the S1 NTD or S1 RBD, or occasionally both, are involved in binding the host receptors. S1 NTD is located on the side of the spike trimer and mainly recognises sugar receptors. For many betacoronaviruses (bCoVs), for example mouse hepatitis virus (MHV), the RBD is located in the NTD. The structure of the MHV S1 NTD showed the same fold as human galectins (galactose-binding lectins); however, it does not bind any sugar. Instead, it binds to the carcinoembryonic antigen cell-adhesion molecule (CEACAM1) through protein-protein interactions. All three CEACAM1a-binding sites in MHV spikes can be fully occupied by CEACAM1a. It has been shown that CEACAM1a binding to the MHV spike weakens the interactions between S1 and S2 and facilitates the proteolysis of the spike protein and dissociation of S1. The homologous bovine CoV (BCoV) S1 NTD also possesses a galectin fold but binds to sialic acid-containing moieties on host cell membranes, as does the NTD of three other group A bCoVs, namely human CoV (HCoV) OC43, avian bCoV, and infectious bronchitis virus (IBV). Despite the S1 NTD of human respiratory bCoV HKU1 being highly homologous to the NTDs of MHV and bovine CoV, it does not bind to either sugar or human CEACAMs and the RBD is found instead in the S1 RBD domain.
The bCoV NTDs contain a conserved β-sandwich core, but exhibit variant folds in the peripheral elements located in the top-ceiling region and on the lateral side. The core sandwich comprises in total sixteen anti-parallel β-strands, assembling into three (upper, middle, and lower) β-sheet layers. While showing different compositions and structures, the peripheral elements are topologically equivalent β-sandwich-core insertions, highlighting a divergent evolution process for bCoVs to form different lineages.
Protein Domain
Name: Photosystem II PsbJ superfamily
Type: Homologous_superfamily
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product.
PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection.
This entry represents the low molecular weight transmembrane protein PsbJ found in PSII. PsbJ is one of the most hydrophobic proteins in the thylakoid membrane, and is located in a gene cluster with PsbE, PsbF and PsbL (PsbEFJL). Both PsbJ and PsbL are essential for proper assembly of the OEC. Mutations in PsbJ cause the light-harvesting antenna to remain detached from the PSII dimers. In addition, both PsbJ and PsbL are involved in the unidirectional flow of electrons, where PsbJ regulates the forward electron flow from D2 (Qa) to the plastoquinone pool, and PsbL prevents the reduction of PSII by back electron flow from plastoquinol, protecting PSII from photo-inactivation.
Protein Domain
Name: Interferon regulatory factor, DNA-binding domain
Type: Domain
Description: Viral infections induce the expression of type I interferon (IFN-alpha and IFN-beta) genes. The induction is due to the transcriptional activation of the IFN genes. Interferon regulatory factor I (IRF-1) is one of the transcription factors responsible for that activation. IRF-1 binds to an upstream regulatory cis element, known as the interferon consensus sequence (ICS), which is found in the promoters of type I IFN and IFN-inducible MHC class I genes. Interferon regulatory factor 2 (IRF-2) is a protein that also interacts with the ICS, but that does not function as an activator; rather, it suppresses the function of IRF-1 under certain circumstances.
These proteins share a highly conserved N-terminal domain of about 100 amino acid residues which is involved in DNA-binding and which contains five conserved tryptophans. This domain is known as a 'tryptophan pentad repeat' or a 'tryptophan cluster' and is also present in:
  • Interferon consensus sequence binding protein (ICSBP), a transcription factor expressed predominantly in lymphoid tissues and induced by IFN-gamma that also binds to the ICS.
  • Transcriptional regulator ISGF3 gamma subunit. ISGF3 is responsible for the initial stimulation of interferon-alpha-responsive genes. It recognises and binds to the interferon-stimulated response element (ISRE) within the regulatory sequences of target genes.
  • Interferon regulatory factor 3 (IRF-3).
  • Interferon regulatory factor 4 (IRF-4), which binds to the interferon-stimulated response element (ISRE) of the MHC class I promoter.
  • Interferon regulatory factor 5 (IRF-5).
  • Interferon regulatory factor 6 (IRF-6).
  • Interferon regulatory factor 7 (IRF-7).
  • Gamma herpesvirus vIRF-1, -2 and -3, proteins with homology to the cellular transcription factors of the IRF family. Neither vIRF-1 nor vIRF-2 binds to DNA with the same specificity as cellular IRFs, indicating that if vIRFs are DNA-binding proteins, their binding has a pattern distinct from that of the cellular IRFs. Whether vIRF-3 can bind DNA with the same specificity as cellular IRFs is not known.
The IRF tryptophan pentad repeat DNA-binding domain has an alpha/beta architecture comprising a cluster of three α-helices (alpha1-alpha3) flanked on one side by a mixed four-stranded β-sheet (beta1-beta4). It forms a helix-turn-helix motif that binds to ISRE consensus sequences found in target promoters. Three of the tryptophan residues contact DNA by recognising a GAAA sequence.
This entry represents the IRF tryptophan pentad repeat DNA-binding domain.
Protein Domain
Name: Anthrax toxin, edema factor, C-terminal
Type: Homologous_superfamily
Description: Anthrax toxin is a plasmid-encoded toxin complex produced by the Gram-positive, spore-forming bacterium Bacillus anthracis. The toxin consists of three non-toxic proteins: the protective antigen (PA), the lethal factor (LF) and the edema factor (EF). These component proteins self-assemble at the surface of host cell receptors, yielding a series of toxic complexes that can produce shock-like symptoms and death. Anthrax toxin is one of a large group of Bacillus and Clostridium exotoxins referred to as binary toxins, forming independent enzymatic (A moiety) and binding (B moiety) components. The LF and EF proteins are the enzymes (A moiety) that act on cytosolic substrates, while PA is a multi-functional protein (B moiety) that binds to cell surface receptors, mediates the assembly and internalisation of the complexes, and delivers them to the host cell endosome. Once PA is attached to the host receptor, it must then be cleaved by a host cell surface (furin family) protease before it is able to bind EF and LF. The cleavage of the N terminus of PA enables the C-terminal fragment to self-associate into a ring-shaped heptameric complex (prepore) that can bind LF or EF competitively. The PA-LF/EF complex is then internalised by endocytosis, and delivered to the endosome, where PA forms a pore in the endosomal membrane in order to translocate LF and EF to the cytosol. LF is a Zn-dependent metalloprotease that cleaves and inactivates mitogen-activated protein (MAP) kinases, kills macrophages, and causes death of the host by inhibiting cell proliferation. EF is a calcium- and calmodulin-dependent adenylyl cyclase that can cause edema (fluid-filled swelling) when associated with PA. EF is not toxic by itself, and is required for the survival of germinated Bacillus spores within macrophages at the early stages of infection. EF dramatically elevates the level of host intracellular cAMP, a ubiquitous messenger that integrates many processes of the cell; increases in cAMP can interfere with host intracellular signalling.
This entry represents a C-terminal region in the edema factor, the calmodulin-activated adenylate cyclase component of anthrax toxin, as well as in adenylyl cyclases from other bacterial toxins. The C-terminal region contains the calmodulin-dependent activation domain and the catalytic site.
Protein Domain
Name: Mediator complex, subunit Med7
Type: Family
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.
This family consists of several eukaryotic proteins, which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multistep process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus.
Protein Domain
Name: Arterivirus papain-like cysteine protease beta (PCPbeta) domain
Type: Domain
Description: Arteriviruses are enveloped, positive-stranded RNA viruses and include pathogens of major economic concern to the swine- and horse-breeding industries:
  • Equine arteritis virus (EAV).
  • Porcine reproductive and respiratory syndrome virus (PRRSV).
  • Mouse lactate dehydrogenase-elevating virus (LDV).
  • Simian hemorrhagic fever virus.
The arterivirus replicase gene is composed of two open reading frames (ORFs). ORF1a is translated directly from the genomic RNA, whereas ORF1b can be expressed only by ribosomal frameshifting, yielding a 1ab fusion protein. Both replicase gene products are multidomain precursor proteins which are proteolytically processed into functional nonstructural proteins (nsps) by a complex proteolytic cascade that is directed by four (PRRSV/LDV) or three (EAV) proteinase domains encoded in ORF1a. The arterivirus replicase processing scheme involves the rapid autoproteolytic release of two or three N-terminal nsps (nsp1 (or nsp1alpha/1beta) and nsp2) and the subsequent processing of the remaining polyproteins by the "main protease" residing in nsp4, together resulting in a set of 13 or 14 individual nsps. The arterivirus nsp1 region contains a tandem of papain-like cysteine autoprotease domains (PCPalpha and PCPbeta), but in EAV PCPalpha has lost its enzymatic activity, resulting in the 'merge' of nsp1alpha and nsp1beta into a single nsp1 subunit. Thus, instead of three self-cleaving N-terminal subunits, EAV has two: nsp1 and nsp2. The PCPalpha and PCPbeta domains mediate the nsp1alpha|1beta and nsp1beta|2 cleavages, respectively. The catalytic dyad of PCPalpha and PCPbeta domains is composed of Cys and His residues. In EAV, a Lys residue is found in place of the catalytic Cys residue, which explains the proteolytic deficiency of the EAV PCPalpha domain. The PCPalpha and PCPbeta domains form peptidase families C31 and C32, respectively.
The PCPalpha and PCPbeta domains have a typical papain fold, which consists of a compact global region containing sequentially connected left (L) and right (R) parts in a so-called standard orientation. The L subdomain of PCPalpha consists of four α-helices, while the R subdomain is formed by three antiparallel β-strands. The L subdomain of PCPbeta consists of three α-helices, while the R subdomain is formed by four antiparallel β-strands. The Cys and His residues face each other at the L-R interface and form the catalytic centre of the PCPalpha and PCPbeta domains.
This entry represents the PCPbeta domain (peptidase C32).
Protein Domain
Name: Probable [NiFe]-hydrogenase-type-3 Eha complex membrane subunit A
Type: Family
Description: [NiFe] hydrogenases function in H2 metabolism in a variety of microorganisms, enabling them to use H2 as a source of reducing equivalents under aerobic and anaerobic conditions. [NiFe] hydrogenases consist of two subunits, hydrogenase large and hydrogenase small. The large subunit contains the binuclear [NiFe] active site, while the small subunit binds at least one [4Fe-4S] cluster.
Energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type) form a distinct group within the [NiFe] hydrogenase family. Members of this subgroup include:
  • Hydrogenase 3 and 4 (Hyc and Hyf) from Escherichia coli
  • CO-induced hydrogenase (Coo) from Rhodospirillum rubrum
  • Mbh hydrogenase from Pyrococcus furiosus
  • Eha and Ehb hydrogenases from Methanothermobacter species
  • Ech hydrogenase from Methanosarcina barkeri
Energy-converting [NiFe] hydrogenases are membrane-bound enzymes with a six-subunit core: the large and small hydrogenase subunits, plus two hydrophilic proteins and two integral membrane proteins. Their large and small subunits show little sequence similarity to other [NiFe] hydrogenases, except for key conserved residues coordinating the active site and [FeS] cluster. However, they show considerable sequence similarity to the six-subunit, energy-conserving NADH:quinone oxidoreductases (complex I), which are present in cytoplasmic membranes of many bacteria and in inner mitochondrial membranes. The reactions they catalyse nevertheless differ significantly from those of complex I. Energy-converting [NiFe] hydrogenases function as ion pumps.
Eha and Ehb hydrogenases contain extra subunits in addition to those shared by other energy-converting [NiFe] hydrogenases (or [NiFe]-hydrogenase-3-type). Eha contains a 6[4Fe-4S] polyferredoxin, a 10[4Fe-4S] polyferredoxin, ten other predicted integral membrane proteins (EhaA, EhaB, EhaC, EhaD, EhaE, EhaF, EhaG, EhaI, EhaK and EhaL) and four hydrophilic subunits (EhaM, EhaR, EhaS, EhaT). The ten predicted integral membrane proteins are absent from Ech, Coo, Hyc and Hyf complexes, which may have simpler membrane components than Eha. Eha and Ehb catalyse the reduction of low-potential redox carriers (e.g. ferredoxins or polyferredoxins), which then might function as electron donors to oxidoreductases.
Based on sequence similarity and genome context analysis, other organisms such as Methanopyrus kandleri, Methanocaldococcus jannaschii, and Methanothermobacter marburgensis also encode Eha-like [NiFe]-hydrogenase-3-type complexes and have very similar eha operon structure. This entry represents small membrane proteins that are predicted to be the EhaA transmembrane subunits of multisubunit membrane-bound [NiFe]-hydrogenase Eha complexes.
Protein Domain
Name: Parvalbumin
Type: Family
Description: Fish allergies are common in Europe, particularly among male children and young adults. Children allergic to fish react variably to different species. Cod is among the most common offenders, while salmon is the one best tolerated. The allergy-eliciting protein has been isolated from the white muscle albumin. It is a parvalbumin, designated Allergen M. Parvalbumins are calcium (Ca)-binding proteins of low molecular weight. Like many other Ca-binding proteins, they belong to the EF-hand family characterised by helix-loop-helix (HLH) binding motifs (two helices pack together at an angle of ~90 degrees, separated by a loop region where calcium binds). In the parvalbumin HLH structural motif, calcium is coordinated through one carbonyl oxygen atom and the oxygen-containing side-chains of 5 amino acid residues, or 4 residues and a water molecule.
Initially, parvalbumins were detected in relatively high amounts in lower vertebrate white muscle, where they were thought to be important for fibre relaxation. They were subsequently found, although in lesser amounts, in the fast twitch skeletal muscles of higher vertebrates, as well as in a variety of non-muscle tissues, including testis, endocrine glands, skin and specific neurons. There are two distinct phylogenetic lineages: alpha and beta. Most muscles contain parvalbumin of only alpha or beta origin. Cod parvalbumin belongs to the beta-lineage and shares significant similarity with parvalbumin of other fish species.
Allergen M contains 113 residues, is a homogenous acidic protein and belongs to a group of muscle sarcoplasmic proteins. It carries the major allergenic determinants associated with cod sensitivity, which is dependent directly on the linear structure rather than on the molecular conformation. The allergenic activity of allergen M resides in particular epitopes found in three loops: AB (~13-33), CD (~48-64) and EF (~80-103). It has an N-acetyl terminal amino acid residue and includes 1 residue of glucose attached to the conserved N-terminal cysteine, and 1 residue each of tyrosine, tryptophan and arginine; the arginine is believed to play a key role in maintaining the tertiary structure. Mutation of the last conserved coordinating residue of the Ca-binding loop (E101D-motif 4) has also been shown to have a significant impact on the ability of the mutant to obtain the sevenfold coordination preferred by Ca2+.
Protein Domain
Name: Mediator complex, subunit Med15, fungi
Type: Family
Description: The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. 
Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.
This family represents subunit 15 of the Mediator complex in fungi. It contains Saccharomyces cerevisiae GAL11 (Med15) protein. Gal11 (Med15) and Sin4 (Med16) proteins are S. cerevisiae global transcription factors that regulate transcription of a variety of genes, both positively and negatively. Gal11 functions for the most part in the activation of transcription, whereas Sin4 has an opposite role.
Protein Domain
Name: Kalirin/Triple functional domain protein, SH3 domain 1
Type: Domain
Description: This entry includes a group of RhoGEFs, including Kalirin and Triple functional domain protein (TRIO) from mammals. Kalirin and TRIO are encoded by separate genes in mammals and by a single one in invertebrates. Kalirin and TRIO share the same complex multidomain structure and display several splice variants. They are implicated in secretory granule (SG) maturation and exocytosis. The longest Kalirin and TRIO proteins have a Sec14 domain, a stretch of spectrin repeats, a RhoGEF(DH)/PH cassette (also called GEF1), an SH3 domain, a second RhoGEF(DH)/PH cassette (also called GEF2), a second SH3 domain, Ig/FNIII domains, and a kinase domain. The first RhoGEF(DH)/PH cassette catalyses exchange on Rac1 and RhoG while the second RhoGEF(DH)/PH cassette is specific for RhoA. Kalirin and TRIO are closely related to p63RhoGEF and have PH domains of similar function. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner.
TRIO contains a protein kinase domain and two guanine nucleotide exchange factor (GEF) domains. These functional domains suggest that it may play a role in signalling pathways controlling cell proliferation. TRIO may form a complex with LAR transmembrane protein tyrosine phosphatase (PTPase), which localises to the ends of focal adhesions and plays an important part in coordinating cell-matrix and cytoskeletal rearrangements necessary for cell migration. Its expression is associated with invasive tumour growth and rapid tumour cell proliferation in urinary bladder cancer.
Kalirin promotes the exchange of GDP for GTP and stimulates the activity of specific Rho GTPases. There are several Kalirin isoforms in humans and mice. Each Kalirin isoform is composed of a unique collection of domains and may have different functions. In rat, isoforms 1 and 7 are necessary for neuronal development and axonal outgrowth, while isoform 6 is required for dendritic spine formation. In humans, the major isoform of Kalirin in the adult brain is Kalirin-7, which plays a critical role in spine formation/synaptic plasticity. Kalirin-7 has been linked to neuropsychiatric and neurological diseases such as Alzheimer's, Huntington's, ischemic stroke, schizophrenia, depression, and cocaine addiction.
This entry represents the first SH3 domain present in Kalirin and TRIO.
Protein Domain
Name: Photosystem II PsbZ, reaction centre
Type: Family
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. PSII is a multisubunit protein-pigment complex containing polypeptides both intrinsic and extrinsic to the photosynthetic membrane [ , , ]. Within the core of the complex, the chlorophyll and beta-carotene pigments are mainly bound to the antenna proteins CP43 (PsbC) and CP47 (PsbB), which pass the excitation energy on to the reaction centre proteins D1 (Qb, PsbA) and D2 (Qa, PsbD) that bind all the redox-active cofactors involved in the energy conversion process. The PSII oxygen-evolving complex (OEC) oxidises water to provide protons for use by PSI, and consists of OEE1 (PsbO), OEE2 (PsbP) and OEE3 (PsbQ). The remaining subunits in PSII are of low molecular weight (less than 10 kDa), and are involved in PSII assembly, stabilisation, dimerisation, and photo-protection []. This family represents PsbZ (Ycf9), which is a core low molecular weight transmembrane protein of photosystem II in thylakoid-containing chloroplasts of cyanobacteria and plants. It is thought to be located at the interface of PSII and LHCII (light-harvesting complex II) complexes, the latter containing the light-harvesting antenna. 
PsbZ appears to act as a structural factor, or linker, that stabilises the PSII-LHCII supercomplexes, which fail to form in PsbZ-deficient mutants. This may in part be due to the marked decrease in two LHCII antenna proteins, CP26 and CP29, found in PsbZ-deficient mutants, which results in structural changes, as well as functional modifications, in PSII [ ]. PsbZ may also be involved in photo-protective processes under sub-optimal growth conditions.
Protein Domain
Name: GPCR fungal pheromone mating factor, STE2
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups []. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. GPCR fungal pheromone mating factor receptors form a distinct family of G-protein-coupled receptors, and are also known as Class D GPCRs. The fungal pheromone mating factor receptors STE2 and STE3 are integral membrane proteins that may be involved in the response to mating factors on the cell membrane [ , , ]. The amino acid sequences of both receptors contain high proportions of hydrophobic residues grouped into 7 domains, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins. 
However, while a similar 3D framework has been proposed to account for this, there is no significant sequence similarity either between STE2 and STE3, or between these and the rhodopsin-type family: the receptors therefore bear their own unique '7TM' signatures, which is why they have been given their own GPCR group: Class D Fungal mating pheromone receptors. This entry represents the STE2-type family of GPCR. The STE2 gene of Saccharomyces cerevisiae (Baker's yeast) encodes the cell-surface receptor that binds the 13-residue peptide α-factor.
Protein Domain
Name: ABC transporter C family, six-transmembrane helical domain 1
Type: Domain
Description: This entry represents the six-transmembrane domain 1 (TMD1) of the ABC transporters that belong to the ABCC subfamily [ ]. This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterised by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids and polypeptides []. ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [ ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions []. 
Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
Protein Domain
Name: Tensin-type phosphatase domain
Type: Domain
Description: Tensins constitute a eukaryotic family of lipid phosphatases that are defined by the presence of two adjacent domains: a lipid phosphatase domain and a C2-like domain. The tensin-type C2 lacks the canonical Ca(2+) ligands found in classical C2 domains, and in this respect it is similar to the C2 domains of PKC-type [ , ]. The tensin-type C2 domain can bind phospholipid membranes in a Ca(2+) independent manner [ ]. In the tumor suppressor protein PTEN, the best characterized member of the family, the lipid phosphatase domain was shown to specifically dephosphorylate the D3 position of the inositol ring of the lipid second messenger, phosphatidylinositol-3,4,5-trisphosphate (PIP3). The lipid phosphatase domain contains the signature motif HCXXGXXR present in the active sites of protein tyrosine phosphatases (PTPs) and dual specificity phosphatases (DSPs). Furthermore, two invariant lysines are found only in the tensin-type phosphatase motif (HCKXGKXR) and are suspected to interact with the phosphate groups at positions D1 and D5 of the inositol ring [, ]. The crystal structure of the PTEN tumor suppressor has been solved [ ]. The lipid phosphatase domain has a structure similar to the dual specificity phosphatase. However, PTEN has a larger active site pocket that could be important to accommodate PI(3,4,5)P3. The tensin-type C2 domain has a structure similar to the classical C2 domain that mediates the Ca2+ dependent membrane recruitment of several signaling proteins. However, the tensin-type C2 domain lacks two of the three conserved loops that bind Ca2+. Proteins known to contain a phosphatase and a C2 tensin-type domain are listed below:
  • Tensin, a focal-adhesion molecule that binds to actin filaments. It may be involved in cell migration, cartilage development and in linking signal transduction pathways to the cytoskeleton.
  • Phosphatase and tensin homologue deleted on chromosome 10 protein (PTEN). It antagonizes PI 3-kinase signalling by dephosphorylating the 3-position of the inositol ring of PI(3,4,5)P3 and thus inactivates downstream signalling. It plays major roles both during development and in the adult to control cell size, growth, and survival.
  • Auxilin. It binds clathrin heavy chain and promotes its assembly into regular cages.
  • Cyclin G-associated kinase or auxilin-2. It is a potential regulator of clathrin-mediated membrane trafficking.
PTEN homologues in fungi have the tensin phosphatase domain, but they lack the C2 domain. This entry represents the phosphatase domain.
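As a rough illustration of the two signature motifs named in this description, the sketch below translates HCXXGXXR and HCKXGKXR into regular expressions, reading X as "any residue". The function name, the pattern names and the toy sequence are all hypothetical and invented for this example; it is not how the database itself detects the domain.

```python
import re

# "X" in the motif notation is read here as "any single residue" (regex ".").
PTP_SIGNATURE = re.compile(r"HC..G..R")      # HCXXGXXR, general PTP/DSP signature
TENSIN_SIGNATURE = re.compile(r"HCK.GK.R")   # HCKXGKXR, tensin-type variant with two lysines


def find_motif(pattern, sequence):
    """Return the 1-based start positions of all non-overlapping matches."""
    return [m.start() + 1 for m in pattern.finditer(sequence)]


# Made-up sequence for illustration only: the first site carries both lysines,
# the second matches only the general signature.
toy = "MAAHCKAGKGRTLLHCSAGVGRSG"

print(find_motif(PTP_SIGNATURE, toy))     # [4, 15]
print(find_motif(TENSIN_SIGNATURE, toy))  # [4]
```

Every tensin-type hit is also a hit for the general signature, so the second pattern is simply a stricter filter on the first.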
Protein Domain
Name: ABC transporter C family, six-transmembrane helical domain 2
Type: Domain
Description: This entry represents the six-transmembrane domain 2 (TMD2) of the ABC transporters that belong to the ABCC subfamily [ ]. This TM subunit exhibits the type 3 ATP-binding cassette (ABC) exporter fold, which is characterised by 6 TM helices per subunit (domain), or a total of 12 TM helices for the complete transporter. The type 3 ABC exporters are found in both prokaryotes and eukaryotes, where they mediate the cellular secretion of toxic compounds, various types of lipids and polypeptides []. ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [ ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [ ]. 
Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).
Protein Domain
Name: Mediator complex, subunit Med7 superfamily
Type: Homologous_superfamily
Description: This superfamily consists of several eukaryotic proteins, which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multistep process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus [ ]. The Mediator complex is a coactivator involved in the regulated transcription of nearly all RNA polymerase II-dependent genes. Mediator functions as a bridge to convey information from gene-specific regulatory proteins to the basal RNA polymerase II transcription machinery. The Mediator complex, having a compact conformation in its free form, is recruited to promoters by direct interactions with regulatory proteins and serves for the assembly of a functional preinitiation complex with RNA polymerase II and the general transcription factors. On recruitment the Mediator complex unfolds to an extended conformation and partially surrounds RNA polymerase II, specifically interacting with the unphosphorylated form of the C-terminal domain (CTD) of RNA polymerase II. The Mediator complex dissociates from the RNA polymerase II holoenzyme and stays at the promoter when transcriptional elongation begins. The Mediator complex is composed of at least 31 subunits: MED1, MED4, MED6, MED7, MED8, MED9, MED10, MED11, MED12, MED13, MED13L, MED14, MED15, MED16, MED17, MED18, MED19, MED20, MED21, MED22, MED23, MED24, MED25, MED26, MED27, MED29, MED30, MED31, CCNC, CDK8 and CDC2L6/CDK11. The subunits form at least three structurally distinct submodules. The head and the middle modules interact directly with RNA polymerase II, whereas the elongated tail module interacts with gene-specific regulatory proteins. Mediator containing the CDK8 module is less active than Mediator lacking this module in supporting transcriptional activation. 
The head module contains: MED6, MED8, MED11, SRB4/MED17, SRB5/MED18, ROX3/MED19, SRB2/MED20 and SRB6/MED22. The middle module contains: MED1, MED4, NUT1/MED5, MED7, CSE2/MED9, NUT2/MED10, SRB7/MED21 and SOH1/MED31. CSE2/MED9 interacts directly with MED4. The tail module contains: MED2, PGD1/MED3, RGR1/MED14, GAL11/MED15 and SIN4/MED16. The CDK8 module contains: MED12, MED13, CCNC and CDK8. Individual preparations of the Mediator complex lacking one or more distinct subunits have been variously termed ARC, CRSP, DRIP, PC2, SMCC and TRAP.
Protein Domain
Name: Prion, copper binding octapeptide repeat
Type: Repeat
Description: Prion protein (PrP-c) [ , , ] is a small glycoprotein found in high quantity in the brain of animals infected with certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jakob disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP-c is encoded in the host genome and is expressed both in normal and infected cells. During infection, however, the PrP-c molecule becomes altered (conformationally rather than at the amino acid level) to an abnormal isoform, PrP-sc. In detergent-treated brain extracts from infected individuals, fibrils composed of polymers of PrP-sc, namely scrapie-associated fibrils or prion rods, can be evidenced by electron microscopy. The precise function of the normal PrP isoform in healthy individuals remains unknown. Several results, mainly obtained in transgenic animals, indicate that PrP-c might play a role in long-term potentiation, in sleep physiology, in oxidative burst compensation (PrP can fix four Cu2+ through its octarepeat domain), in interactions with the extracellular matrix (PrP-c can bind to the precursor of the laminin receptor, LRP), in apoptosis and in signal transduction (costimulation of PrP-c induces a modulation of Fyn kinase phosphorylation) [ ]. The normal isoform, PrP-c, is anchored at the cell membrane, in rafts, through a glycosyl phosphatidyl inositol (GPI); its half-life at the cell surface is 5 h, after which the protein is internalised through a caveolae-dependent mechanism and degraded in the endolysosome compartment. Conversion between PrP-c and PrP-sc likely occurs during the internalisation process. The N-terminal domain of the prion protein includes the N-terminal, positively charged polybasic region and the octapeptide repeat (OR) region. The latter, represented by this entry, has been shown to bind to copper, which has been related to a novel inter-domain interaction. 
Both are important for the conversion of PrP-c into PrP-sc [ , , , ]. The number and organization of the repeats varies and is thought to be related to phenotypic variation. In this way, the insertion of four or more octapeptide repeats has been proved to be pathogenic, while smaller repeat insertions have an unclear pathogenicity. In any case, the presence of this insertion repeat in the protein may slightly increase risk of developing sporadic CJD [].
Protein Domain
Name: U2A'/phosphoprotein 32 family A, C-terminal
Type: Domain
Description: This motif occurs C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. Examples from the metazoa are described as either "Acidic leucine-rich nuclear phosphoprotein 32 family member A" or have been characterised as U2A', the protein that interacts with U2B'', facilitating the interaction with U2 snRNA. U2A' is required for the spliceosome assembly and the efficient addition of U2 snRNP onto the pre-mRNA [ ]. The crystal structure of the spliceosomal U2B''-U2A' protein complex bound to a fragment of U2 small nuclear RNA has been described [ ].
Protein Domain
Name: SCAR/WAVE family
Type: Family
Description: This entry represents the SCAR/WAVE family. Members in this family include actin-binding protein WASF1-3 (WAVE 1-3), protein SCAR 1-4, protein WAVE5 and SCAR-like proteins. SCAR/WAVE family members are downstream effector molecules receiving information from multiple signalling pathways and responding by promoting the actin nucleating activity of the ubiquitous Arp2/3 complex [ ]. They are part of the WAVE complex that regulates lamellipodia formation in animals and maintains cell shape in plants [, , ]. WAVE1 is also involved in the regulation of mitochondrial dynamics [].
Protein Domain
Name: ClpA/B, conserved site 1
Type: Conserved_site
Description: The ClpA/B family of ATP-binding proteins includes the regulatory subunit of the ATP-dependent protease Clp, ClpA; heat shock proteins ClpB, 104 and 78; and chloroplast proteins CD4a (ClpC) and CD4b [ , ]. The proteins are thought to protect cells from stress by controlling the aggregation and denaturation of vital cellular structures. They vary in size, but share two conserved regions of about 200 amino acids that each contain an ATP-binding site []. This entry represents a conserved site found in the first conserved region.
Protein Domain
Name: PAZ domain
Type: Domain
Description: This domain is named after the proteins Piwi, Argonaute and Zwille. It is also found in the CAF protein from Arabidopsis thaliana. The function of the domain is unknown, but it has been found in the middle region of a number of members of the Argonaute protein family, which also contain the Piwi domain ( ) in their C-terminal region [ ]. Several members of this family have been implicated in the development and maintenance of stem cells through the RNA-mediated gene-quelling mechanisms associated with the protein DICER.
Protein Domain
Name: TPX2, C-terminal
Type: Domain
Description: This entry represents a conserved region approximately 60 residues long within the eukaryotic targeting protein for Xklp2 (TPX2). Xklp2 is a kinesin-like protein localised on centrosomes throughout the cell cycle and on spindle pole microtubules during metaphase. In Xenopus, it has been shown that Xklp2 protein is required for centrosome separation and maintenance of spindle bi-polarity [ , ]. TPX2 is a microtubule-associated protein that mediates the binding of the C-terminal domain of Xklp2 to microtubules. It is phosphorylated during mitosis in a microtubule-dependent way [].
Protein Domain
Name: Lipoprotein NlpA family
Type: Family
Description: This entry represents bacterial lipoproteins that belong to the NlpA family [ ]. It contains several antigenic members that may be involved in bacterial virulence. This entry includes the D-methionine binding lipoprotein MetQ, which is the substrate-binding component of a D-methionine permease, a binding protein-dependent, ATP-driven transport system [, ]. Other members of this family, such as NlpA, have been identified as putative substrate-binding components of ABC transporters. NlpA is an inner-membrane-anchored lipoprotein that has been shown to have a minor role in methionine import [, ].
Protein Domain
Name: Retrovirus capsid, N-terminal
Type: Homologous_superfamily
Description: The Gag polyprotein from retroviruses is processed by viral protease to produce the major structural proteins, including the capsid protein. The newly formed capsid protein rearranges to form the capsid core particle that surrounds the viral genome of the mature virus. The capsid is composed of two domains, the N-terminal domain (NTD), which contributes to viral core formation, and the C-terminal domain (CTD), which is required for capsid dimerisation, Gag oligomerization and viral formation. The NTD is composed of a five-helix bundle [ , ].
Protein Domain
Name: Omega-3 polyunsaturated fatty acid synthase-like
Type: Domain
Description: This entry represents a group of proteins based on a PfaB protein family. The protein PfaB family is part of a four-gene locus, which is similar to polyketide biosynthesis systems, responsible for omega-3 polyunsaturated fatty acid biosynthesis in several high pressure and/or cold-adapted bacteria. The fairly permissive trusted cut off set for this HMM allows detection of homologues encoded near homologues to other proteins of the locus: PfaA, PfaC, and/or PfaD. The likely role in every case is either polyunsaturated fatty acid or polyketide biosynthesis.
Protein Domain
Name: AP-2 complex subunit sigma
Type: Family
Description: Adaptor protein complexes are vesicle coat components and appear to be involved in cargo selection and vesicle formation. AP-2 complex subunit sigma (APS2, also known as AP17) is a component of the adaptor protein complex which links clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration [ ]. The AP-2 alpha and AP-2 sigma subunits are thought to contribute to the recognition of the [ED]-X-X-X-L-[LI] motif [].
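The [ED]-X-X-X-L-[LI] notation in this description can be read as a simple pattern: an acidic residue (E or D), any three residues, a leucine, then a leucine or isoleucine. The sketch below expresses that reading as a regular expression. The names `AP2_MOTIF`, `motif_sites` and the toy cytoplasmic tail are hypothetical, and a sequence match alone says nothing about whether a site is actually recognised by the AP-2 subunits.

```python
import re

# [ED]-X-X-X-L-[LI]: acidic residue, three arbitrary residues, Leu, then Leu or Ile.
AP2_MOTIF = re.compile(r"[ED]...L[LI]")


def motif_sites(sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in AP2_MOTIF.finditer(sequence)]


# Made-up tail sequence for illustration only.
toy_tail = "KKERQPLLSSDKTALIAA"

print(motif_sites(toy_tail))  # [(3, 'ERQPLL'), (11, 'DKTALI')]
```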
Protein Domain
Name: CC2D2A, N-terminal, C2 domain
Type: Domain
Description: A C2 domain is usually involved in targeting proteins to cell membranes. Ciliary CC2D2A protein has two C2 domains and an inactive transglutaminase-like peptidase domain (CC2D2A-TGL) [ ]. This entry represents the first C2 domain. CC2D2A (coiled-coil and C2 domain-containing protein 2A) is a component of the tectonic-like complex, a complex localised at the transition zone of primary cilia and acting as a barrier that prevents diffusion of transmembrane proteins between the cilia and plasma membranes []. It is required for ciliogenesis and sonic hedgehog/SHH signalling [].
Protein Domain
Name: CshA domain
Type: Domain
Description: This entry represents a domain that is found in a variety of bacterial cell surface proteins. Many proteins with this domain have a C-terminal LPXTG anchor. The best studied protein containing this domain is the CshA adhesin protein from Streptococcus gordonii that contains 17 tandem copies of the domain [ ]. The S. gordonii fibrillar adhesin CshA plays an important role in host colonization. The structure of the CshA domain shows a variant of the immunoglobulin fold lacking the typical A and B strands [].
Protein Domain
Name: Golgi phosphoprotein 3-like domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a domain found in several eukaryotic GPP34-like proteins. GPP34 (also known as Golgi phosphoprotein 3) localises to the Golgi complex and is conserved from Saccharomyces cerevisiae to humans. The cytosolically exposed location of GPP34 predicts a role for a novel coat protein in Golgi trafficking [ ]. The budding yeast GPP34 homologue, also known as Vps74, is a phosphatidylinositol-4-phosphate-binding protein that links Golgi membranes to the cytoskeleton and may participate in the tensile force required for vesicle budding from the Golgi [].
Protein Domain
Name: Mesothelin
Type: Family
Description: This family consists of several mammalian pre-pro-megakaryocyte potentiating factor precursor (MPF) or mesothelin proteins. Mesothelin is a glycosylphosphatidylinositol-linked glycoprotein highly expressed in mesothelial cells, mesotheliomas, and ovarian cancer, which participates in cell adhesion, tumour progression, metastasis, and drug resistance [ , , ]. This protein is predicted to have superhelical structures with ARM-type repeats, which suggests that it may act as a superhelical lectin to bind the carbohydrate moieties of extracellular glycoproteins []. Due to its overexpression in various tumours, it constitutes a diagnostic and therapy target [].
Protein Domain
Name: PABC (PABP) domain
Type: Homologous_superfamily
Description: The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains ( ) [ ]. PABP recognises the 3' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail [].
Protein Domain
Name: Golgi associated RAB2 interactor protein-like, Rab2B-binding domain
Type: Domain
Description: This domain is found in Golgi-associated RAB2 interactor proteins and it is likely a Rab2B-binding domain [ , , ]. Some members included in this entry, such as GAR1A/B and GAR2, are RAB2B effector proteins required for accurate acrosome formation and normal male fertility [, ]. GARIL5 is a Rab2B effector protein which promotes cytosolic DNA-induced innate immune responses [] and GARI-L4 is a RAB2B effector protein required for the compacted Golgi morphology []. This domain is also found in TASOR2, whose function is unknown.
Protein Domain
Name: Vesiculovirus matrix
Type: Family
Description: This family consists of several Vesiculovirus matrix proteins. The matrix (M) protein of vesicular stomatitis virus (VSV) expressed in the absence of other viral components causes many of the cytopathic effects of VSV, including an inhibition of host gene expression and the induction of cell rounding. It has been shown that M protein also induces apoptosis in the absence of other viral components. It is thought that the activation of apoptotic pathways causes the inhibition of host gene expression and cell rounding by M protein [ ].
Protein Domain
Name: Porin, Oms28 type
Type: Family
Description: The outer membrane-spanning (Oms) proteins of Borrelia burgdorferi have been isolated and their porin activities characterised; 0.6-nS porin activity was found to reside in a 28 kDa protein, designated Oms28 [ ]. The gene sequence of oms28 was found to encode a 257-amino-acid precursor protein with a putative 24-amino-acid leader peptidase I signal sequence []. The Oms28 protein partly fractionated to the outer membrane, and was characterised by an average single-channel conductance of 1.1 nS in a planar lipid bilayer assay, confirming Oms28 to be a porin [].
Protein Domain
Name: Coatomer subunit gamma, C-terminal
Type: Domain
Description: This entry represents the very C-terminal domain of the eukaryotic Coatomer subunit gamma proteins. It acts as a platform domain to the C-terminal appendage. It carries one single protein/protein interaction site, which is the binding site for ARFGAP2 or ADP-ribosylation factor GTPase-activating protein. COPI-coated vesicles mediate retrograde transport from the Golgi back to the ER and intra-Golgi transport. The gamma subunit is part of one of two subcomplexes that make up the heptameric coatomer complex along with the beta, delta and zeta subunits [ ].
Protein Domain
Name: Mastadenovirus E4/Orf3
Type: Family
Description: This family consists of several Mastadenovirus E4 ORF3 proteins. Early proteins E4 ORF3 and E4 ORF6 have complementary functions during viral infection. Both proteins facilitate efficient viral DNA replication, late protein expression, and prevention of concatenation of viral genomes. A unique function of E4 ORF3 is the reorganisation of nuclear structures known as PML oncogenic domains (PODs). The function of these domains is unclear, but PODs have been implicated in a number of important cellular processes, including transcriptional regulation, apoptosis, transformation, and response to interferon [ ].
Protein Domain
Name: KANL3/Tex30, alpha/beta hydrolase-like domain
Type: Domain
Description: This entry represents an α/β hydrolase-fold domain of the animal KAT8 regulatory NSL complex subunit 3 (KANSL3 or NSL3, also known as testis development protein PRTD) and the testis-expressed sequence 30 proteins. KAT8 regulatory NSL complex subunit 3 is part of the NSL complex that is involved in acetylation of nucleosomal histone H4 on several lysine residues and therefore may be involved in the regulation of transcription [ ]. The function of the testis-expressed sequence 30 protein is not known. This entry also includes uncharacterized bacterial proteins.
Protein Domain
Name: Mitochondrial import receptor subunit Tom40, fungi
Type: Family
Description: Tom40 is a mitochondrion outer membrane protein and a component of the TOM (translocator of the outer mitochondrial membrane) complex, which is essential for import of protein precursors into mitochondria [ ]. In Saccharomyces cerevisiae, TOM complex is composed of the subunits Tom70, Tom40, Tom22, Tom20, Tom7, Tom6, and Tom5 [, ]. Tom40 is an integral membrane protein and the main structural component of the protein-conducting channel formed by the TOM complex []. It is stabilised by other components, such as Tom5, Tom6, and Tom7 [].
Protein Domain
Name: NLRC4, helical domain
Type: Domain
Description: This is a helical domain found in NLRC4, a nucleotide-binding and oligomerization domain-like receptor (NLR) protein. Structural and functional studies indicate that the helical domain HD2 repressively contacts a conserved and functionally important α-helix of the NBD (nucleotide binding domain) in NLRC4. Furthermore, the HD2 domain was shown to cap the N-terminal side of the LRR (leucine-rich repeat) domain via extensive interactions [ ]. Other proteins carrying this domain include baculoviral IAP repeat-containing protein 1 (Birc1), also known as neuronal apoptosis inhibitory protein (Naip).
Protein Domain
Name: WWE domain superfamily
Type: Homologous_superfamily
Description: The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. This domain is found as a tandem repeat at the N-terminal of Deltex, a cytosolic effector of Notch signalling thought to bind the N-terminal of the Notch receptor [ ]. It is also found as an interaction module in protein ubiquitination and ADP ribosylation proteins []. Structurally, the WWE domain has a β(2)-α-β(4) fold with antiparallel folded β-sheet (half-barrel).
Protein Domain
Name: FNBP1L, SH3 domain
Type: Domain
Description: This entry represents the SH3 domain of FNBP1L. Formin-binding protein 1-like (FNBP1L, also known as Toca-1) is part of the Toca-1-N-WASP complex that induces the formation of filopodia and endocytic vesicles [ ]. FNBP1L consists of an F-BAR domain, a Cdc42 binding site and an SH3 domain []. It belongs to the CIP4 (Cdc42 interacting protein-4) subfamily of the F-BAR (F for FCH, Fer-CIP4 homology domain) protein family. The F-BAR proteins have been implicated in cell membrane processes such as membrane invagination, tubulation and endocytosis [].
Protein Domain
Name: FNBP1L, F-BAR domain
Type: Domain
Description: This entry represents the F-BAR domain of FNBP1L. Formin-binding protein 1-like (FNBP1L, also known as Toca-1) is part of the Toca-1-N-WASP complex that induces the formation of filopodia and endocytic vesicles [ ]. FNBP1L consists of an F-BAR domain, a Cdc42 binding site and an SH3 domain []. It belongs to the CIP4 (Cdc42 interacting protein-4) subfamily of the F-BAR (F for FCH, Fer-CIP4 homology domain) protein family. The F-BAR proteins have been implicated in cell membrane processes such as membrane invagination, tubulation and endocytosis [].
Protein Domain
Name: Rhabdovirus nucleocapsid, C-terminal
Type: Homologous_superfamily
Description: All negative-strand RNA viruses contain a ribonucleoprotein (RNP) complex that consists of the viral genome RNA completely enwrapped by the nucleoprotein (N). The crystal structure of the N protein in complex with a single-strand RNA has been determined for two members of the rhabdovirus family, VSV [] and RABV []. The regions in contact with RNA were mapped to a cavity formed between two structural lobes of the nucleoprotein: the N-terminal arm and the extended loop in the C-terminal lobe []. This superfamily represents the C-terminal domain.
Protein Domain
Name: Rhabdovirus nucleocapsid, N-terminal
Type: Homologous_superfamily
Description: All negative-strand RNA viruses contain a ribonucleoprotein (RNP) complex that consists of the viral genome RNA completely enwrapped by the nucleoprotein (N). The crystal structure of the N protein in complex with a single-strand RNA has been determined for two members of the rhabdovirus family, VSV [] and RABV []. The regions in contact with RNA were mapped to a cavity formed between two structural lobes of the nucleoprotein: the N-terminal arm and the extended loop in the C-terminal lobe []. This superfamily represents the N-terminal domain.
Protein Domain
Name: TcpC, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of TcpC. TcpC is required for efficient conjugative transfer, localizing to the cell membrane independently of other conjugation proteins, where membrane localization is important for its function, oligomerization and interaction with the conjugation proteins TcpA, TcpH, and TcpG [ ]. Its C-terminal domain is critical for interactions with these other conjugation proteins []. TcpC has low level sequence identity to proteins encoded by the conjugative transposon Tn916, which is responsible for a large proportion of the tetracycline resistance in different pathogens [].
Protein Domain
Name: EXPERA domain
Type: Domain
Description: The EXPERA (EXPanded EBP superfamily) domain is conserved among the following protein families: Vertebrate TM6SF1 (Transmembrane 6 Superfamily Member 1). Vertebrate TM6SF2 (Transmembrane 6 Superfamily Member 2). Eukaryotic MAC30 (Meningioma-associated protein 30) also known as TMEM97 (Transmembrane protein 97), mainly localized in the endoplasmic reticulum (ER). Eukaryotic EBP (Emopamil binding protein), an enzyme with a D8, D7 sterol isomerase activity that catalyzes the transposition of a double bond from C8=C9 to C7=C8 in the sterol B-ring. The EXPERA domain contains four transmembrane regions and is likely to possess a sterol isomerase catalytic activity [].
Protein Domain
Name: FRAS1-related extracellular matrix protein, N-terminal domain
Type: Domain
Description: This entry represents a domain found at the N terminus of FRAS1-related extracellular matrix proteins (also known as Frem related proteins) predominantly in chordates. In humans, the FRAS1-related extracellular matrix protein 1 was reported to be essential for the normal adhesion of the embryonic epidermis [ ]. FRAS1-related extracellular matrix protein 2 is involved in the development of eyelids and the anterior segment of the eyeballs and has been related to cryptophthalmos (a rare congenital disorder characterized by ocular dysplasia with eyelid malformation) [, ].
Protein Domain
Name: Peptidase G2, IMC autoproteolytic cleavage domain
Type: Domain
Description: This domain is found at the very C terminus of bacteriophage parallel β-helical tailspike proteins. It carries the enzymic residues that induce autoproteolytic cleavage to bring about maturation of the folding process of the helix in a chaperone-like manner. The domain thus mediates the assembly of a large tailspike protein and then releases itself after maturation. These C-terminal regions that autoproteolytically release themselves after maturation are exchangeable between functionally unrelated N-terminal proteins and have been identified in a number of bacteriophage tailspike proteins [ ].
Protein Domain
Name: Brdt, bromodomain, repeat II
Type: Domain
Description: Human Brdt is a testis-specific member of the BET subfamily of bromodomain proteins. The BET proteins contain two bromodomains and a region of homology in the C-terminal region, designated the extra terminal (ET) motif. The first bromodomain in Brdt has been shown to be essential for male germ cell differentiation [ ]. Bromodomains are 110-amino-acid-long domains that are found in many chromatin-associated proteins. Bromodomains can interact specifically with acetylated lysine []. This entry represents the second bromodomain found in Brdt and related proteins.
Protein Domain
Name: Brdt, bromodomain, repeat I
Type: Domain
Description: Human Brdt is a testis-specific member of the BET subfamily of bromodomain proteins. The BET proteins contain two bromodomains and a region of homology in the C-terminal region, designated the extra terminal (ET) motif. The first bromodomain in Brdt has been shown to be essential for male germ cell differentiation [ ]. Bromodomains are 110-amino-acid-long domains that are found in many chromatin-associated proteins. Bromodomains can interact specifically with acetylated lysine []. This entry represents the first bromodomain found in Brdt and related proteins.
Protein Domain
Name: Coronavirus Orf3a/b
Type: Family
Description: Members of this family are non-structural proteins that are found in alphacoronavirus, including Transmissible gastroenteritis virus (TGEV) and Porcine respiratory coronavirus (PRCV). These proteins are found on the same mRNA as another product, designated ORF3a and they are referred to as 3a-like accessory proteins found in multiple alpha and betacoronavirus lineages that infect bats and humans. They are transmembrane proteins of the viroporin family that form ion channels in the host membrane and have been implicated in inducing apoptosis, pathogenicity, and virus release [ , , ].
Protein Domain
Name: Costars domain superfamily
Type: Homologous_superfamily
Description: This entry represents the Costars domain superfamily. This domain is found both alone (in the costars family of proteins) and at the C terminus of actin-binding Rho-activating protein (ABRA). It binds to actin, and in muscle regulates the actin cytoskeleton and cell motility [ , ]. It has a winged helix-like fold consisting of three α-helices and four antiparallel β-strands. Unlike typical winged helix proteins it does not bind to DNA, but contains a hydrophobic groove which may be responsible for interaction with other proteins [].
Protein Domain
Name: AhpD-like
Type: Homologous_superfamily
Description: This superfamily represents the α-helical domain found in alkyl-hydroperoxide reductase AhpD [ ]. AhpD is a component of alkyl-hydroperoxide reductase participating in defense against ROS (reactive oxygen species) []. The C-terminal α-helical domain of the AhpD protein from Mycobacterium tuberculosis shares protein sequence and structural similarity with the N-terminal of Sesns from animals []. Sestrins (Sesns) are involved in ROS (reactive oxygen species) regulation []. A similar fold can also be found in other proteins such as 4-carboxymuconolactone decarboxylase (CMD) and some uncharacterised proteins [ ].
Protein Domain
Name: Mga helix-turn-helix domain
Type: Domain
Description: M regulator protein trans-acting positive regulator (Mga) is a DNA-binding protein that activates the expression of several important virulence genes in group A streptococcus in response to changing environmental conditions [ ]. This domain is found in the centre of the Mga proteins. This domain is also found in a number of bacterial RofA transcriptional regulators that seem to be largely restricted to streptococci. These proteins have been shown to regulate the expression of important bacterial adhesins []. This is presumably a DNA-binding domain.
Protein Domain
Name: Peroxiredoxin OsmC-like protein, Firmicutes
Type: Family
Description: Osmotically inducible protein C (OsmC) is a stress-induced protein found in Escherichia coli. The transcription of the osmC gene of E. coli is regulated as a function of the phase of growth and is induced during the late exponential phase when the growth rate slows before entry into stationary phase. The transcription is initiated by two overlapping promoters, osmCp1 and osmCp2 [ ]. This entry represents proteins that belong to the OsmC/Ohr family and are restricted to Firmicutes. Proteins in this family include YmaD from Bacillus subtilis.
Protein Domain
Name: M protein-type anchor domain
Type: Domain
Description: Viruses, parasites and bacteria are covered in protein and sugar molecules that help them gain entry into a host by counteracting the host's defences. One such molecule is the M protein produced by certain streptococcal bacteria [ ]. M proteins embody an anchor motif that is now known to be shared by many Gram-positive bacterial surface proteins. The motif includes a conserved hexapeptide, with the consensus sequence LPXTGE, which precedes a hydrophobic region, which itself precedes a cluster of basic residues [ , ].
Protein Domain
Name: Histone-arginine methyltransferase CARM1, N-terminal
Type: Domain
Description: Histone-arginine methyltransferase CARM1 (also known as coactivator-associated arginine methyltransferase 1) methylates arginine residues in several proteins involved in DNA packaging, transcription regulation, and mRNA stability [ ]. CARM1 is recruited by several transcription factors and plays a critical role in gene expression as a positive regulator. This entry represents the N-terminal domain of CARM1. Structurally this domain adopts a PH domain-like fold, a common structural scaffold found in a broad range of proteins with diverse activities, which is frequently found to regulate protein-protein interactions [].
Protein Domain
Name: TATA element modulatory factor 1 DNA binding
Type: Family
Description: This is the middle region of a family of TATA element modulatory factor 1 (TMF1) proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal [ ] and plant [] cells.
Protein Domain
Name: Ferritin, conserved site
Type: Conserved_site
Description: Ferritin is one of the major non-heme iron storage proteins in animals, plants and microorganisms [ , ]. It consists of a mineral core of hydrated ferric oxide, and a multi-subunit protein shell which encloses the former and assures its solubility in an aqueous environment. In animals, the protein is mainly cytoplasmic and there are generally two or more genes that encode closely related subunits (in mammals there are two subunits which are known as H(eavy) and L(ight)). In plants ferritin is found in the chloroplast []. This entry represents the central region of this protein.
Protein Domain
Name: Glutamine-Leucine-Glutamine, QLQ
Type: Domain
Description: The QLQ domain is characterised by the conserved Gln-Leu-Gln residues. Another feature of this domain is the absolute conservation of bulky aromatic/hydrophobic and acidic amino acid residues such as Phe, Trp, Tyr, Leu, Glu, or their equivalents in terms of chemical and radial properties. The Pro residue is also absolutely conserved. These amino acid residues are critical for the function of the QLQ domain, probably for protein-protein interaction [ ]. Some proteins known to contain a QLQ domain are listed below: Plant GROWTH-REGULATING FACTOR (GRF) proteins, putative transcription factors. Eukaryotic SWI2/SNF2, transcriptional coactivators.
Protein Domain
Name: tRNA N(3)-methylcytidine methyltransferase METTL2/6/8-like
Type: Family
Description: This family includes tRNA N(3)-methylcytidine methyltransferases METTL2, 6 and 8, O-methyltransferase 3 and methyltransferase-like protein Metl from Drosophila. These proteins are S-adenosyl-L-methionine-dependent methyltransferases that mediate N3-methylcytidine modification of residue 32 of the tRNA anticodon loop of tRNA(Thr) and tRNA(Ser) [ , , , , , , ]. tRNA N(3)-methylcytidine methyltransferase METTL2, 6 and 8 are part of a group of proteins known as TIPs (from tension-induced/inhibited protein) required for the recruitment of histone acetyltransferase p300 to specific promoters. TIP-6 is involved in the adipogenic cascade [ ]. O-methyltransferase 3 is up-regulated by phagocytic stimuli [].
Protein Domain
Name: SRR1-like domain
Type: Domain
Description: This domain is found in SRR1-like proteins. SRR1 are signalling proteins thought to be involved in regulating the circadian clock input pathway, which is required for normal oscillator function. In Arabidopsis thaliana it regulates the expression of clock-regulated genes such as CCA1 and TOC1. It is also involved in both the phytochrome B (PHYB) and PHYB-independent signaling pathways [ ]. The mouse homologue of the plant circadian-regulating protein SRR1 plays roles in heme-regulated circadian rhythms and cell proliferation []. The yeast SRR1-like protein Ber1 is involved in microtubule stability [ ].
Protein Domain
Name: RecF/RecN/SMC, N-terminal
Type: Domain
Description: This domain is found at the N terminus of structural maintenance of chromosomes (SMC) proteins, which function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression [ ]. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics []. This domain is also found in RecF and RecN proteins, which are involved in DNA metabolism and recombination.
Protein Domain
Name: Splicing factor, RBM39-like
Type: Family
Description: This entry represents RBM39 (also known as CAPER) proteins from mammals and a group of putative RNA splicing factors including the Pad1 protein from fungi. All are characterised by an N-terminal arginine-rich, low complexity domain followed by three (or in the case of 4 human paralogs, two) RNA recognition domains. These proteins are closely related to the U2AF splicing factor family. In mice, CAPER is a transcriptional coactivator of activating protein-1 (AP-1) and estrogen receptors (ERs) [ ]. It may be involved in the pre-mRNA splicing process.
Protein Domain
Name: Aromatic amino acid beta-eliminating lyase/threonine aldolase
Type: Domain
Description: This domain is found in many tryptophanases (tryptophan indole-lyase, TNase), tyrosine phenol-lyases (TPL) and threonine aldolases. It is involved in the degradation of amino acids. The glycine cleavage system is composed of four proteins: P, T, L and H. In Bacillus subtilis, the P 'protein' is a heterodimer of two subunits. The glycine cleavage system catalyses the degradation of glycine. The P protein binds the alpha-amino group of glycine through its pyridoxal phosphate cofactor; CO(2) is released and the remaining methylamine moiety is then transferred to the lipoamide cofactor of the H protein.
Protein Domain
Name: Gamma tubulin
Type: Family
Description: Gamma-tubulins constitute a ubiquitous and highly-conserved subfamily of the tubulin family. Gamma-tubulin is a low abundance protein present within cells in both various types of microtubule-organizing centers and cytoplasmic protein complexes. The protein recruits the alpha/beta-tubulin dimers that form the minus ends of microtubules and is found at microtubule organising centres (MTOC) such as the spindle poles or the centrosome, suggesting that it is involved in microtubule nucleation [ , , ]. It exists in two main protein complexes: gamma-tubulin ring complexes (gamma-TuRCs) and the gamma-tubulin small complex (gamma-TuSC) [].
Protein Domain
Name: Glycylpeptide N-tetradecanoyltransferase, conserved site
Type: Conserved_site
Description: Glycylpeptide N-tetradecanoyltransferase, also known as Myristoyl-CoA:protein N-myristoyltransferase (Nmt), is the enzyme responsible for transferring a myristate group on the N-terminal glycine of a number of eukaryotic cellular and viral proteins [ ]. Nmt is a monomeric protein of about 50 to 60 kDa whose sequence appears to be well conserved. In Drosophila, this protein is critical for the developmental processes that involve cell shape changes and movement []. Two highly conserved regions are found. The first one is located in the central section, the second in the C-terminal part.
Protein Domain
Name: Glycylpeptide N-tetradecanoyltransferase, C-terminal
Type: Domain
Description: Glycylpeptide N-tetradecanoyltransferase, also known as Myristoyl-CoA:protein N-myristoyltransferase (Nmt), is the enzyme responsible for transferring a myristate group on the N-terminal glycine of a number of eukaryotic cellular and viral proteins [ ]. Nmt is a monomeric protein of about 50 to 60 kDa whose sequence appears to be well conserved. In Drosophila, this protein is critical for the developmental processes that involve cell shape changes and movement []. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold. This entry represents the C-terminal region [ ].
Protein Domain
Name: Glycylpeptide N-tetradecanoyltransferase, N-terminal
Type: Domain
Description: Glycylpeptide N-tetradecanoyltransferase, also known as Myristoyl-CoA:protein N-myristoyltransferase (Nmt), is the enzyme responsible for transferring a myristate group on the N-terminal glycine of a number of eukaryotic cellular and viral proteins [ ]. Nmt is a monomeric protein of about 50 to 60 kDa whose sequence appears to be well conserved. In Drosophila, this protein is critical for the developmental processes that involve cell shape changes and movement []. The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold. This entry represents the N-terminal region [ ].
Protein Domain
Name: Sugar/inositol transporter
Type: Family
Description: The sugar transporters belong to a superfamily of membrane proteins responsible for the binding and transport of various carbohydrates, organic alcohols, and acids in a wide range of prokaryotic and eukaryotic organisms [ ]. These integral membrane proteins are predicted to comprise twelve membrane spanning domains. It is likely that the transporters have evolved from an ancient protein present in living organisms before the divergence into prokaryotes and eukaryotes []. In mammals, these proteins are expressed in a number of organs []. This family includes sugar transporters and the myo-inositol transporters.
Protein Domain
Name: Hexapeptide repeat
Type: Repeat
Description: A variety of bacterial transferases contain a repeat structure composed of tandem repeats of a [LIV]-G-X(4) hexapeptide, which, in the tertiary structure of LpxA (Acyl-[acyl-carrier-protein]-UDP-N-acetylglucosamine O-acyltransferase) [ ], has been shown to form a left-handed parallel β-helix. A number of different transferase protein families contain this repeat, such as the bifunctional protein GlmU, galactoside acetyltransferase-like proteins [], the gamma-class of carbonic anhydrases [], and tetrahydrodipicolinate-N-succinyltransferases (DapD), the latter containing an extra N-terminal 3-helical domain []. It has been shown that most hexapeptide acyltransferases form catalytic trimers with three symmetrical active sites [].
Protein Domain
Name: RIN4, pathogenic type III effector avirulence factor Avr cleavage site
Type: Domain
Description: This domain is conserved in small families of otherwise unrelated proteins in both monocots and dicots, suggesting that it has a conserved, plant-specific function. It is found in the plant RIN4 (RPM1-interacting protein 4) where it appears to contribute to the binding of the protein to RCS (AvrRpt2 auto-cleavage site) and AvrB, the virulence factors from the infecting bacterium [ ]. The cleavage site for the AvrRpt2 avirulence protein would appear to be the sequence motifs VPQFGDW and LPKFGEW, both of which are highly conserved within the domain [].
Protein Domain
Name: Transcription regulator HTH, LysR
Type: Domain
Description: Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis of sequence similarity. One such family, the lysR family, groups together a range of proteins, including AmpR, CatM, CatR, CynR, CysB, GltC, IlvY, IrgB, LysR, MetR, NhaR, SyrM, TcbR, TfdS and TrpI [, , , , ]. The majority of these proteins appear to be transcription activators and most are known to negatively regulate their own expression. All possess a potential HTH DNA-binding motif towards their N-terminal end.
Protein Domain
Name: TonB-system energizer ExbB type-1
Type: Family
Description: This entry describes ExbB proteins, part of the MotA/TolQ/ExbB protein family. The paired proteins MotA and MotB, TolQ and TolR, and ExbB and ExbD harness the proton-motive force to drive the flagellar motor, energize the Tol-Pal system, or energize TonB, respectively. Tol-Pal and TonB are both active at the outer membrane. Genomes may have many different TonB-dependent receptors, of which many of those characterised are involved in siderophore transport across the outer membrane. In E. coli, ExbB together with the ExbD and TonB proteins is involved in energy-coupled transport across the outer membrane [ ].
Protein Domain
Name: Thiaminase II
Type: Family
Description: The TenA protein of Bacillus subtilis and Staphylococcus aureus [ , ], and the C-terminal region of trifunctional protein Thi20p from Saccharomyces cerevisiae [], perform cleavages on thiamine and related compounds to produce 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP), a substrate of a salvage pathway for thiamine biosynthesis. The gene symbol tenA, for Transcription ENhancement A, reflects a misleading early characterisation as a regulatory protein. This family is related to PqqC from the PQQ biosynthesis system, heme oxygenase, and CADD (Chlamydia protein Associating with Death Domains), a putative folate metabolism enzyme.
Protein Domain
Name: AP-1 complex subunit sigma
Type: Family
Description: The heterotetrameric adaptor protein (AP)-1 complex consists of one large gamma-, one beta-, one mu-, and one sigma-subunit. AP complexes link the cytosolic domains of the cargo proteins to the protein coat that induces vesicle budding in the donor compartment during vesicle transport [ , ]. In the case of AP-1 the coat protein is clathrin. AP-1 binds the phospholipid PI(4)P which plays a role in its localisation to the trans-Golgi network (TGN)/endosome []. This entry represents the sigma subunit, which comprises a single longin domain and plays a role in binding dileucine-based sorting signals.
Protein Domain
Name: TGF beta-induced protein/periostin
Type: Family
Description: This entry includes TGF beta-induced protein (TGFBI, also known as betaig-H3 and keratoepithelin) and Periostin. They are paralogues that contain a single emilin (EMI) and four fasciclin-1 (FAS1) modules and are secreted from cells [ ]. Transforming growth factor-beta-induced protein is an extracellular matrix protein that plays a role in a wide range of physiological and pathological conditions including diabetes, corneal dystrophy and tumorigenesis [ ]. Periostin (also known as osteoblast-specific factor 2) is a secreted cell adhesion protein, which has been linked to cancer, and degenerative/allergic diseases [ , ].
Protein Domain
Name: Virus X resistance protein-like, coiled-coil domain
Type: Domain
Description: The potato virus X resistance protein (RX) confers resistance against potato virus X. It is a member of a family of resistance proteins with a domain architecture that includes an N-terminal coiled-coil domain (modeled here), a nucleotide-binding domain, and leucine-rich repeats (CC-NB-LRR). These intracellular resistance proteins recognize pathogen effector proteins and will subsequently trigger a response that may be as severe as localized cell death [ ]. The N-terminal coiled-coil domain of RX has been shown to interact with RanGAP2, which is a necessary co-factor in the resistance response [].
Protein Domain
Name: Bacteriochlorophyll c-binding superfamily
Type: Homologous_superfamily
Description: Chlorosomes, which are attached to the inner surface of the cytoplasmic membrane, consist of four polypeptides and associated pigments and lipids. The principal light-harvesting pigment of the green filamentous bacterium Chloroflexus aurantiacus is bacteriochlorophyll (Bchl) c. This pigment is either bound to, or constrained by, a small approximately 80-residue polypeptide designated Bchlc-binding protein. In C. aurantiacus, a C-terminal extension is believed to play a role in proper incorporation of the protein during chlorosome assembly [ ]. The protein has a high degree of similarity to Bchlc-binding proteins of other photosynthetic bacteria.
Protein Domain
Name: Extracellular HAF
Type: Repeat
Description: This repeat is approximately 40 amino acids in length and the spacing between repeats is usually four residues. Proteins generally have at least two tandem copies, and can have as many as seven. This repeat is named after a conserved tripeptide motif, HAF, found in most of the proteins. Some proteins containing the repeat are found in species with no outer membrane (archaea and Gram-positive bacteria) while others have C-terminal autotransporter domains that suggest that the repeat region is transported across the outer membrane. This repeat seems likely to be associated with extracellular proteins.
Protein Domain
Name: Alpha-2-macroglobulin, bacteria
Type: Family
Description: This family includes uncharacterised protein UPF0192 and uncharacterised lipoprotein yfhM. They contain alpha-2-macroglobulin domains and may be bacterial alpha-2-macroglobulin homologues. The alpha-macroglobulin (aM) family of proteins includes protease inhibitors. The protease inhibiting mechanism involves protease cleavage of the bait region, a segment of the aM that is particularly susceptible to proteolytic cleavage, which initiates a conformational change such that the aM collapses about the protease. In the resulting aM-protease complex, the active site of the protease is sterically shielded, thus substantially decreasing access to protein substrates [ ].
Protein Domain
Name: TEL2, C-terminal domain superfamily
Type: Homologous_superfamily
Description: This entry represents a conserved domain found in a group of proteins called telomere-length regulation TEL2, or clock abnormal protein-2, which are conserved from plants to humans. These proteins regulate telomere length and contribute to silencing of sub-telomeric regions [ ]. In vitro the protein binds to telomeric DNA repeats. Tel2 acts at an early step of the TEL1/ATM pathway of DNA damage signaling [ ]. The structure of Tel2 consists of HEAT-like helical repeats that assemble into two separate α-solenoids []. This entry represents the C-terminal solenoid which consists of eleven helices.
Protein Domain
Name: Neuraxin/MAP1B repeat
Type: Repeat
Description: In microtubule-associated protein 1B (MAP1B) the basic region containing the KKEE and KKEVI motifs is responsible for the interaction between MAP1B and microtubules in vivo. This region bears no sequence relationship to the microtubule binding domains of kinesin, MAP2, or tau [ ]. Neuraxin is a putative structural protein of the rat central nervous system that is immunologically related to microtubule-associated protein 5 (MAP5). Neuraxin may be implicated in neuronal membrane-microtubule interactions [ ]. Both proteins contain a region that consists of 12 tandem repeats of a 17-residue motif.
Protein Domain
Name: Hri1, N-terminal
Type: Domain
Description: Saccharomyces cerevisiae Hri1 (Hrr25-interacting protein 1, YLR301w) is a non-essential gene product named for its interaction with the yeast protein kinase Hrr25. It has also been characterized as an interaction partner for Sec72p, but does not seem to be required for protein translocation into the ER. It may be a cytosolic protein. Hri1p contains a tandem repeat of a structural unit that forms a β-barrel with structural similarity to nitrobindin. This N-terminal repeat is involved in homodimerization and may contain a ligand binding site [ , , ].
Protein Domain
Name: C2CD5, C2 domain
Type: Domain
Description: C2CD5, also known as CDP138 or KIAA0528, is a C2 domain-containing phosphoprotein. It is a substrate for protein kinase Akt2, and it may be involved in the regulation of GLUT4 vesicle-plasma membrane fusion in response to insulin. The C2 domain of C2CD5 was shown to be capable of binding Ca(2+) and lipid membranes [ ]. Other studies indicate that C2CD5 is a CDK5- and FIBP-interacting protein, forming a complex with these proteins that is involved in cell proliferation and migration []. This entry represents the C2 domain of C2CD5.
Protein Domain
Name: Quinolinate synthetase A superfamily
Type: Homologous_superfamily
Description: Quinolinate synthetase catalyses the second step of the de novo biosynthetic pathway of pyridine nucleotide formation. In particular, quinolinate synthetase is involved in the condensation of dihydroxyacetone phosphate and iminoaspartate to form quinolinic acid [ ]. This synthesis requires two enzymes, an FAD-containing "B protein" and an "A protein". B protein converts L-aspartate to iminoaspartate. The A protein, NadA, converts iminoaspartate to quinolinate. NadA harbours a [4Fe-4S] cluster []. The structure of NadA is composed of three similar domains related by pseudo threefold symmetry. Each domain has three layers (alpha/beta/alpha) with parallel beta sheet.
Protein Domain
Name: PAZ domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily is named after the proteins Piwi, Argonaute and Zwille. It is also found in the CAF protein from Arabidopsis thaliana. The function of the domain is unknown, but it has been found in the middle region of a number of members of the Argonaute protein family, which also contain the Piwi domain ( ) in their C-terminal region [ ]. Several members of this family have been implicated in the development and maintenance of stem cells through the RNA-mediated gene-quelling mechanisms associated with the protein DICER.
Protein Domain
Name: Head-to-tail connector protein, podovirus-type
Type: Family
Description: The tail of bacteriophage T7, a member of the Podoviridae family, is composed of at least four proteins; one of them is gene product 8 (gp8), known as the connector [ ]. DNA enters the viral head during DNA packaging through a channel formed by the connector. The connector sits at a unique 5-fold vertex of the icosahedral capsid [], which is also involved in the delivery of the genome during DNA ejection. Purified gp8 protein has been shown to assemble as a dodecamer []. This family includes connector proteins from Podoviridae and bacterial proteins.
Protein Domain
Name: Winged helix DNA-binding domain superfamily
Type: Homologous_superfamily
Description: Winged helix DNA-binding proteins share a related winged helix-turn-helix DNA-binding motif, where the "wings", or loops, are small β-sheets. The winged helix motif consists of two wings (W1, W2), three α-helices (H1, H2, H3) and three β-sheets (S1, S2, S3) arranged in the order H1-S1-H2-H3-S2-W1-S3-W2 [ ]. The DNA-recognition helix makes sequence-specific DNA contacts with the major groove of DNA, while the wings make different DNA contacts, often with the minor groove or the backbone of DNA. Several winged-helix proteins display an exposed patch of hydrophobic residues thought to mediate protein-protein interactions.
Protein Domain
Name: G-protein coupled receptor 183-like
Type: Family
Description: This family includes G-protein coupled receptor 183 (GP183, also known as EBV-induced G-protein coupled receptor 2) and similar proteins such as the orphan receptor GP141 and G-protein coupled receptor homologue FPV206 from Fowlpox virus. GP183 is a receptor for oxysterol 7-alpha,25-dihydroxycholesterol (7-alpha,25-OHC) and other related oxysterols [ , , ]. It regulates migration of astrocytes and is involved in the communication between astrocytes and macrophages []. It may also act as a chemotactic receptor for some T-cells upon binding to 7-alpha,25-OHC ligand. Together with CXCR5, it mediates B-cell migration [].
Protein Domain
Name: ABC transporter, permease protein, BtuC-like
Type: Family
Description: This entry represents a family of ABC transporter permease proteins. It includes the vitamin B12 import system permease protein BtuC, and the Fe(3+) dicitrate transport system permease proteins FecC and FecD, among many others. BtuC is a component of the ABC transporter complex BtuCD, which facilitates uptake of vitamin B12 into the cytoplasm of Escherichia coli [ ]. FecC and FecD are components of the fec operon, a Periplasmic-Binding-Protein (PBP)-dependent transport system for ferric dicitrate in Escherichia coli [ ]. FecC and FecD exhibit homology to each other and are localized in the cytoplasmic membrane.
Protein Domain
Name: DNAJC17, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of eukaryotic DnaJ homologue subfamily C member 17 (DNAJC17) and similar proteins [ ]. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family []. DNAJC17 may negatively affect PAX8-induced thyroglobulin/TG transcription [ ]. It contains an N-terminal DnaJ domain or J-domain, which mediates the interaction with Hsp70 [], and an RNA recognition motif (RRM) at the C terminus.
Protein Domain
Name: Olfactory receptor subfamily 6C-like
Type: Family
Description: This family includes human olfactory receptor proteins from subfamily 6C and similar proteins such as 6X, 6J, 6T, 6V, as well as related proteins from other vertebrates. Olfactory receptors (ORs) play a central role in olfaction. ORs belong to the class A rhodopsin-like family of G protein-coupled receptors and constitute the largest multigene family in mammals of approximately 1,000 genes. More than 60% of human ORs are non-functional pseudogenes compared to only about 20% in mouse. Each OR can recognize structurally similar odorants, and a single odorant can be detected by several ORs [ ].
Protein Domain
Name: Testis-specific serine/threonine-protein kinase 3, catalytic domain
Type: Domain
Description: STKs (serine/threonine-protein kinases) catalyse the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. TSSK (testis-specific serine kinase) proteins are almost exclusively expressed postmeiotically in the testis and play important roles in spermatogenesis and/or spermiogenesis. There are five mammalian TSSK proteins, which show differences in their localization and timing of expression [ ]. TSSK3 has been reported to be expressed in the interstitial Leydig cells of adult testis. Its mRNA level is low at birth, increases at puberty, and remains high throughout adulthood [, ].
Protein Domain
Name: Polyhydroxyalkanoate synthesis repressor PhaR
Type: Family
Description: This entry identifies the polyhydroxyalkanoate synthesis repressor, PhaR and related proteins. The gene for PhaR regulatory protein is found in general near other genes encoding proteins associated with polyhydroxyalkanoate (PHA) granule biosynthesis and utilization. It is found to be a DNA-binding homotetramer that is also capable of binding short chain hydroxyalkanoic acids and PHA granules. PhaR may regulate the expression of itself, of the phasins that coat granules, and of enzymes that direct carbon flux into polymers stored in granules [ , ]. The C-terminal region is poorly conserved in this family of proteins.
Protein Domain
Name: Nck2, SH2 domain
Type: Domain
Description: Cytoplasmic proteins Nck are non-enzymatic adaptor proteins composed of three SH3 (Src homology 3) domains and a C-terminal SH2 domain [ ]. They regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates []. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics []. They associate with tyrosine-phosphorylated growth factor receptors or their cellular substrates [, ]. There are two vertebrate Nck proteins, Nck1 and Nck2. This entry represents the SH2 domain of Nck2.
Protein Domain
Name: Actin remodeling regulator NHS
Type: Family
Description: Nance-Horan syndrome is an X-linked disorder characterised by congenital cataracts, dental anomalies, dysmorphic features, and, in some cases, mental retardation [ ]. The syndrome is caused by defects in the NHS gene [], which appears to play a key role in the regulation of eye, tooth, brain, and craniofacial development []. It may function in cell morphology by maintaining the integrity of the circumferential actin ring and controlling lamellipodia formation []. However, the protein's exact function is unknown. This entry represents the NHS protein family, which includes NHS protein and NHS-like proteins 1 and 2.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom