Search results 12501 to 12600 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: ASAP3, ArfGAP domain
Type: Domain
Description: The Arf GAPs (GTPase-activating proteins) are a family of multidomain proteins with the common function of accelerating the hydrolysis of GTP bound to Arf proteins. ASAP proteins are a subtype of Arf GAPs. ASAP3 (Arf-GAP with SH3 domain, ANK repeat and PH domain-containing protein 3, also known as ACAP4, DDEFL1 (Development and Differentiation Enhancing Factor-Like 1), or centaurin beta-6) is a focal adhesion-associated Arf GAP that functions in cell migration and invasion of cancers. It is an Arf6-specific GTPase-activating protein (GAP) and co-localizes with Arf6 in ruffling membranes upon EGF stimulation. ASAP3 promotes cell proliferation, has been implicated in the pathogenesis of hepatocellular carcinoma, and plays a role in regulating cell migration and invasion. ASAP3 contains an N-terminal BAR domain, followed by a Pleckstrin homology (PH) domain, an Arf GAP domain and ankyrin (ANK) repeats. Unlike ASAP1 and ASAP2, ASAP3 does not have an SH3 domain at the C terminus. This entry represents the ArfGAP domain of ASAP3. Similar to ASAP1, the GAP activity of ASAP3 is strongly enhanced by PIP2 via the PH domain. Like ASAP1, ASAP3 associates with focal adhesions and circular dorsal ruffles. However, unlike ASAP1, ASAP3 does not localize to invadopodia or podosomes. ASAP1 and ASAP3 have been implicated in oncogenesis, as ASAP1 is highly expressed in metastatic breast cancer and ASAP3 in hepatocellular carcinoma.
Protein Domain
Name: Virulence factor YopE, GAP domain superfamily
Type: Homologous_superfamily
Description: Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. Four secretion systems have been described in animal enteropathogens, such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. Exotoxins secreted by the type III system do not possess a secretion signal, and are considered unique for this reason. Yersinia secrete a Rho GTPase-activating protein, YopE, that disrupts the host cell actin cytoskeleton. YopE is regulated by another bacterial gene, SycE, that enables the exotoxin to remain soluble in the bacterial cytoplasm. A similar protein, exoenzyme S (ExoS) from Pseudomonas aeruginosa, has both ADP-ribosylation and GTPase-activating activity. This entry represents the bacterial GAP (GTPase-activating protein) domain found in YopE, ExoS, and also SptP (Secreted effector protein).
Protein Domain
Name: VAV1 protein, second SH3 domain
Type: Domain
Description: VAV1 (also known as proto-oncogene vav) is expressed predominantly in the hematopoietic system and plays an important role in the development and activation of B and T cells. It is activated by tyrosine phosphorylation to function as a guanine nucleotide exchange factor (GEF) for Rho GTPases following cell surface receptor activation, triggering various effects such as cytoskeletal reorganization, transcription regulation, cell cycle progression, and calcium mobilization. It also serves as a scaffold protein and has been shown to interact with Ku70, Socs1, Janus kinase 2, SIAH2, S100B, Abl gene, ZAP-70, SLP76, and Syk, among others. The VAV protein family members are multidomain proteins, including Vav from flies and VAV1/2/3 from mammals. VAV1 predominates in hematopoietic cells, whereas VAV2 and VAV3 are more broadly expressed. They have a calponin homology (CH) domain, an acidic domain (AC), a Dbl homology (DH) domain, a pleckstrin homology (PH) domain, a cysteine-rich (CR) domain containing a zinc finger, and a complex region with SH2 and SH3 domains. They may therefore participate in the activity of several pathways. They are signal transducer proteins that couple tyrosine kinase signals with the activation of the Rho/Rac GTPases. This entry represents the second SH3 domain of VAV1. This domain interacts with a wide variety of proteins including cytoskeletal regulators (zyxin), RNA-binding proteins (Sam68), transcriptional regulators, viral proteins, and dynamin 2.
Protein Domain
Name: NECAP, PHear domain
Type: Domain
Description: This PH-like domain can be found in the N-terminal region of NECAPs (also known as adaptin ear-binding coat-associated proteins). NECAPs are alpha-ear-binding proteins that enrich on clathrin-coated vesicles (CCVs). NECAP-1 is expressed in brain and non-neuronal tissues and cells, while NECAP-2 is ubiquitously expressed. The PH-like domain of NECAPs is a protein-binding interface that mimics the FxDxF motif binding properties of the alpha-ear and is called the PHear (PH fold with ear-like function) domain. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Fewer than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases, adaptors, as well as cytoskeletal associated molecules and in lipid associated enzymes.
Protein Domain
Name: TLV/ENV coat polyprotein
Type: Family
Description: Enveloped viruses such as Human immunodeficiency virus 1, influenza virus, and Ebola virus sp. express a surface glycoprotein that mediates both cell attachment and fusion of viral and cellular membranes. The ENV polyprotein (coat polyprotein) usually contains two coat proteins which differ depending on the source. The structures of a number of the ENV polyprotein domains have been determined. The crystal structure of an extraviral segment of the Moloney murine leukemia virus (MoMuLV) transmembrane (TM) subunit has been determined to 1.7 Å resolution. This segment contains a trimeric coiled coil, with a hydrophobic cluster at its base and a strand that packs in an antiparallel orientation against the coiled coil. This structure serves as a model for a wide range of viral fusion proteins; key residues in this structure are conserved among C- and D-type retroviruses and the filovirus Ebola. An essential step in retrovirus infection is the binding of the virus to its receptor on a target cell. The structure of the receptor-binding domain of the envelope glycoprotein from Friend murine leukemia virus (F-MuLV) has been determined to 2.0 Å resolution. The core of the domain is an antiparallel beta sandwich, with two interstrand loops forming a helical subdomain atop the sandwich. The residues in the helical region, but not in the beta sandwich, are highly variable among mammalian C-type retroviruses with distinct tropisms, indicating that the helical subdomain determines the receptor specificity of the virus.
Protein Domain
Name: DENN domain, C-terminal lobe
Type: Homologous_superfamily
Description: The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or regulation of MAPK (mitogen-activated protein kinase) signaling pathways. It consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (after upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear, although it appears to be a good candidate for a GTP/GDP exchange activity. The general characteristics of DENN domains (three regions, dDENN, DENN itself, and uDENN, having different patterns of sequence conservation and separated by sequences of variable length) suggest that they are composed of at least three sub-domains which may feature distinct folds but which are always associated due to functional and/or structural constraints. Some proteins known to contain a tripartite DENN domain are: rat Rab3 GDP/GTP exchange protein (Rab3GEP); human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; mouse Rab6 interacting protein 1 (Rab6IP1); human SET domain-binding factor 1 (SBF1); human suppressor of tumorigenicity 5 (ST5); and human C-MYC promoter-binding protein IRLB. This entry represents the C-terminal lobe of the DENN domain, which consists of a 5-stranded beta sheet surrounded by helices.
Protein Domain
Name: Peptidase C58, Yersinia/Haemophilus virulence surface antigen
Type: Family
Description: This group of cysteine peptidases corresponds to MEROPS peptidase family C58 (clan CA). They are found in bacteria that include plant pathogens (Pseudomonas syringae), root nodule bacteria, and intracellular pathogens (e.g. Yersinia pestis, Haemophilus ducreyi, Pasteurella multocida, Chlamydia trachomatis) of animal hosts. The peptidase domain features a catalytic triad of Cys, His, and Asp. Sequences can be extremely divergent outside of a few well-conserved motifs. YopT, a virulence effector protein of Y. pestis, cleaves and releases host cell Rho GTPases from the membrane, thereby disrupting the actin cytoskeleton. Members of the family from pathogenic bacteria are likely to be pathogenesis factors.
Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. Four secretion systems have been described in animal enteropathogens such as Salmonella and Yersinia, with further sequence homologies in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure.
Exotoxins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Y. pestis secretes such a protein, YopT. YopT is injected into the host cell upon contact, and is therefore considered to be a virulence factor. Haemophilus spp. express a similar toxin on their surface, a 76 kDa antigen.
A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family.
Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases.
Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds.
Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other.
Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices, with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad.
Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase C62, gill-associated virus
Type: Family
Description: This protease is found in polyproteins from the positive-stranded RNA virus of prawns called yellow head virus or gill-associated virus (GAV). The GAV cysteine proteinase (3C-like proteinase) is predicted to be the key enzyme in the processing of the GAV replicase polyprotein precursors, polyprotein 1a and polyprotein 1ab. This protease employs a Cys(2968)-His(2879) catalytic dyad. It is classified as family C62 in the MEROPS database.
Protein Domain
Name: RND efflux system, outer membrane lipoprotein, NodT
Type: Family
Description: Members of this group are outer membrane lipoproteins from the NodT family of the RND (Resistance-Nodulation-cell Division) type efflux systems. These proteins work with an inner membrane ABC transporter ATPase and an adapter called a membrane fusion protein. Most members of this family are likely to export primarily small molecules rather than proteins, but are related to the type I protein secretion outer membrane proteins TolC and PrtF.
Protein Domain
Name: UBAP2/protein lingerer
Type: Family
Description: This entry includes ubiquitin-associated protein 2 (UBAP2) from humans and protein lingerer (lig) from Drosophila melanogaster. Lig is a cytoplasmic protein involved in initiation and termination of copulation. Lig can act as a growth suppressor that associates with the RNA-binding proteins Fragile X messenger ribonucleoprotein 1 (FMR1) and Caprin (Capr) and directly interacts with and regulates the RNA-binding protein Rasputin (Rin).
Protein Domain
Name: Chaperonin Cpn60, conserved site
Type: Conserved_site
Description: The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins. They are required for normal cell growth (as demonstrated by the fact that no temperature-sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins, present in eubacteria, mitochondria and chloroplasts, require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). Type II chaperonins, found in eukaryotic cytosol and in Archaebacteria, comprise only a cpn60 member.
The 10 kDa chaperonin (cpn10, or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60, or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.
The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaires' disease, and is thought to play a role in the protection of the Legionella spp. bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å.
Protein Domain
Name: Matrin/U1-C-like, C2H2-type zinc finger
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short β hairpin and an α helix (β/β/α structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved β/β/α structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short α-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.
This entry represents U1-type zinc finger domains, a family of C2H2-type zinc fingers present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.
Protein Domain
Name: Amyloidogenic glycoprotein, heparin-binding domain superfamily
Type: Homologous_superfamily
Description: Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms.
APP is important for neurogenesis and neuronal regeneration, either through the intact protein or through its many breakdown products. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, a Kunitz domain, an E2 domain, a short hydrophobic transmembrane domain, and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters.
APP can be processed by different sets of enzymes. In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling). In the amyloidogenic (plaque-forming) pathway, APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact.
This superfamily represents a heparin-binding domain found at the N terminus of the extracellular domain, which is itself found at the N terminus of amyloidogenic glycoproteins such as amyloid-beta precursor protein (APP, or A4). The core of the heparin-binding domain has an unusual disulphide-rich fold, consisting of a β-x-α-β-loop-β topology.
Protein Domain
Name: Prion/Doppel beta-ribbon domain superfamily
Type: Homologous_superfamily
Description: This entry represents the C-terminal β-ribbon domain superfamily found in prion proteins and the prion-like Doppel proteins. This domain has a β-α-β-α(2) structure that contains an antiparallel β-ribbon.
Prion protein (PrP-c) is a small glycoprotein found in high quantity in the brain of animals infected with certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jakob disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP-c is encoded in the host genome and is expressed both in normal and infected cells. During infection, however, the PrP-c molecule becomes altered (conformationally rather than at the amino acid level) to an abnormal isoform, PrP-sc. In detergent-treated brain extracts from infected individuals, fibrils composed of polymers of PrP-sc, namely scrapie-associated fibrils or prion rods, can be observed by electron microscopy. The precise function of the normal PrP isoform in healthy individuals remains unknown. Several results, mainly obtained in transgenic animals, indicate that PrP-c might play a role in long-term potentiation, in sleep physiology, in oxidative burst compensation (PrP can bind four Cu2+ ions through its octarepeat domain), in interactions with the extracellular matrix (PrP-c can bind to the precursor of the laminin receptor, LRP), in apoptosis and in signal transduction (costimulation of PrP-c induces a modulation of Fyn kinase phosphorylation).
The normal isoform, PrP-c, is anchored at the cell membrane, in rafts, through a glycosyl phosphatidyl inositol (GPI); its half-life at the cell surface is 5 h, after which the protein is internalised through a caveolae-dependent mechanism and degraded in the endolysosome compartment. Conversion between PrP-c and PrP-sc likely occurs during the internalisation process. In humans, PrP is a 253 amino acid protein, which has a molecular weight of 35-36 kDa. It has two hexapeptides and repeated octapeptides at the N terminus, a disulphide bond and is associated at the C terminus with a GPI, which enables it to anchor to the external part of the cell membrane. The secondary structure of PrP-c is mainly composed of α-helices, whereas PrP-sc is mainly β-sheets: transconformation of α-helices into β-sheets has been proposed as the structural basis by which PrP acquires pathogenicity in TSEs. The three-dimensional structure shows the protein to be made of a globular domain which includes three α-helices and two small antiparallel β-sheet structures, and a long flexible tail whose conformation depends on the biophysical parameters of the environment. Crystals of the globular domain of PrP have recently been obtained; their analysis suggests a possible dimerisation of the protein through the three-dimensional swapping of the C-terminal helix 3 and rearrangement of the disulphide bond.
Protein Domain
Name: Prion/Doppel protein, beta-ribbon domain
Type: Domain
Description: This entry represents the C-terminal β-ribbon domain found in prion proteins and the prion-like Doppel proteins. This domain has a β-α-β-α(2) structure that contains an antiparallel β-ribbon.
Prion protein (PrP-c) is a small glycoprotein found in high quantity in the brain of animals infected with certain degenerative neurological diseases, such as sheep scrapie and bovine spongiform encephalopathy (BSE), and the human dementias Creutzfeldt-Jakob disease (CJD) and Gerstmann-Straussler syndrome (GSS). PrP-c is encoded in the host genome and is expressed both in normal and infected cells. During infection, however, the PrP-c molecule becomes altered (conformationally rather than at the amino acid level) to an abnormal isoform, PrP-sc. In detergent-treated brain extracts from infected individuals, fibrils composed of polymers of PrP-sc, namely scrapie-associated fibrils or prion rods, can be observed by electron microscopy. The precise function of the normal PrP isoform in healthy individuals remains unknown. Several results, mainly obtained in transgenic animals, indicate that PrP-c might play a role in long-term potentiation, in sleep physiology, in oxidative burst compensation (PrP can bind four Cu2+ ions through its octarepeat domain), in interactions with the extracellular matrix (PrP-c can bind to the precursor of the laminin receptor, LRP), in apoptosis and in signal transduction (costimulation of PrP-c induces a modulation of Fyn kinase phosphorylation).
The normal isoform, PrP-c, is anchored at the cell membrane, in rafts, through a glycosyl phosphatidyl inositol (GPI); its half-life at the cell surface is 5 h, after which the protein is internalised through a caveolae-dependent mechanism and degraded in the endolysosome compartment. Conversion between PrP-c and PrP-sc likely occurs during the internalisation process. In humans, PrP is a 253 amino acid protein, which has a molecular weight of 35-36 kDa. It has two hexapeptides and repeated octapeptides at the N terminus, a disulphide bond and is associated at the C terminus with a GPI, which enables it to anchor to the external part of the cell membrane. The secondary structure of PrP-c is mainly composed of α-helices, whereas PrP-sc is mainly β-sheets: transconformation of α-helices into β-sheets has been proposed as the structural basis by which PrP acquires pathogenicity in TSEs. The three-dimensional structure shows the protein to be made of a globular domain which includes three α-helices and two small antiparallel β-sheet structures, and a long flexible tail whose conformation depends on the biophysical parameters of the environment. Crystals of the globular domain of PrP have recently been obtained; their analysis suggests a possible dimerisation of the protein through the three-dimensional swapping of the C-terminal helix 3 and rearrangement of the disulphide bond.
Protein Domain
Name: Amyloidogenic glycoprotein, heparin-binding
Type: Domain
Description: Amyloid-beta precursor protein (APP, or A4) is associated with Alzheimer's disease (AD), because one of its breakdown products, amyloid-beta (A-beta), aggregates to form amyloid or senile plaques. Mutations in APP or in proteins that process APP have been linked with early-onset, familial AD. Individuals with Down's syndrome carry an extra copy of chromosome 21, which contains the APP gene, and almost invariably develop amyloid plaques and Alzheimer's symptoms. APP is important for neurogenesis and neuronal regeneration, either through the intact protein, or through its many breakdown products. APP consists of a large N-terminal extracellular region containing heparin-binding and copper-binding sites, a Kunitz domain and an E2 domain, followed by a short hydrophobic transmembrane domain and a short C-terminal intracellular domain. The N-terminal region is similar in structure to cysteine-rich growth factors and appears to function as a cell surface receptor, contributing to neurite growth, neuronal adhesion, axonogenesis and cell mobility. APP acts as a kinesin I membrane receptor to mediate the axonal transport of beta-secretase and presenilin 1. The N-terminal domain can regulate neurite outgrowth through its binding to heparin and collagen I and IV, which are components of the extracellular matrix. APP is also coupled to apoptosis-inducing pathways, and is involved in copper homeostasis/oxidative stress through copper ion reduction, where copper-metallated APP induces neuronal death. The C-terminal intracellular domain appears to be involved in transcription regulation through protein-protein interactions. APP can promote transcription activation through binding to APBB1/Tip60, and may bind to the adaptor protein FE65 to transactivate a wide variety of different promoters.
APP can be processed by different sets of enzymes. In the non-amyloidogenic (non-plaque-forming) pathway, APP is cleaved by alpha-secretase to yield a soluble N-terminal sAPP-alpha (neuroprotective) and a membrane-bound CTF-alpha. CTF-alpha is broken down by presenilin-containing gamma-secretase to yield soluble p3 and membrane-bound AICD (nuclear signalling). In the amyloidogenic (plaque-forming) pathway, APP is broken down by beta-secretase to yield soluble sAPP-beta and membrane-bound CTF-beta. CTF-beta is broken down by gamma-secretase to yield soluble amyloid-beta and membrane-bound AICD. Amyloid-beta is required for neuronal function, but can aggregate to form amyloid plaques that seem to disrupt brain cells by clogging points of cell-cell contact. This entry represents a heparin-binding domain found at the N terminus of the extracellular domain, which is itself found at the N terminus of amyloidogenic glycoproteins such as amyloid-beta precursor protein (APP, or A4). The core of the heparin-binding domain has an unusual disulphide-rich fold, consisting of a β-x-α-β-loop-β topology.
Protein Domain
Name: Antimicrobial/protein inhibitor, gamma-crystallin-like
Type: Homologous_superfamily
Description: This structural domain is found in a family of proteins with a Greek key motif that are related in structure to the beta/gamma crystallins. This group of proteins includes: Streptomyces killer toxin-like protein SKLP; antifungal protein, AFP1; plant antimicrobial protein, MIAMP1; and Streptomyces metalloproteinase inhibitor, SMPI. Note: SMPI belongs to MEROPS inhibitor family I36, clan IU. It inhibits members of the thermolysin family, MEROPS peptidase family M4.
Protein Domain
Name: Equine arteritis virus GP2b envelope glycoprotein
Type: Family
Description: Equine arteritis virus (EAV) is an enveloped, positive-strand RNA virus belonging to the family Arteriviridae. EAV virions contain six different envelope proteins. GP5 (previously named GL) and the unglycosylated membrane protein M are the major envelope proteins, while the glycoproteins GP2b (previously named Gs), GP3, and GP4 are minor structural proteins. GP2b is a class I transmembrane protein which adopts a number of different conformations.
Protein Domain
Name: GH3 family
Type: Family
Description: GH3 protein was first isolated from Glycine max (soybean) as the product of an early auxin-responsive gene. Since then, several plant GH3 family proteins have been identified and classified into three groups: group I proteins synthesise JA-amino acid conjugates, group II proteins produce indole-3-acetic acid (IAA) conjugates, and group III proteins are involved in the conjugation of amino acids to 4-substituted benzoate. This entry also includes proteins from bacteria, fungi and animals.
Protein Domain
Name: Yippee/Mis18/Cereblon
Type: Domain
Description: This domain is found in both Yippee-type proteins and Mis18 kinetochore proteins. Yippee proteins are putative zinc-binding/DNA-binding proteins. Mis18 proteins are involved in the priming of centromeres for recruiting CENP-A. Mis18-alpha and Mis18-beta form part of a small complex with Mis18-binding protein. Mis18-alpha interacts with DNA de-methylases through a Leu-rich region located at its carboxyl terminus. This domain includes the CULT domain from proteins such as Cereblon.
Protein Domain
Name: Terminal EAR1-like, RNA recognition motif 3
Type: Domain
Description: This entry represents the RNA recognition motif 3 (RRM3) of terminal EAR1-like proteins, including terminal EAR1-like protein 1 and 2 (TEL1 and TEL2) found in land plants. They may play a role in the regulation of leaf initiation. The terminal EAR1-like proteins are putative RNA-binding proteins carrying three RNA recognition motifs (RRMs), and TEL characteristic motifs that allow sequence and putative functional discrimination between the terminal EAR1-like proteins and Mei2-like proteins.
Protein Domain
Name: CDI inhibitor, Bp1026b-like
Type: Domain
Description: CDI toxins are expressed by Gram-negative bacteria as part of a mechanism to inhibit the growth of neighbouring cells. Secretion of the CdiA effector protein is dependent on the outer membrane protein CdiB. An additional small immunity protein (CdiI) protects cells from autoinhibition. CdiI proteins are intracellular proteins that inactivate the toxin/effector protein. This entry represents the inhibitor of the CdiA effector protein from Burkholderia pseudomallei 1026b (which is a tRNase).
Protein Domain
Name: Cholesteryl ester transfer
Type: Family
Description: This group represents cholesteryl ester transfer protein, also known as lipid transfer protein 1. This protein shuttles various lipids between lipoproteins, resulting in the net transfer of cholesteryl esters from atheroprotective, high-density lipoproteins (HDL) to atherogenic, lower-density species. Thus, inhibition of this protein raises the concentration of protective HDL and could conceivably be used as a treatment for cardiovascular disease. The protein forms a long tunnel which can accommodate four lipid molecules.
Protein Domain
Name: Serum albumin, conserved site
Type: Conserved_site
Description: A number of serum transport proteins are known to be evolutionarily related, including albumin, alpha-fetoprotein (AFP), vitamin D-binding protein (VDB) and afamin. Albumin is the main protein of plasma; it binds water, cations (such as Ca2+, Na+ and K+), fatty acids, hormones, bilirubin and drugs. Its main function is to regulate the colloidal osmotic pressure of blood. Alpha-fetoprotein (alpha-fetoglobulin) is a foetal plasma protein that binds various cations, fatty acids and bilirubin. Vitamin D-binding protein binds to vitamin D and its metabolites, as well as to fatty acids. The biological role of afamin (alpha-albumin) has not yet been characterised. The 3D structure of human serum albumin has been determined by X-ray crystallography to a resolution of 2.8 Å. It comprises three homologous domains that assemble to form a heart-shaped molecule. Each domain is a product of two subdomains that possess common structural motifs. The principal regions of ligand binding to human serum albumin are located in hydrophobic cavities in subdomains IIA and IIIA, which exhibit similar chemistry. Structurally, the serum albumins are similar, each domain containing five or six internal disulphide bonds, as shown schematically below:
                   +---+          +----+                        +-----+
                   |   |          |    |                        |     |
xxCxxxxxxxxxxxxxxxxCCxxCxxxxCxxxxxCCxxxCxxxxxxxxxCxxxxxxxxxxxxxxCCxxxxCxxxx
  |                 |       |     |              |               |
  +-----------------+       +-----+              +---------------+
This entry represents a conserved site that covers the three conserved cysteines at the end of the serum albumin domain. It is built in such a way that it can detect all 3 repeats in albumin and human afamin, the first two in AFP and the first one in VDB and rat afamin.
Protein Domain
Name: Histidine triad, conserved site
Type: Conserved_site
Description: The Histidine Triad (HIT) motif, His-x-His-x-His-x-x (x, a hydrophobic amino acid), was identified as being highly conserved in a variety of organisms. The crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles. Hint homologues including rabbit Hint and yeast Hnt1 hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product and function as positive regulators of Cdk7/Kin28 in vivo. Fhit homologues are diadenosine polyphosphate hydrolases and function as tumour suppressors in human and mouse, though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them. The bovine protein kinase C inhibitor, PKCI-1, is an inhibitor protein that binds zinc without the use of zinc-finger motifs. Each protein molecule binds one zinc ion via a novel binding site containing 3 closely-spaced histidine residues. This region, referred to as the histidine triad (HIT), has been identified in various prokaryotic and eukaryotic proteins of uncertain function. The signature pattern used in this entry contains the region of the histidine triad and includes the three conserved histidine residues which are thought to bind the zinc ion.
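The His-x-His-x-His-x-x motif described above can be illustrated as a simple pattern search. This is a minimal sketch only: the set of residues treated as "hydrophobic", the function name and the toy sequence are all assumptions made for illustration, and the expression is not the signature pattern actually used by this entry.

```python
import re

# Assumption: this residue set stands in for "hydrophobic"; the entry text
# does not say which residues are allowed at the x positions.
HYDROPHOBIC = "AVILMFWYC"

# His-x-His-x-His-x-x, with every x drawn from the assumed hydrophobic set.
HIT_MOTIF = re.compile(
    rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)


def find_hit_motifs(sequence):
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in HIT_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy_sequence = "MKTAHVHLHVIGGR"
print(find_hit_motifs(toy_sequence))
```

Start positions are zero-based. A curated signature would normally be stricter about the flanking residues than this sketch.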
Protein Domain
Name: Cupredoxin
Type: Homologous_superfamily
Description: Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper(II) ion is reduced to copper(I). In blue copper proteins such as cupredoxin, the copper(I) ion form is stabilised by a constrained His2Cys coordination environment. This entry represents cupredoxin proteins, as well as structural homologues to cupredoxin. Structurally, the cupredoxin-like fold consists of a β-sandwich with 7 strands in 2 β-sheets, which is arranged in a Greek-key β-barrel. Some of these proteins have lost the ability to bind copper. Proteins with a cupredoxin-type fold are found in the following family groups: (1) mono-domain cupredoxins, such as amicyanin, plastocyanin, pseudoazurin, plantacyanin, azurin, auracyanin, rusticyanin, stellacyanin, and mavicyanin; (2) multi-domain cupredoxins, such as nitrite reductase (2 domains of this fold), multicopper oxidase CueO, spore coat protein A, ascorbate oxidase (3 domains of this fold), laccase (3 domains of this fold), ceruloplasmin (6 domains of this fold), and coagulation factor V; (3) red copper protein nitrocyanin and the C-terminal of nitrous oxide reductase; (4) quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II; (5) ephrin-a5 and ephrin-b2 ectodomain, which are related to cupredoxins but lack the metal-binding site; (6) the N-terminal domain of protein arginine deiminase Pad4, which is related to cupredoxin but lacks the metal-binding site.
Protein Domain
Name: DnaJ domain, conserved site
Type: Conserved_site
Description: The hsp70 chaperone machine performs many diverse roles in the cell, including folding of nascent proteins, translocation of polypeptides across organelle membranes, coordinating responses to stress, and targeting selected proteins for degradation. DnaJ is a member of the hsp40 family of molecular chaperones, which is also called the J-protein family, the members of which regulate the activity of hsp70s. DnaJ (hsp40) binds to DnaK (hsp70) and stimulates its ATPase activity, generating the ADP-bound state of DnaK, which interacts stably with the polypeptide substrate. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues. Such a structure is shown in the following schematic representation:
+------------+-+-------+-----+-----------+--------------------------------+
| J-domain   | | Gly-R |     | CXXCXGXG  | C-terminal                     |
+------------+-+-------+-----+-----------+--------------------------------+
The structure of the J-domain has been solved. The J domain consists of four helices, the second of which has a charged surface that includes basic residues that are essential for interaction with the ATPase domain of hsp70. J-domains are found in many prokaryotic and eukaryotic proteins. In yeast, three J-like proteins have been identified containing regions closely resembling a J-domain, but lacking the conserved HPD motif; these proteins do not appear to act as molecular chaperones. This entry represents a conserved site found within the J-domain.
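The repeated CXXCXGXG motif of the central 'CRR' domain mentioned above lends itself to a small counting example. The sketch below assumes that X means any residue; the function name and the toy sequence are hypothetical and only illustrate how the four repeats could be located. It does not reproduce the conserved-site signature of this entry, which lies in the J-domain.

```python
import re

# CXXCXGXG, where X is assumed to be any single residue.
CRR_REPEAT = re.compile(r"C..C.G.G")


def count_crr_repeats(sequence):
    """Count non-overlapping CXXCXGXG repeats in a protein sequence."""
    return len(CRR_REPEAT.findall(sequence.upper()))


# Hypothetical toy sequence: four copies of one repeat joined by short linkers.
repeat = "CPTCKGSG"
toy_sequence = "MA" + "KVI".join([repeat] * 4) + "LE"
print(count_crr_repeats(toy_sequence))
```

A count of four would be consistent with the domain layout described in the entry, although real sequences may carry degenerate repeats that this strict pattern would miss.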
Protein Domain
Name: Histone-lysine N-methyltransferase, plant
Type: Family
Description: Members of this family are polycomb group (PcG) proteins from plants. They act as the catalytic subunit of a PcG multiprotein complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target genes. These enzymes are also required to regulate floral development by repressing the AGAMOUS homeotic gene in leaves, inflorescence stems and flowers. They regulate the antero-posterior organisation of the endosperm, as well as the division and elongation rates of leaf cells. PcG proteins act by forming multiprotein complexes, which are required to maintain the transcriptionally repressive state of homeotic genes throughout development. PcG proteins are not required to initiate repression, but to maintain it during later stages of development. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group, most frequently from S-adenosyl L-methionine (SAM or AdoMet), to a nucleophilic acceptor such as oxygen, leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor.
Protein Domain
Name: Type VI secretion system effector Hcp
Type: Family
Description: This entry consists of Hcp-like proteins. Hcp appears to be part of the type VI secretion system of Gram-negative bacteria. Hcp is not only a secreted effector protein, but also might act as a machine component. Several bacterial pathogens mediate interactions with their hosts through protein secretion, often involving Hcp-like virulence loci, which are widely distributed among pathogenic bacteria. Homologues of Hcp are found in various bacteria of which most, but not all, are known pathogens. Many bacteria have two copies of hcp genes. In Pseudomonas syringae, Hcp1 is a virulence protein, while Hcp2 seems to be required for survival in competition with enterobacteria and yeasts, and its function is associated with the suppression of the growth of these competitors. Hcp1 monomers form a hexameric ring with a large internal diameter. Assembly of this particle is likely to occur following secretion, and could have a role in building a channel for the transport of other macromolecules. The type VI secretion system (T6SS) is a supra-molecular bacterial complex that resembles phage tails. It is a toxin delivery system which fires toxins into target cells upon contraction of its TssBC sheath. Thirteen essential core proteins are conserved in all T6SSs: the membrane associated complex TssJ-TssL-TssM, the baseplate proteins TssE, TssF, TssG, and TssK, the bacteriophage-related puncturing complex composed of the tube (Hcp), the tip/puncturing device VgrG, and the contractile sheath structure (TssB and TssC). Finally, the starfish-shaped dodecameric protein, TssA, limits contractile sheath polymerization at its distal part when TagA captures TssA.
Protein Domain
Name: Innexin
Type: Family
Description: This entry includes pannexins from vertebrates and innexins from invertebrates. Gap junctions are composed of membrane proteins, which form a channel permeable for ions and small molecules connecting the cytoplasm of adjacent cells. Although gap junctions provide similar functions in all multicellular organisms, until recently it was believed that vertebrates and invertebrates use unrelated proteins for this purpose. While the connexin family of gap junction proteins is well characterised in vertebrates, no homologues have been found in invertebrates. In turn, gap junction molecules with no sequence homology to connexins have been identified in insects and nematodes. It has been suggested that these proteins are specific invertebrate gap junctions, and they were thus named innexins (invertebrate analog of connexins). As innexin homologues were recently identified in other taxonomic groups including vertebrates, indicating their ubiquitous distribution in the animal kingdom, they were called pannexins (from the Latin pan-all, throughout, and nexus-connection, bond). Genomes of vertebrates probably carry a conserved set of 3 pannexin paralogs (PANX1, PANX2 and PANX3). Invertebrate genomes may contain more than a dozen pannexin (innexin) genes. Vinnexins, viral homologues of pannexins/innexins, were identified in Polydnaviruses that occur in obligate symbiotic associations with parasitoid wasps. It was suggested that virally encoded vinnexin proteins may function to alter gap junction proteins in infected host cells, possibly modifying cell-cell communication during encapsulation responses in parasitized insects. Structurally, pannexins are similar to connexins. Both types of protein consist of a cytoplasmic N-terminal domain, followed by four transmembrane segments that delimit two extracellular loops and one cytoplasmic loop; the C-terminal domain is cytoplasmic.
Protein Domain
Name: PLEKHM1, PH domain
Type: Domain
Description: PLEKHM1 is a ubiquitously expressed protein involved in the regulation of osteoclast function and bone resorption. It may function as an adaptor protein that acts as a central hub to integrate endocytic and autophagic pathways at the lysosome. PLEKHM1 contains an N-terminal RUN domain (RPIP8/RaP2 interacting protein 8, UNC-14 and NESCA/new molecule containing SH3 at the carboxyl-terminus), followed by a PH domain, and either a C1 domain or a DUF4206 domain at its C terminus. The RUN domain is thought to be involved in Rab-mediated membrane trafficking, possibly as a Rab-binding site. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner. They share little sequence conservation, but all have a common fold, which is electrostatically polarized. Fewer than 10% of PH domains bind phosphoinositide phosphates (PIPs) with high affinity and specificity. PH domains are distinguished from other PIP-binding domains by their specific high-affinity binding to PIPs with two vicinal phosphate groups: PtdIns(3,4)P2, PtdIns(4,5)P2 or PtdIns(3,4,5)P3, which results in targeting some PH domain proteins to the plasma membrane. A few display strong specificity in lipid binding. Any specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. PH domains are found in cellular signaling proteins such as serine/threonine kinases, tyrosine kinases, regulators of G-proteins, endocytotic GTPases and adaptors, as well as in cytoskeletal associated molecules and lipid associated enzymes.
Protein Domain
Name: ATP-binding cassette sub-family C member 4, six-transmembrane helical domain 2
Type: Domain
Description: The ABC transporter family is a group of membrane proteins that use the hydrolysis of ATP to power the translocation of a wide variety of substrates across cellular membranes. ABC transporters minimally consist of two conserved regions: a highly conserved nucleotide-binding domain (NBD) and a less conserved transmembrane domain (TMD). Eukaryotic ABC proteins are usually organised either as full transporters (containing two NBDs and two TMDs), or as half transporters (containing one NBD and one TMD), that have to form homo- or heterodimers in order to constitute a functional protein. This group of proteins includes ATP-binding cassette sub-family C member 4 (ABCC4, also known as Multidrug resistance-associated protein 4, MRP4) from animals. It belongs to the MRP (multidrug resistance protein) subfamily of the ATP-binding cassette (ABC) transporter family. MRP4 is a broad specificity organic anion exporter that seems to be able to mediate the transport of conjugated steroids, prostaglandins, and glutathione. It can localise to both basolateral and apical membranes in polarised cells, depending on the tissue where it is found. Together with CFTR, MRP4 functions in the regulation of cAMP and beta-adrenergic contraction in cardiac myocytes. MRP4 has been implicated in the high proliferative growth of some tumours including prostate tumours and neuroblastoma. It confers resistance to anticancer agents including thiopurine analogues, MTX and topotecan. This protein has a typical ABC transporter structure and is composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). This entry represents the six-transmembrane helical domain 2 (TMD2) of ABCC4.
Protein Domain
Name: TryX and NRX, thioredoxin domain
Type: Domain
Description: This entry represents the thioredoxin domain found in tryparedoxin (TryX) from trypanosomes and nucleoredoxin (NRX) from animals and plants. They belong to the Thioredoxin (TRX) family, whose members are evolutionarily conserved proteins involved in various biological processes by regulating the response to oxidative stress. TryX and NRX are disulfide oxidoreductases that alter the redox state of target proteins via the reversible oxidation of an active centre CXXC motif. TryX is involved in the regulation of oxidative stress in parasitic trypanosomatids by reducing TryX peroxidase, which in turn catalyses the reduction of hydrogen peroxide and organic hydroperoxides. TryX derives reducing equivalents from reduced trypanothione, a polyamine peptide conjugate unique to trypanosomatids, which is regenerated by the NADPH-dependent flavoprotein trypanothione reductase. Vertebrate NRX is a 400-amino acid nuclear protein with one redox active TRX domain containing a CPPC active site motif followed by one redox inactive TRX-like domain. Mouse NRX transcripts are expressed in all adult tissues but are restricted to the nervous system and limb buds in embryos. NRX has been shown to interact with Dishevelled (Dvl), an essential adaptor protein for Wnt signalling, and to block the activation of the Wnt pathway. Plant NRX, longer than the vertebrate NRX by about 100-200 amino acids, is a nuclear protein containing a redox inactive TRX-like domain between two redox active TRX domains. Both vertebrate and plant NRXs show thiol oxidoreductase activity in vitro. Their localization in the nucleus suggests a role in the redox regulation of nuclear proteins such as transcription factors.
Protein Domain
Name: MTMR4, PH-GRAM domain
Type: Domain
Description: Myotubularin-related protein 4 (MTMR4) is a member of the myotubularin (MTM) family. It is the only family member that possesses a FYVE domain (a zinc finger domain) at its C terminus. MTMR4 has dual-specificity phosphatase activity; some studies have shown that it can dephosphorylate PI3P or PI(3,5)P2, suggesting that MTMR4 is also a lipid phosphatase. MTMR4 has a unique distribution to endosomes and has been shown to function in early and recycling endosomes. MTMR4 attenuates TGF-beta signalling by dephosphorylating the intracellular signalling mediators R-Smads. Similarly, it acts as a negative modulator for the homeostasis of bone morphogenetic protein (BMP) signalling. Both MTMR3 and MTMR4 contain an N-terminal PH-GRAM domain, a Rac-induced recruitment domain (RID), an active PTP domain, a SET-interaction domain, a coiled-coil region, and a C-terminal lipid-binding FYVE domain which binds phosphatidylinositol-3-phosphate. Myotubularin-related proteins are a subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids. Mutations in this family cause the human neuromuscular disorders myotubular myopathy and type 4B Charcot-Marie-Tooth syndrome. Six of the 13 MTMRs (MTMRs 5, 9-13) contain naturally occurring substitutions of residues required for catalysis by PTP family enzymes. Although these proteins are predicted to be enzymatically inactive, they are thought to function as antagonists of endogenous phosphatase activity or interaction modules. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. This entry represents the PH-GRAM domain of myotubularin-related protein 4.
Protein Domain
Name: Hook, C-terminal
Type: Domain
Description: The Hook family consists of several proteins from different eukaryotic organisms, first identified in Drosophila melanogaster, in which they play a role in endocytic cargo sorting. In Drosophila and fungi there is a single Hook gene, whereas mammals have three Hook genes, Hook1, Hook2 and Hook3. Endogenous Hook3 binds to Golgi membranes while both Hook1 and Hook2 are localised to discrete but unidentified cellular structures. In mice the Hook1 gene is predominantly expressed in the testis. Hook1 function is necessary for the correct positioning of microtubule structures within the haploid germ cell. Disruption of Hook1 function in mice causes abnormal sperm head shape and fragile attachment of the flagellum to the sperm head. Hook proteins are a widely expressed class of dynein-associated cargo adaptor proteins which include different domains. The N-terminal part of these proteins is sufficient to form a stable complex with dynein-dynactin and includes the most conserved region within the first 160 amino acids, termed the Hook domain. This domain is followed by three coiled-coil domains, important for dimerization and activation of dynein-dynactin complex motility, and then a C-terminal domain that binds a variety of proteins specific for each Hook isoform, involved in binding to specific organelles (organelle-binding domains). All mammalian Hook isoforms form a complex with Fused Toes and the Fused Toes- and Hook-interacting protein; fungal homologues of these proteins are important for dynein-mediated early endosome transport by linking Hook to the cargo. This entry represents the central coiled-coil region and the divergent C-terminal domain from Hook proteins.
Protein Domain
Name: B30.2/SPRY domain superfamily
Type: Homologous_superfamily
Description: The B30.2 domain was first identified as a protein domain encoded by an exon (named B30-2) in the Homo sapiens class I major histocompatibility complex region, whereas the SPRY domain was first identified in a Dictyostelium discoideum kinase splA and mammalian calcium-release channels ryanodine receptors. The B30.2 domain consists of PRY and SPRY subdomains. The SPRY domains (after SPla and the RYanodine Receptor) are shorter at the N terminus than the B30.2 domains. The ~200-residue B30.2/SPRY (for B30.2 and/or SPRY) domain is present in a large number of proteins with diverse individual functions in different biological processes. The B30.2/SPRY domain in these proteins is likely to function through protein-protein interaction. The N-terminal ~60 residues of B30.2/SPRY domains are poorly conserved and, as a consequence, a new domain name PRY was coined for a group of similar sequence segments N-terminal to the SPRY domains. The B30.2/SPRY domain contains three highly conserved motifs (LDP, WEVE and LDYE). The B30.2/SPRY domain adopts a highly distorted, compact β-sandwich fold with two additional short β-helices at the N terminus. The β-sandwich of the B30.2/SPRY domain consists of two layers of β-sheets: sheet A composed of eight strands and sheet B composed of seven strands. All the β-strands are in antiparallel arrangement. The fifth β-strand corresponds to the WEVE motif. The N- and C-terminal ends of B30.2/SPRY domains are in general close to each other. This superfamily also matches diverse E3 ubiquitin-protein ligases. These proteins are involved in the apoptotic process.
Protein Domain
Name: Herpesvirus viron egress-type
Type: Family
Description: During primary envelopment, Human herpesvirus 1 (HHV-1, HSV-1) nucleocapsids translocate from the nucleus to the cytoplasm. Lining the inside of the inner nuclear membrane (INM) is the nuclear lamina, which is composed of a meshwork of proteins with spaces too small for the capsid to move through without some disruption of the lamina. The lamina is mainly made up of lamin A/C and lamin B proteins, with smaller amounts of other proteins also present; this lamina must be disrupted before the nucleocapsids can egress. UL31, nuclear egress protein 2 (also known as UL34) and US3 proteins of herpes simplex virus type 1 form a complex that accumulates at the nuclear rim and is required for envelopment of nucleocapsids and successful egress of the nucleocapsids. Although UL34 has been shown to interact directly with lamin A, it cannot disrupt lamin structure by itself. Its interaction with UL31 and US3 appears to be crucial for lamin disruption, though the mechanism is not yet clear. This entry represents several proteins (U34, UL50, nuclear egress protein 2 or UL34, 24, 26) that play a major role in virion nuclear egress, the first step of virion release from the infected cell. Viral capsids are initially assembled within the nucleus, where these proteins induce capsid budding and envelopment into the perinuclear space. The virion egress complex then promotes fusion of the perinuclear virion envelope with the outer nuclear membrane, releasing the viral capsid into the cytoplasm, where it engages budding sites in the Golgi or trans-Golgi network.
Protein Domain
Name: ATP-binding cassette sub-family C member 4, six-transmembrane helical domain 1
Type: Domain
Description: The ABC transporter family is a group of membrane proteins that use the hydrolysis of ATP to power the translocation of a wide variety of substrates across cellular membranes. ABC transporters minimally consist of two conserved regions: a highly conserved nucleotide-binding domain (NBD) and a less conserved transmembrane domain (TMD). Eukaryotic ABC proteins are usually organised either as full transporters (containing two NBDs and two TMDs), or as half transporters (containing one NBD and one TMD) that have to form homo- or heterodimers in order to constitute a functional protein [ ]. This group of proteins includes ATP-binding cassette sub-family C member 4 (ABCC4, also known as Multidrug resistance-associated protein 4, MRP4) from animals. It belongs to the MRP (multidrug resistance protein) subfamily of the ATP-binding cassette (ABC) transporter family [ ]. MRP4 is a broad specificity organic anion exporter that seems to be able to mediate the transport of conjugated steroids, prostaglandins, and glutathione []. It can localise to both basolateral and apical membranes in polarised cells, depending on the tissue where it is found []. Together with CFTR, MRP4 functions in the regulation of cAMP and beta-adrenergic contraction in cardiac myocytes []. MRP4 has been implicated in the high proliferative growth of some tumours including prostate tumours and neuroblastoma [ , ]. It confers resistance to anticancer agents including thiopurine analogues, MTX and topotecan []. This protein has a typical ABC transporter structure and is composed of two transmembrane domains (TMD1 and TMD2) and two nucleotide-binding domains (NBD1 and NBD2). This entry represents the six-transmembrane helical domain 1 (TMD1) of ABCC4.
Protein Domain
Name: Subtilisin-like protease
Type: Family
Description: This entry represents a group of subtilisin-like proteases mostly from plants and bacteria. Proteins in this entry include melon cucumisin, Arabidopsis thaliana Ara12, a nodule-specific serine protease from Alnus glutinosa ag12, members of the tomato P69 family, and tomato LeSBT2. These proteins belong to the peptidase S8 family. Cucumisin from the juice of melon fruits is a thermostable serine peptidase with a broad substrate specificity for oligopeptides and proteins. A. thaliana Ara12 is a thermostable, extracellular serine protease, found chiefly in silique tissue and stem tissue. Ara12 is stimulated by Ca2+ ions. A. glutinosa ag12 is expressed at high levels in the nodules, and at low levels in the shoot tips; it is implicated in both symbiotic and non-symbiotic processes in plant development. The tomato P69 protease family comprises various protein isoforms of approximately 69 kDa. These isoforms accumulate extracellularly. Some of the P69 genes are tightly regulated in a tissue-specific fashion, and by environmental and developmental signals. For example, infection with avirulent bacteria activates transcription of the genes for the P69 B and C isoforms, the P69 E transcript was detected only in roots, and the P69 F transcript only in hydathodes. The tomato LeSBT2 subtilase transcript was not detected in flowers and roots, but was present in cotyledons and leaves. The significance of the PA domain to these proteins has not been ascertained. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.
Protein Domain
Name: Ubiquinol-cytochrome C reductase hinge domain
Type: Domain
Description: The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex [ ]. The bc1 complex contains 11 subunits: 3 respiratory subunits (cytochrome B, cytochrome C1, Rieske protein), 2 core proteins and 6 low molecular weight proteins. This family represents the 'hinge' protein of the complex, which is thought to mediate formation of the cytochrome c1 and cytochrome c complex. Proteins in this entry form an α-helical hairpin. This entry represents the structural domain found in these proteins.
Protein Domain
Name: Potassium channel tetramerisation-type BTB domain
Type: Domain
Description: This domain can be found at the N terminus of voltage-gated potassium channel proteins, where it represents a cytoplasmic tetramerisation domain (T1) involved in assembly of alpha-subunits into functional tetrameric channels [ ]. This domain can also be found in proteins that are not potassium channels, like KCTD1 (potassium channel tetramerisation domain-containing protein 1). KCTD1 is thought to be a nuclear protein that functions as a transcriptional repressor. In KCTD1, the T1-type BTB domain mediates homomeric protein-protein interactions [, ].
Protein Domain
Name: Substrate-binding orphan protein, GRRM family
Type: Family
Description: This subfamily belongs to bacterial extracellular solute-binding protein family 3. In that family, most members are ABC transporter periplasmic substrate-binding proteins. However, members of the present subfamily are orphans, in the sense of being adjacent to neither ABC transporter ATP-binding proteins nor permease subunits. Instead, most members are encoded next to the two signature proteins of the proposed Glycine-Rich Repeat Modification (GRRM) system: a radical SAM/SPASM protein, GrrM, and the Gly-rich repeat protein itself, GrrA.
Protein Domain
Name: Polyomavirus agnoprotein
Type: Family
Description: This family consists of the DNA-binding protein or agnoprotein from various polyomaviruses. This protein is highly basic and can bind single-stranded and double-stranded DNA [ ]. Mutations in the agnoprotein produce smaller viral plaques; hence its function is not essential for growth in tissue culture cells, although some step of the normal replication cycle is slowed []. There is also evidence suggesting that the agnogene and agnoprotein act as regulators of structural protein synthesis [].
Protein Domain
Name: Type III secretion system, YscI/HrpB
Type: Family
Description: This entry consists of bacterial type III secretion system proteins which share a conserved C-terminal domain. These proteins are designated YscI (Yop proteins translocation protein I) in Yersinia and HrpB (hypersensitivity response and pathogenicity protein B) in plant pathogens such as Pseudomonas syringae. This entry also includes PscI from Pseudomonas aeruginosa. PscI is a type III secretion needle anchoring protein that polymerizes into flexible and regularly twisted fibrils and plays an essential role in needle assembly [ ].
Protein Domain
Name: NFU1-like
Type: Family
Description: This entry includes NFU1-like proteins from eukaryotes and some uncharacterised proteins from prokaryotes. Nfu functions as a scaffold protein for assembly and delivery of rudimentary Fe-S clusters to target proteins [ , ]. A human Nfu homologue, HIRA-interacting protein 5 (HIRIP5), was first identified in a two-hybrid screen for proteins that interact with the transcription regulator HIRA []. It seems that two human Nfu isoforms are generated by alternative splicing and are subsequently targeted to different subcellular compartments [].
Protein Domain
Name: BPI/LBP/Plunc family
Type: Family
Description: The BPI (bactericidal permeability-increasing proteins)/LBP (lipopolysaccharide-binding proteins)/PLUNC (palate, lung and nasal epithelium clone) family consists of proteins involved in host defence against bacteria, many acting in early recognition of bacteria in the upper respiratory tract [ , , ]. They share homology with cholesterol ester transfer protein (CETP) and phospholipid transfer protein (PLTP), both of which are involved in lipid transport in blood plasma []. This entry also matches UPF0522 proteins from Dictyostelium (slime mold), whose function is unknown.
Protein Domain
Name: Signal transduction histidine kinase-related protein, C-terminal
Type: Domain
Description: Two-component signal transduction systems enable bacteria to sense, respond, and adapt to a wide range of environments, stressors, and growth conditions [ ]. Some bacteria can contain as many as 200 two-component systems that need tight regulation to prevent unwanted cross-talk []. These pathways have been adapted to respond to a wide variety of stimuli, including nutrients, cellular redox state, changes in osmolarity, quorum signals, antibiotics, and more []. Two-component systems are comprised of a sensor histidine kinase (HK) and its cognate response regulator (RR) []. The HK catalyses its own auto-phosphorylation followed by the transfer of the phosphoryl group to the receiver domain on RR; phosphorylation of the RR usually activates an attached output domain, which can then effect changes in cellular physiology, often by regulating gene expression. Some HK are bifunctional, catalysing both the phosphorylation and dephosphorylation of their cognate RR. The input stimuli can regulate either the kinase or phosphatase activity of the bifunctional HK. A variant of the two-component system is the phospho-relay system. Here a hybrid HK auto-phosphorylates and then transfers the phosphoryl group to an internal receiver domain, rather than to a separate RR protein. The phosphoryl group is then shuttled to histidine phosphotransferase (HPT) and subsequently to a terminal RR, which can evoke the desired response [ , ]. Signal transducing histidine kinases are the key elements in two-component signal transduction systems, which control complex processes such as the initiation of development in microorganisms [ , ]. Examples of histidine kinases are EnvZ, which plays a central role in osmoregulation [], and CheA, which plays a central role in the chemotaxis system []. Histidine kinases usually have an N-terminal ligand-binding domain and a C-terminal kinase domain, but other domains may also be present. 
The kinase domain is responsible for the autophosphorylation of the histidine with ATP, the phosphotransfer from the kinase to an aspartate of the response regulator, and (with bifunctional enzymes) the phosphotransfer from aspartyl phosphate back to ADP or to water []. The kinase core has a unique fold, distinct from that of the Ser/Thr/Tyr kinase superfamily. HKs can be roughly divided into two classes: orthodox and hybrid kinases [ , ]. Most orthodox HKs, typified by the Escherichia coli EnvZ protein, function as periplasmic membrane receptors and have a signal peptide and transmembrane segment(s) that separate the protein into a periplasmic N-terminal sensing domain and a highly conserved cytoplasmic C-terminal kinase core. Members of this family, however, have an integral membrane sensor domain. Not all orthodox kinases are membrane bound, e.g., the nitrogen regulatory kinase NtrB (GlnL) is a soluble cytoplasmic HK []. Hybrid kinases contain multiple phosphodonor and phosphoacceptor sites and use multi-step phospho-relay schemes instead of promoting a single phosphoryl transfer. In addition to the sensor domain and kinase core, they contain a CheY-like receiver domain and a His-containing phosphotransfer (HPt) domain. This domain is present in many sensor proteins that respond to extra-cytoplasmic stimuli in bacteria, but is also found in many proteins of metazoan origin. Sensors are usually linked to a 2-component regulatory system consisting of the sensor and a cytoplasmic regulator protein [ ]. The cytoplasmic C-terminal portions of the sensor proteins show marked sequence similarity and are responsible for kinase activity [ ]. Some sensor proteins are cytoplasmic and may respond to several external stimuli. Sensors also show similarity to some regulatory proteins []. The structure of CheA, a signal-transducing histidine kinase, is known []. 
The catalytic domain consists of several α-helices packed over one face of a large anti-parallel β-sheet, forming a loop which closes over the bound ATP. Hydrolysis of ATP is coupled to Mg2+ release and conformational changes in the ATP-binding cavity.
Protein Domain
Name: Myoglobin
Type: Family
Description: Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms [ ]. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors []. Several functionally different haemoglobins can coexist in the same species. The major types of globins include: Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates [ ]. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors []. Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle [ ]. Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia [ ]. Neuroglobin belongs to a branch of the globin family that diverged early in evolution. Cytoglobin: an oxygen sensor expressed in multiple tissues. 
Related to neuroglobin [ ]. Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers [ ]. Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation. Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants [ ]. Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin [ ]. Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression [ , ]. Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors [ ]. Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 α-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features [ ]. This entry represents myoglobin (Mb). Mb is an intracellular haemoprotein expressed in the heart and oxidative skeletal myofibres of vertebrates that reversibly binds molecular oxygen by its haem. Mb functions as an oxygen storage protein in muscle that is capable of releasing oxygen during periods of hypoxia or anoxia [ ]. Mb is also believed to facilitate oxygen transport from erythrocytes to mitochondria to maintain cellular respiration during periods of high physiological demand; however, mice lacking myoglobin appear to function normally []. 
Mb does appear to have an additional function scavenging nitric oxide and reactive oxygen species in the heart []. Mb binds oxygen in the reduced [Fe(II)] state. The Mb molecule exists as a monomer that binds haem. The 3D structures of a great number of vertebrate Mb proteins in various states are known. The protein is largely α-helical, with eight conserved helices (A to H) providing the scaffold for a well-defined haem-binding pocket. The imidazole ring of the 'proximal' His residue provides the fifth haem iron ligand; the other axial haem iron position remains essentially free for oxygen coordination. Oxygen binding results in a transition from high-spin to low-spin iron, with accompanying changes in the Fe-N bond lengths and coordination geometry.
Protein Domain
Name: Clan AA aspartic peptidase, C-terminal
Type: Domain
Description: Proteins containing this domain are aspartic endopeptidases belonging to MEROPS family A32 and MEROPS clan AA (from conservation of the DTG motif that includes the active site aspartic acid and the fold, which is similar to retropepsin) [ , ]. This entry describes the well-conserved C-terminal domain, of approximately 120 amino acid residues. The poorly conserved, variable length N-terminal region usually contains a predicted transmembrane helix. In the peptidase from Rickettsia conorii, the active site Asp has been identified by site-directed mutagenesis and the protein is active only as a homodimer [ ]. Proteins containing this domain include PerP from Caulobacter crescentus, which is a periplasmic protein that has been shown to be important for the transition from a motile to a sessile form of the bacterium by processing protein PodJ. The full-length protein PodJ recruits proteins for formation of the pilus, associated with the motile phase, whereas truncated PodJ remains attached to the membrane and recruits proteins for stalk formation, which is associated with the sessile phase [ ]. Aspartic peptidases, also known as aspartyl proteases (EC 3.4.23.-), are widely distributed proteolytic enzymes [, , ] known to exist in vertebrates, fungi, plants, protozoa, bacteria, archaea, retroviruses and some plant viruses. All known aspartic peptidases are endopeptidases. A water molecule, activated by two aspartic acid residues, acts as the nucleophile in catalysis. Aspartic peptidases can be grouped into five clans, each of which shows a unique structural fold []. Peptidases in clan AA are either bilobed (family A1 or the pepsin family) or are a homodimer (all other families in the clan, including retropepsin from HIV-1/AIDS) [ ]. Each lobe consists of a single domain with a closed β-barrel and each lobe contributes one Asp to form the active site. 
Most peptidases in the clan are inhibited by the naturally occurring small-molecule inhibitor pepstatin []. Clan AC contains the single family A8: the signal peptidase 2 family. Members of the family are found in all bacteria. Signal peptidase 2 processes the premurein precursor, removing the signal peptide. The peptidase has four transmembrane domains and the active site is on the periplasmic side of the cell membrane. Cleavage occurs on the amino side of a cysteine where the thiol group has been substituted by a diacylglyceryl group. Site-directed mutagenesis has identified two essential aspartic acid residues which occur in the motifs GNXXDRX and FNXAD (where X is a hydrophobic residue) [ ]. No tertiary structures have been solved for any member of the family, but because of the intramembrane location, the structure is assumed not to be pepsin-like. Clan AD contains two families of transmembrane endopeptidases: A22 and A24. These are also known as "GXGD peptidases" because of a common GXGD motif which includes one of the pair of catalytic aspartic acid residues. Structures are known for members of both families and show a unique, common fold with up to nine transmembrane regions [ ]. The active site aspartic acids are located within a large cavity in the membrane into which water can gain access []. Clan AE contains two families, A25 and A31. Tertiary structures have been solved for members of both families and show a common fold consisting of an α-β-α sandwich, in which the β-sheet is five-stranded [ , ]. Clan AF contains the single family A26. Members of the clan are membrane proteins with a unique fold. Homologues are known only from bacteria. 
The structure of omptin (also known as OmpT) shows a cylindrical barrel containing ten β-strands inserted in the membrane with the active site residues on the outer surface [ ]. There are two families of aspartic peptidases for which neither structure nor active site residues are known and these are not assigned to clans. Family A5 includes thermopsin, an endopeptidase found only in thermophilic archaea. Family A36 contains sporulation factor SpoIIGA, which is known to process and activate sigma factor E, one of the transcription factors that controls sporulation in bacteria [ ].
Protein Domain
Name: Putative quorum-sensing-regulated virulence factor
Type: Family
Description: This is a family of short Pseudomonas proteins that are potential virulence factors. The structure of a secreted protein has been solved and deposited as PDB:3npd, from Pfam:PF13652. It is predicted that these two adjacent proteins form a single transcriptional unit, based on the prediction that together they interact with their adjacent protein PotD, which is the putrescine-binding periplasmic protein in the polyamine uptake system comprising PotABCD. These two adjacent proteins are predicted to be quorum-sensing-regulated virulence factors [ ].
Protein Domain
Name: Spike glycoprotein, betacoronavirus
Type: Family
Description: This entry represents the spike glycoprotein mostly from Betacoronavirus. It can be cleaved into three chains: spike protein S1, S2 and S2'. Spike protein S1 attaches the virion to the cell membrane by interacting with host receptor, initiating the infection. Spike protein S2 mediates fusion of the virion and cellular membranes by acting as a class I viral fusion protein. Spike protein S2' acts as a viral fusion peptide which is unmasked following S2 cleavage occurring upon virus endocytosis [ , ].
Protein Domain
Name: Peptidase C2, calpain, catalytic domain
Type: Domain
Description: This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium [ , ]. The protein is a complex of 2 polypeptide chains (light and heavy), with eleven known active peptidases in humans and two non-peptidase homologues known as calpamodulin and androglobin []. These include a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only []. All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28 kDa subunit and an 80 kDa subunit that shares 55-65% sequence homology between the two proteases [ , ]. The crystallographic structure of m-calpain reveals six "domains" in the 80 kDa subunit [ , ]: (1) a 19-amino acid NH2-terminal sequence; (2) active site domain IIa; (3) active site domain IIb (domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related []); (4) domain III; (5) an 18-amino acid extended sequence linking domain III to domain IV; and (6) domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity [ ]. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad [ ]. Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. 
How calpain activity is regulated in these organisms' cells is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma [ ]. Calpains are a family of cytosolic cysteine proteinases. Members of the calpain family are believed to function in various biological processes, including integrin-mediated cell migration, cytoskeletal remodeling, cell differentiation and apoptosis [ , ]. The calpain family includes numerous members from C. elegans to mammals, with homologues in yeast and bacteria. The best characterised members are the m- and mu-calpains, both of which are heterodimers composed of a large catalytic subunit and a small regulatory subunit. The large subunit comprises four domains (dI-dIV) while the small subunit has two domains (dV-dVI). Domain dI is a short region cleaved by autolysis, dII is the catalytic core, dIII is a C2-like domain, and dIV consists of five calcium-binding EF-hand motifs [ ]. The crystal structure of calpain has been solved [ , ]. The catalytic region consists of two distinct structural domains (dIIa and dIIb). dIIa contains a central helix flanked on three faces by a cluster of α-helices and is entirely unrelated to the corresponding domain in the typical thiol proteinases. 
The fold of dIIb is similar to the corresponding domain in other cysteine proteinases and contains two three-stranded anti-parallel β-sheets. The catalytic triad residues (C,H,N) are located in dIIa and dIIb. The activation of the domain is dependent on the binding of two calcium atoms in two non-EF-hand calcium-binding sites located in the catalytic core, one close to the Cys active site in dIIa and one at the end of dIIb. Calcium binding induces conformational changes in the catalytic domain which align the active site [ ][]. The profile covers the whole catalytic domain. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. 
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [ ]. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. 
The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Phage P22 tailspike-like, N-terminal domain superfamily
Type: Homologous_superfamily
Description: The tailspike protein of Salmonella bacteriophage P22 is a viral adhesion protein that mediates attachment of the viral particle to host cell-surface lipopolysaccharide. The tailspike protein displays both receptor-binding and receptor-destroying properties, inactivating the receptor by endoglycosidase activity. The N-terminal, head-binding domain mediates the non-covalent attachment of the six homotrimeric tailspike molecules to the DNA injection apparatus [ ]. The N-terminal domain of the P22 tailspike protein shows significant sequence similarity to the N-terminal domain of the Shigella phage Sf6 tailspike protein [].
Protein Domain
Name: Bacteriophage P22 tailspike, N-terminal
Type: Domain
Description: The tailspike protein of Salmonella bacteriophage P22 is a viral adhesion protein that mediates attachment of the viral particle to host cell-surface lipopolysaccharide. The tailspike protein displays both receptor-binding and receptor-destroying properties, inactivating the receptor by endoglycosidase activity. The N-terminal, head-binding domain mediates the non-covalent attachment of the six homotrimeric tailspike molecules to the DNA injection apparatus [ ]. The N-terminal domain of the P22 tailspike protein shows significant sequence similarity to the N-terminal domain of the Shigella phage Sf6 tailspike protein [].
Protein Domain
Name: BPM, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of the plant BTB/POZ and MATH domain-containing (BPM) proteins. Proteins in this entry include Arabidopsis BPM1-6. Arabidopsis BPM proteins may act as a substrate-specific adapter of an E3 ubiquitin-protein ligase complex (CUL3-RBX1-BTB) which mediates the ubiquitination and subsequent proteasomal degradation of target proteins [ ]. BPM proteins have also been shown to interact with members of the ethylene response factor/Apetala2 (ERF/AP2) transcription factor family and may play an important role in plant development and stress tolerance [, ].
Protein Domain
Name: Bacteriophage M13, G5P, DNA-binding
Type: Family
Description: This entry is represented by the Bacteriophage M13, G5P, DNA-binding protein. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. G5P is the bacteriophage helix-destabilising protein, or single-stranded DNA binding protein, required for DNA synthesis. The protein binds to DNA in a highly cooperative manner without pronounced sequence specificity. In the presence of single-stranded DNA it binds cooperatively to form a helical protein-DNA complex. It prevents the conversion during synthesis of the single-stranded (progeny) viral DNA back into the double-stranded replicative form.
Protein Domain
Name: Globular protein, non-globular alpha/beta subunit
Type: Homologous_superfamily
Description: Certain globular proteins from different functional families contain non-globular alpha+beta subunits. These proteins do not form a true superfamily, but share a similar topology in their non-globular subunits. Proteins that display this structure include phycoerythrin alpha subunits, which are light-harvesting biliproteins found in cryptomonads [ ], the chaperone-binding domain of the virulence effector SptP from Salmonella typhimurium, which acts to modulate host cellular responses [], and the ubiquinol-cytochrome c reductase 8kDa protein, an integral membrane protein complex essential to cellular respiration [ ].
Protein Domain
Name: Pup ligase/deamidase
Type: Family
Description: Pupylation is a novel protein modification system found in some bacteria [ ]. This entry represents two related groups of proteins involved in this system. Pup ligases, such as PafA, conjugate the prokaryotic ubiquitin-like protein Pup to lysine residues in target proteins, marking them for degradation []. Pup deamidases, such as PafD, catalyse the deamidation of the Pup C-terminal glutamine to glutamate, thereby rendering the protein competent for conjugation []. It has been suggested that proteins in this entry are related to gamma-glutamyl-cysteine synthetases [].
Protein Domain
Name: GREB1
Type: Family
Description: GREB1 (gene regulated in breast cancer 1 protein) is an ESR1 (estrogen receptor 1)-upregulated protein that may regulate estrogen action and is associated with estrogen-stimulated cell proliferation [ ]. It acts as a regulator of hormone-dependent cancer growth in breast, ovarian and prostate cancers [, , , ]. This protein may be a target to inhibit tumour-promoting pathways both downstream and independent of ESR1 as a possible treatment strategy. This entry represents a protein family that includes Protein GREB1 and similar proteins from chordates.
Protein Domain
Name: UBL3-like
Type: Family
Description: Proteins in this entry possess a ubiquitin-2 like Rad60 SUMO-like domain. These ubiquitin-fold proteins are plasma membrane-anchored by prenylation [ ]. Proteins include mammalian ubiquitin-like protein 3, which has a ubiquitous distribution [], and plant membrane-anchored ubiquitin-fold proteins 1 to 6, which, in Arabidopsis, recruit and dock specific E2 ubiquitin-conjugating enzymes to the plasma membrane. They appear to interact noncovalently with an E2 surface opposite the active site that forms a covalent linkage with Ub []. This entry also includes uncharacterised proteins from fungi.
Protein Domain
Name: Arginine-tRNA-protein transferase 1, eukaryotic
Type: Family
Description: Arginine-tRNA-protein transferase catalyses the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the N-terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis [ ]. In Saccharomyces cerevisiae, Cys20, Cys23, Cys94 and/or Cys95 are thought to be important for activity []. Of these, only Cys94 appears to be completely conserved in this family. This entry represents eukaryotic type arginyl-tRNA--protein transferase 1 [ ].
Protein Domain
Name: Bromo adjacent homology (BAH) domain
Type: Domain
Description: The BAH (bromo-adjacent homology) domain is commonly found in chromatin-associated proteins [ ]. It is found in proteins such as eukaryotic DNA (cytosine-5) methyltransferases, the origin recognition complex 1 (Orc1) proteins, as well as several proteins involved in transcriptional regulation. The BAH domain appears to act as a protein-protein interaction module specialised in gene silencing, as suggested for example by its interaction within yeast Orc1p with the silent information regulator Sir1p. The BAH module might therefore play an important role by linking DNA methylation, replication and transcriptional regulation [ ].
Protein Domain
Name: Outer membrane protein, OmpA-like, conserved site
Type: Conserved_site
Description: Most of the bacterial outer membrane proteins in this group are porin-like integral membrane proteins (such as ompA) [ ], but some are small lipid-anchored proteins (such as pal) []. It is also found in MotB and related proteins. They are present in the outer membrane of many Gram-negative organisms []. This domain is found at the C-terminal half of these proteins and is well conserved. The N-terminal half is variable although some of the proteins in this group have the OmpA-like transmembrane domain at the N terminus.
Protein Domain
Name: Envelope fusion protein-like
Type: Family
Description: This entry includes the Baculovirus envelope fusion protein, which is an envelope glycoprotein mediating the fusion of viral and host endosomal membranes leading to virus entry into the host cell [ ]. Protein AC23, which is a nonfunctional F-like protein from the Autographa californica nuclear polyhedrosis virus, is a pathogenicity factor that accelerates mortality in the host insect []. This entry also includes transposable elements that have sequence similarity to the envelope fusion proteins of Baculovirus, such as retrovirus-related Env polyprotein from copia-like transposable element 17.6 (Env 17.6) from Drosophila melanogaster [ ].
Protein Domain
Name: Ribonuclease Z/Hydroxyacylglutathione hydrolase-like
Type: Homologous_superfamily
Description: Proteins in this superfamily contain a fold consisting of a duplication of the β(4)-α-β-α motif. Apart from the beta-lactamases and metallo-beta-lactamases, a number of other proteins also contain this fold [ ]. These proteins include thiolesterases, members of the glyoxalase II family, that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and D-lactic acid, and a competence protein that is essential for natural transformation in Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the competence protein, these proteins bind two zinc ions per molecule as cofactor.
Protein Domain
Name: Beta-carotene 15,15'-monooxygenase, Brp/Blh family
Type: Family
Description: This prokaryotic integral membrane protein family includes Brp (bacterio-opsin related protein) and Blh (Brp-like protein). Bacteriorhodopsin is a light-driven proton pump consisting of the membrane apoprotein bacterioopsin and a covalently bound retinal cofactor that appears to be derived from beta-carotene. Blh has been shown to cleave beta-carotene [ ] to produce two all-trans retinal molecules. It has been suggested that Brp and Blh are similar proteins that catalyze or regulate the conversion of beta-carotene to retinal []. Mammalian enzymes with similar enzymatic function are not multiple membrane spanning proteins and are not homologous.
Protein Domain
Name: Putative folate metabolism protein, CADD family
Type: Family
Description: This protein family, related to but outside the family of PqqC proteins involved in PQQ biosynthesis, includes the well-studied Chlamydia protein CADD (Chlamydia protein Associating with Death Domains), which can induce apoptosis in a host cell [ ]. Other members of this family occur in Rickettsia and Wolbachia, which are unrelated to Chlamydia in terms of phylogeny (both are alphaproteobacteria) but similar in living intracellularly. Local gene context in these species, although not in Trichodesmium or Nitrosomonas eutropha, suggests a role in folate metabolism, and some species with this protein lack FolE but have other folate synthesis proteins [].
Protein Domain
Name: Phosducin
Type: Family
Description: The outer and inner segments of vertebrate rod photoreceptor cells contain phosducin, a soluble phosphoprotein that complexes with the beta/gamma-subunits of the GTP-binding protein, transducin. Light-induced changes in cyclic nucleotide levels modulate the phosphorylation of phosducin by protein kinase A [ ]. The protein is thought to participate in the regulation of visual phototransduction or in the integration of photo-receptor metabolism. Similar proteins have been isolated from the pineal gland and it is believed that the functional role of the protein is the same in both retina and pineal gland [].
Protein Domain
Name: E3 ubiquitin-protein ligase CHIP
Type: Family
Description: CHIP is a multifunctional protein that functions both as a co-chaperone and an E3 ubiquitin-protein ligase. It couples protein folding and proteasome mediated degradation by interacting with heat shock proteins (e.g. HSC70) and ubiquitinating their misfolded client proteins thereby targeting them for proteasomal degradation [ , ]. It is also important for cellular differentiation and survival (apoptosis), as well as susceptibility to stress. It targets a wide range of proteins, such as expanded ataxin-1, ataxin-3, huntingtin, and androgen receptor, which play roles in glucocorticoid response, tau degradation, and both p53 and cAMP signaling [, ].
Protein Domain
Name: Leukocyte surface antigen CD47
Type: Family
Description: Leukocyte surface antigen CD47 (also known as integrin-associated protein, IAP) is a widely expressed membrane protein with multiple functions in immunological and neuronal processes. For example, CD47 induces caspase-independent cell death, which may be mediated by cytoskeleton reorganisation [ ]. In red blood cells, CD47 acts like a marker of self by ligating the macrophage inhibitory receptor signal regulatory protein alpha []. The protein can also act as a thrombospondin receptor []. This entry also includes a group of CD47-like proteins found in poxviruses (Poxviridae) such as protein A38 [ ].
Protein Domain
Name: E3 ubiquitin-protein ligase SH3RF2, RING finger, HC subclass
Type: Domain
Description: SH3RF2 is also called POSHER (POSH-eliminating RING protein) or HEPP1 (heart protein phosphatase 1-binding protein). It acts as an anti-apoptotic regulator of the JNK pathway by binding to and promoting the degradation of SH3RF1 (or POSH), a scaffold protein that is required for pro-apoptotic JNK activation [ ]. It may also play a role in cardiac functions together with protein phosphatase 1 []. SH3RF2 contains an N-terminal RING finger domain and three SH3 domains. This entry represents the N-terminal C3HC4-type RING-HC finger of SH3RF2 and similar proteins from vertebrates.
Protein Domain
Name: Electron transfer flavoprotein alpha subunit/FixB
Type: Family
Description: This entry includes electron transfer flavoprotein alpha subunit and the FixB protein. Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II) [ ]. ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP [ , , , , ]. ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical α-β-α sandwich fold, while domain II forms an α-β-α sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides. The alpha subunit of both Group I and Group II ETFs is composed of domains I and II. Many enterobacteria are able to convert carnitine, via crotonobetaine, to gamma-butyrobetaine in the presence of carbon and nitrogen sources under anaerobic conditions [ ]. In Escherichia coli the enzymes involved in this pathway are encoded by the caiTABCDE operon []. The adjacent but divergent fixABCD operon also appears to be necessary for carnitine metabolism []. The Fix proteins are homologous to proteins found in known electron transport pathways.
Protein Domain
Name: Timeless, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of the Timeless protein. The timeless gene in Drosophila melanogaster is involved in circadian rhythm control [ ]. Drosophila contains two paralogs, dTIM and dTIM2, acting in clock/photoreception and chromosome integrity/photoreception respectively. The mammalian TIMELESS (TIM) protein, originally identified based on its similarity to Drosophila dTIM, interacts with the clock proteins dCRY and dPER and is essential for circadian rhythm generation and photo-entrainment in the fly []. However, phylogenetic sequence analysis has demonstrated that dTIM2 is likely to be the orthologue of mammalian TIM and other widely conserved TIM-like proteins in eukaryotes []. These proteins include Saccharomyces cerevisiae Tof1, Schizosaccharomyces pombe Swi1, and Caenorhabditis elegans TIM. These proteins are not involved in the core clock mechanism, but instead play important roles in chromosome integrity, efficient cell growth and/or development [, ], with the exception of dTIM2, which has an additional function in retinal photoreception []. Saccharomyces cerevisiae Tof1 is a subunit of a replication-pausing checkpoint complex (Tof1-Mrc1-Csm3) that acts at the stalled replication fork to promote sister chromatid cohesion after DNA damage, facilitating gap repair of damaged DNA [ , ]. Schizosaccharomyces pombe Swi1 and Swi3 form the fork protection complex that coordinates leading- and lagging-strand synthesis and stabilizes stalled replication forks []. In humans, Timeless forms a stable complex with its partner protein Tipin. The Timeless-Tipin complex has been reported to travel along with the replication fork during unperturbed DNA replication. Moreover, the Timeless-Tipin-Claspin complex contributes to full activation of the ATR-Chk1 signaling pathway through the recruitment of Chk1 to arrested replication forks for sufficient ATR-mediated phosphorylation. It also interacts with PARP-1, and this interaction is required for efficient homologous recombination repair [ ].
Protein Domain
Name: Rho GDP-dissociation inhibitor domain superfamily
Type: Homologous_superfamily
Description: The GDP dissociation inhibitor for rho proteins, rho GDI, regulates GDP/GTP exchange by inhibiting the dissociation of GDP from them. The protein contains 204 amino acids, with a calculated Mr value of 23,421. Hydropathy analysis shows it to be largely hydrophilic, with a single hydrophobic region. The protein plays an important role in the activation of the superoxide (O2-)-generating NADPH oxidase of phagocytes. This process requires the interaction of membrane-associated cytochrome b559 with 3 cytosolic components: p47-phox, p67-phox and a heterodimer of the small G-protein p21rac1 and rho GDI [ ]. The association of p21rac and GDI inhibits dissociation of GDP from p21rac, thereby maintaining it in an inactive form. The proteins are attached via a lipid tail on p21rac that binds to the hydrophobic region of GDI []. Dissociation of these proteins might be mediated by the release of lipids (e.g., arachidonate and phosphatidate) from membranes through the action of phospholipases []. The lipids may then compete with the lipid tail on p21rac for the hydrophobic pocket on GDI. Two homologues of rho GDP-dissociation inhibitors have been identified in Dictyostelium: GDI1 and GDI2. They are cytosolic proteins. GDI1 has been found to play a central role in cytokinesis through the regulation of Rho family GTPases Rac1s and/or RacE [ , ]. Rho GDI in yeast has been shown to have properties similar to those of mammalian rho GDI [ ]. The rhoGDI structural domain contains both a structured, immunoglobulin-like fold, and a highly flexible N terminus of 50-60 residues [ ]. The N-terminal region becomes ordered upon complex formation and contributes more than 60% to the interface area [].
Protein Domain
Name: C-type lectin fold
Type: Homologous_superfamily
Description: Lectins occur in plants, animals, bacteria and viruses. Initially described for their carbohydrate-binding activity [ ], they are now recognised as a more diverse group of proteins, some of which are involved in protein-protein, protein-lipid or protein-nucleic acid interactions []. There are at least twelve structural families of lectins, of which the C-type (Ca2+-dependent) lectins are one. C-type lectins can be further divided into seven subgroups based on additional non-lectin domains and gene structure: (I) hyalectans, (II) asialoglycoprotein receptors, (III) collectins, (IV) selectins, (V) NK group transmembrane receptors, (VI) macrophage mannose receptors, and (VII) simple (single domain) lectins [].
This entry represents a structural domain found in C-type lectins, as well as in other proteins, including:
  • The N-terminal domain of aerolysin [ ] and the N-terminal domain of the S2/S3 subunit of pertussis toxin [].
  • The C-terminal domain of invasin [ ] and intimin [].
  • Link domain, which includes the Link module of TSG-6 [ ] (a hyaladherin with important roles in inflammation and ovulation) and the hyaluronan binding domain of CD44 (which contains an extra N-terminal β-strand and C-terminal β-hairpin) [ ].
  • Endostatin [ ] and the endostatin domain of collagen alpha 1 (XV) [], these domains being decorated with many insertions in the common fold.
  • The noncollagenous (NC1) domain of collagen IV, which consists of a duplication of the C-type lectin domain, with segment swapping within and between individual domains [ ].
  • Sulphatase-modifying factors (C-alpha-formylglycine-generating enzyme), where the fold is decorated with many additional structures [ , ].
  • The C-terminal domain of the major tropism determinant (Mtd), where the fold is decorated with many additional structures, and has an overall similarity to the sulphatase-modifying factor family but lacks the characteristic disulphide [ ].
Protein Domain
Name: ADD domain
Type: Domain
Description: One of the largest protein families in the human genome is the zinc finger family that contains members involved in the regulation of transcription processes. The zinc finger domains have been classified based on the order of cysteine (C) and histidine (H) residues. Zinc fingers are thought to mediate protein-DNA and protein-protein interactions. The ADD (ATRX, DNMT3, DNMT3L) domain is a cysteine-rich region that consists of a C2C2-type zinc finger and a closely located domain of an imperfect PHD-type zinc finger with C4C4. The region between the two subdomains has a constant length, and it contains identical and conserved amino acids [, ]. The ADD domain binds to the histone H3 tail that is unmethylated at lysine 4 [, ].
The ADD domain is present in chromatin-associated proteins that play a role in establishing and/or maintaining a normal pattern of DNA methylation:
  • DNMT3A, DNMT3B, DNA methyltransferases.
  • DNMT3L, a DNMT3-like enzymatically inactive regulatory factor.
  • ATRX, a large nuclear protein predominantly localized to heterochromatin and nuclear PML bodies. At the C terminus is a helicase/ATPase domain, which characterises ATRX as a member of the SNF2 (SWI/SNF) family of chromatin-associated proteins.
The ADD domain is composed of three clearly distinguishable modules that pack together through extensive hydrophobic interactions to form a single globular domain. Packed against this GATA-like finger is a second subdomain, which binds two zinc ions and closely resembles the structure reported for several PHD fingers. Finally, there is a long C-terminal α-helix that runs out from the PHD finger and makes extensive hydrophobic contacts with the N-terminal GATA finger, bringing the N- and C-termini of the ADD domain close together. This combination of fused GATA-like and PHD fingers within a single domain is thus far unique [, ].
Protein Domain
Name: Protein-tyrosine phosphatase, low molecular weight
Type: Family
Description: Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [, ].
The PTP superfamily can be divided into four subfamilies []:
  • (1) pTyr-specific phosphatases
  • (2) dual specificity phosphatases (dTyr and dSer/dThr)
  • (3) Cdc25 phosphatases (dTyr and/or dThr)
  • (4) LMW (low molecular weight) phosphatases
Based on their cellular localisation, PTPases are also classified as:
  • Receptor-like, which are transmembrane receptors that contain PTPase domains [ ]
  • Non-receptor (intracellular) PTPases [ ]
All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [ ]. Functional diversity between PTPases is endowed by regulatory domains and subunits. This entry represents the low molecular weight (LMW) protein-tyrosine phosphatases (or acid phosphatases), which act on tyrosine phosphorylated proteins, low-MW aryl phosphates and natural and synthetic acyl phosphates [ , ]. The structure of a LMW PTPase has been solved by X-ray crystallography [] and is found to form a single structural domain. It belongs to the alpha/beta class, with 6 α-helices and 4 β-strands forming a 3-layer α-β-α sandwich architecture.
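The C(X)5R signature mentioned above (a cysteine, any five residues, then an arginine) can be located in a sequence with a simple pattern scan. The following is a minimal sketch in Python, not the database's own matching method. The function name `find_ptp_signature` and the toy sequence are hypothetical and chosen only for illustration. A match shows that the motif is present, not that the protein is a PTPase.

```python
import re

# C(X)5R: cysteine, any five residues, arginine.
# A lookahead is used so that overlapping occurrences are also reported.
PTP_SIGNATURE = re.compile(r"(?=(C[A-Z]{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based start position, matched motif) for each C(X)5R hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


# Hypothetical toy sequence, not taken from any database entry.
toy_sequence = "MAEQVTKSVLFVCLGNICRSPIAEAVF"

if __name__ == "__main__":
    for position, motif in find_ptp_signature(toy_sequence):
        print(position, motif)
```

Positions are reported 1-based, following the usual convention for residue numbering.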
Protein Domain
Name: Glutathione S-transferase, C-terminal-like
Type: Domain
Description: In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. GST is found as a domain in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST [ ]. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family []. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol. Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain, which adopts a 4-helical bundle fold. This entry represents the C-terminal domain. Glutaredoxin 2 (Grx2), a glutathione-dependent disulphide oxidoreductase, is structurally similar to GSTs, even though the two lack any sequence similarity. Grx2 is also composed of N- and C-terminal subdomains. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with glutathione in cellular redox regulation, including the response to oxidative stress. Grx2 is dissimilar to other glutaredoxins apart from containing the conserved active site residues [ ].
Protein Domain
Name: SLC26A/SulP transporter
Type: Family
Description: The SLC26A/SulP family is a large and ubiquitous family with members derived from archaea, bacteria, fungi, plants and animals. Many organisms including Bacillus subtilis, Synechocystis sp, Saccharomyces cerevisiae, Arabidopsis thaliana and Caenorhabditis elegans possess multiple SulP family paralogues. Many of these proteins are functionally characterised, and most are inorganic anion uptake transporters or anion:anion exchange transporters. Some transport their substrate(s) with high affinities, while others transport it or them with relatively low affinities [ , , ].
SLC26A/SulP family proteins consist of N- and C-termini flanking a transmembrane domain thought to span the lipid bilayer 10-14 times. In most cases, the C-terminal cytoplasmic region includes a STAS (sulfate transporter and anti-sigma factor antagonist) domain [ ].
Malfunctions in members of the SLC26A family of anion transporters are involved in three human diseases: diastrophic dysplasia/achondrogenesis type 1B (DTDST), Pendred's syndrome (PDS) and congenital chloride diarrhoea (CLD). These proteins contain 12 transmembrane helices followed by a cytoplasmic STAS domain at the C terminus. The importance of the STAS domain in these transporters is illustrated by the fact that a number of mutations in PDS and DTDST map to it [ ].
Proteins in this family include:
  • Neurospora crassa sulphate permease II (gene cys-14).
  • Yeast sulphate permeases (SUL1 and SUL2).
  • Rat sulphate anion transporter 1 (SAT-1).
  • Mammalian DTDST, a probable sulphate transporter which, in human, is involved in the genetic disease diastrophic dysplasia (DTD).
  • Sulphate transporters 1, 2 and 3 from the legume Stylosanthes hamata.
  • Human pendrin (gene PDS), which is involved in a number of hearing loss genetic diseases.
  • Human protein DRA (Down-Regulated in Adenoma).
  • Soybean early nodulin 70.
  • Escherichia coli hypothetical protein YchM.
  • Caenorhabditis elegans hypothetical protein F41D9.5.
Protein Domain
Name: PrpE-like, metallophosphatase domain
Type: Domain
Description: This entry represents the metallophosphatase domain of Bacillus subtilis PrpE and related proteins. PrpE (protein phosphatase E) is a bacterial member of the PPP (phosphoprotein phosphatase) family of serine/threonine phosphatases and a key signal transduction pathway component controlling the expression of spore germination receptors GerA and GerK in Bacillus subtilis. PrpE is closely related to ApaH (also known as symmetrical Ap(4)A hydrolase and bis(5'-nucleosyl)-tetraphosphatase). PrpE has specificity for phosphotyrosine only, unlike the serine/threonine phosphatases to which it is related [ ]. The Bacilli members of this family are single domain proteins while the other members have N- and C-terminal domains in addition to this phosphatase domain. Polynucleotide kinase/phosphatase (Pnkp) is the end-healing and end-sealing component of an RNA repair system present in bacteria []. It is composed of three catalytic modules: an N-terminal polynucleotide 5' kinase, a central 2',3' phosphatase, and a C-terminal ligase. Pnkp is a Mn(2+)-dependent phosphodiesterase-monoesterase that dephosphorylates 2',3'-cyclic phosphate RNA ends. An RNA binding site is suggested by a continuous tract of positive surface potential flanking the active site []. The PPP (phosphoprotein phosphatase) family, to which PrpE belongs, is one of two known protein phosphatase families specific for serine and threonine. The PPP family also includes: PP1, PP2A, PP2B (calcineurin), PP4, PP5, PP6, PP7, Bsu1, RdgC, PrpA/PrpB, and Ap(4)A hydrolase. The PPP catalytic domain is defined by three conserved motifs (-GDXHG-, -GDXVDRG- and -GNHE-). The PPP enzyme family is ancient, with members found in all eukaryotes and in most bacterial and archaeal genomes. Dephosphorylation of phosphoserines and phosphothreonines on target proteins plays a central role in the regulation of many cellular processes [ , ]. PPPs belong to the metallophosphatase (MPP) superfamily.
Protein Domain
Name: Fetuin-A-type cystatin domain
Type: Domain
Description: The cystatin superfamily consists of a large group of cystatin domain-containing proteins, most of which are reversible and tight-binding inhibitors of the papain (C1) and legumain (C13) families of cysteine proteases. Fetuins have been identified as main members of the cystatin superfamily and are composed of fetuin-A and fetuin-B. Fetuins are characterised by the presence of 2 N-terminally located cystatin-like repeats and a unique C-terminal domain which is not present in other proteins of the cystatin family [, , ]. Fetuin-A [ , , , , ], also called alpha-2-HS-glycoprotein, bone sialic acid-containing protein (BSP), countertrypin or PP63, is expressed in a tissue- and development-specific pattern which seems to be significantly different between species. A wide functional diversity of fetuin-A has been observed. It has been shown to function in many physiological aspects, such as fatty acid transport, regulation of insulin activity and hepatocyte-growth-factor activity, response to systemic inflammation, and inhibition of unwanted mineralization. It has been demonstrated that fetuin-A plays important roles during developmental processes, including osteogenesis, myotubule, fetal brain and nervous system development. Human fetuin is a heterodimer of chain A and B, which are derived by cleavage of a connecting peptide from a common precursor. Snake fetuin family proteins (antihemorrhagic proteins HSF, BJ46a and MSF and HLP-A and HLP-B) show a significant degree of sequence homology to fetuin-A [ ]. The cystatin fold is formed by a five-stranded anti-parallel β-sheet wrapped around a five-turn α-helix []. Fetuin contains twelve conserved cysteines involved in six disulphide bonds. Eleven of the twelve invariant cysteines are located within the cystatin-like repeats. The 12th cysteine is located near the C terminus of the protein, separated by a region of variable length.
Protein Domain
Name: Glutathione S-transferase, C-terminal domain superfamily
Type: Homologous_superfamily
Description: In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. GST is found as a domain in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST [ ]. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family []. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol. Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain, which adopts a 4-helical bundle fold. This entry represents the C-terminal domain. Glutaredoxin 2 (Grx2), a glutathione-dependent disulphide oxidoreductase, is structurally similar to GSTs, even though the two lack any sequence similarity. Grx2 is also composed of N- and C-terminal subdomains. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with glutathione in cellular redox regulation, including the response to oxidative stress. Grx2 is dissimilar to other glutaredoxins apart from containing the conserved active site residues [ ].
Protein Domain
Name: Hemocyanin, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Crustacean and cheliceratan hemocyanins (oxygen-transport proteins) and insect hexamerins (storage proteins) are homologous gene products, although the latter do not bind oxygen. Hemocyanins are found in the hemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit, whereas the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases. Hexamerins are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. They do not possess the copper-binding histidines present in hemocyanins. Homologues are also present in other kinds of organism, for example, cyclopenase AsqI from the fungus Emericella nidulans and cyclopenase PenL from Penicillium thymicola. AsqI is a tyrosinase involved in biosynthesis of the aspoquinolone mycotoxins, though its exact function is unknown. PenL is part of the gene cluster that mediates the biosynthesis of penigequinolones, potent insecticidal alkaloids that contain a highly modified 10-carbon prenyl group. This entry represents the N-terminal domain of hemocyanin and hexamerin. This domain is composed of six helices arranged in a bundle fold in which one central helix is surrounded by 5 others.
Protein Domain
Name: Sortase B, Firmicutes
Type: Family
Description: Members of this transpeptidase family are, in most cases, designated sortase B, product of the srtB gene. This protein shows only distant similarity to the sortase A family, for which there may be several members in a single bacterial genome. Typical SrtB substrate motifs include NAKTN and NPKSS, among others, and otherwise resemble the LPXTG sorting signals recognised by sortase A proteins. Sortase B enzymes are membrane cysteine transpeptidases found in Gram-positive bacteria that anchor surface proteins to peptidoglycans of the bacterial cell wall envelope. This involves a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes and subfamilies based on sequence, membrane topology, genomic positioning, and cleavage site preference. Sortase B cleaves surface protein precursors between threonine and asparagine at a conserved NPQTN motif, with subsequent covalent linkage to peptidoglycan. It is required for anchoring the heme-iron binding surface protein IsdC to the cell wall envelope, and the gene encoding sortase B is located within the isd locus in S. aureus and B. anthracis. It may also play a role in pathogenesis. Sortase B contains an N-terminal region that functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. At the C terminus, it contains the catalytic TLXTC signature sequence, where X is usually a serine. Genes encoding SrtB and its targets are generally clustered in the same locus. This entry represents the Firmicute sortase B proteins.
Protein Domain
Name: Internalin I, Ig-like domain
Type: Domain
Description: Internalins (Inl) are virulence factors exposed on the surface of many Gram-positive bacteria and are involved in several processes, ranging from recognition of cellular receptors and pathogen entry to escape from autophagy. These surface proteins were found to be markers for Listeria monocytogenes and essential for infection. Internalins are modular proteins composed of the characteristic N-terminal LRR repeats followed by domains that may vary among members of the internalin family, explaining the distinct roles they play during infection. The LRR repeats are followed by a conserved immunoglobulin-like domain, a variable region that corresponds to MucBP (mucin binding protein) domains, and a C-terminal LPXTG motif, which is also characteristic of these proteins and is expected to anchor the protein to the peptidoglycan, as seen in internalin structures such as InlA, InlI, InlJ or InlK. The solved structure of InlK revealed the classical LRR domain at its N terminus, followed by three domains generating a structure that resembles a "bent arm", with a potential partner recognition domain localised at the "elbow" region. Domains D1 and D2 are potentially involved in binding to protein partners, while D3 and D4 most probably serve as pedestals. Elbow and pedestal domains might act as platforms for binding partner molecules. InlK recruits the Major vault protein (MVP) in an interaction that allows L. monocytogenes to escape from autophagy. This entry corresponds to a conserved Ig-like fold domain, also known as the inter-repeat region that follows the LRRs, and corresponds to the elbow domain described in the InlK structure. This domain and the LRRs are required and sufficient for attachment to host cells.
Protein Domain
Name: Nuclear factor of activated T-cells 5, Rel homology domain, DNA-binding domain
Type: Domain
Description: Nuclear factor of activated T-cells 5 (NFAT5) is a member of the nuclear factors of activated T cells (NFAT) family of transcription factors. Proteins belonging to this family play a central role in inducible gene transcription during the immune response. This protein regulates gene expression induced by osmotic stress in mammalian cells. NFAT5, regulated by DDX5/DDX17, plays a role in the migratory capacity of breast cancer cells. Unlike monomeric members of this protein family, this protein exists as a homodimer and forms stable dimers with DNA elements. Five transcript variants encoding three different isoforms have been identified for this gene. NFAT proteins appear to be regulated primarily at the level of their subcellular localisation. They are found exclusively in the cytoplasm of resting T cells, and consist of 2 components: a pre-existing cytoplasmic component that translocates into the nucleus on calcium mobilisation, and an inducible nuclear component comprising members of the activating protein-1 (AP-1) family of transcription factors. In response to antigen receptor signalling, the calcium-regulated phosphatase calcineurin acts directly to dephosphorylate NFAT proteins, causing their rapid translocation from the cytoplasm to the nucleus, where they cooperatively bind their target. The Rel homology domain (RHD) is found in a family of eukaryotic transcription factors, which includes NF-kappaB, Dorsal, Relish and NFAT, among others. The RHD is composed of two structural domains: the N-terminal DNA binding domain, which is similar to that found in P53, and the C-terminal domain, which has an immunoglobulin-like fold and functions as a dimerisation domain. This entry represents the N-terminal DNA binding domain of NFAT5.
Protein Domain
Name: Vgr protein, OB-fold domain superfamily
Type: Homologous_superfamily
Description: This domain occurs in a family of phage (and bacteriocin) proteins related to the phage P2 V gene product, which forms the small spike at the tip of the tail. Homologs in general are annotated as baseplate assembly protein V. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain. It is also found in Vgr-related proteins. Genes encoding type VI secretion systems (T6SS) are widely distributed in pathogenic Gram-negative bacterial species. In Vibrio cholerae, T6SS have been found to secrete three related proteins extracellularly: VgrG-1, VgrG-2, and VgrG-3. VgrG-1 can covalently cross-link actin in vitro, and this activity was used to demonstrate that V. cholerae can translocate VgrG-1 into macrophages by a T6SS-dependent mechanism. VgrG-related proteins likely assemble into a trimeric complex that is analogous to that formed by the two trimeric proteins gp27 and gp5 that make up the baseplate "tail spike" of Escherichia coli bacteriophage T4. The VgrG components of the T6SS apparatus might assemble a "cell-puncturing device" analogous to phage tail spikes to deliver effector protein domains through membranes of target host cells. Gp5 is an integral component of the virion baseplate of bacteriophage T4. T4 Gp5 consists of 3 domains connected via long linkers: the N-terminal oligosaccharide/oligonucleotide-binding (OB)-fold domain, the middle lysozyme domain, and the C-terminal triple-stranded β-helix. The equivalent of the Gp5 OB-fold domain in the structure of VgrG is the domain of unknown function comprising residues 380-470 and conserved in all known VgrGs. This entry represents the OB-fold domain, which consists of a 5-stranded antiparallel β-barrel with a Greek-key topology.
Protein Domain
Name: MAGE homology domain
Type: Domain
Description: The first mammalian members of the MAGE (melanoma-associated antigen) gene family were originally described as completely silent in normal adult tissues, with the exception of male germ cells and, for some of them, placenta. By contrast, these genes were expressed in various kinds of tumors. However, other members of the family were recently found to be expressed in normal cells, indicating that the family is larger and more disparate than initially expected. MAGE-like genes have also been identified in non-mammalian species, including Drosophila melanogaster (Fruit fly) and Danio rerio (Zebrafish). Although no MAGE homologous sequences have been identified in Caenorhabditis elegans, Saccharomyces cerevisiae (Baker's yeast) or Schizosaccharomyces pombe (Fission yeast), MAGE sequences have been found in several plant species, including Arabidopsis thaliana (Mouse-ear cress). The only region of homology shared by all of the members of the family is a stretch of about 200 amino acids which has been named the MAGE homology domain. The MAGE homology domain is usually located close to the C terminus, although it can also be found in a more central position in some proteins. The MAGE homology domain is generally present as a single copy but it is duplicated in some proteins. It has been proposed that the MAGE homology domain of MAGE-D proteins might interact with p75 neurotrophin or related receptors. Proteins known to contain a MAGE domain are listed below: Human MAGE-A, -B and -C proteins. MAGE-A, -B and -C genes are silent in all normal tissues with the exception of testis. Human MAGE-D to -L proteins. MAGE-D to -L genes are expressed in normal adult tissues. Mouse Mage-a and -b proteins. Mouse Mage-d, -e, -g, -h, -k and -l. Mammalian Necdin. The human Necdin gene is a candidate for the Prader-Willi syndrome.
Protein Domain
Name: Rhomboid protease GlpG
Type: Family
Description: This entry consists of rhomboid protease GlpG and its homologues without the conserved peptidase active sites. GlpG in E. coli is a rhomboid family intramembrane serine protease that has been extensively characterised as a proxy for rhomboid family proteases in animals. It efficiently cleaves eukaryote-derived model substrates. This multiple membrane-spanning protein excludes inappropriate substrates from access to its cleavage site, and shows activity against truncated versions, but not full-length versions, of the E. coli multidrug transporter MdfA. This finding suggests a housekeeping function in removing faulty proteins. In contrast, several eukaryotic rhomboid family proteases release peptide hormones for signaling functions, and the Shewanella and Vibrio protein rhombosortase appears to be part of a protein-sorting system, cleaving a C-terminal anchoring helix domain. GlpG belongs to the MEROPS peptidase family S54 (Rhomboid, clan ST). The tertiary structure of the GlpG protein from E. coli has been determined. The GlpG protein has six transmembrane domains (other members of the family are predicted to have seven), with the N- and C-terminal ends anchored in the cytoplasm. One transmembrane domain is shorter than the rest, creating an internal, aqueous cavity just below the membrane surface, and it is here where proteolysis occurs. There is also a membrane-embedded loop between the first and second transmembrane domains which is postulated to act as a gate controlling substrate access to the active site. No other family of serine peptidases is known to have active site residues within transmembrane domains (although transmembrane active sites are known for aspartic peptidases and metallopeptidases), and the GlpG protein has the type structure for clan ST.
Protein Domain
Name: Transcription factor, Skn-1-like, DNA-binding domain superfamily
Type: Homologous_superfamily
Description: The DNA-binding domain of certain eukaryotic transcription factors displays a distinctive helix-turn-helix (HTH) motif. The MafG basic region-leucine zipper (bZIP) protein and the Caenorhabditis elegans Skn-1 transcription factor share this HTH motif. MafG is a member of the Maf family of proteins, which are a subgroup of bZIP proteins that function as transcriptional regulators of cellular differentiation. Mafs can form either homodimers, or heterodimers with other bZIP proteins, through their leucine zipper domains. MafG proteins are small Mafs that lack a putative transactivation domain. The DNA-binding domain of MafG contains the conserved Maf extended homology region (EHR), which is not present in other bZIP proteins. The EHR together with the basic region is responsible for the DNA-binding specificity of Mafs. Skn-1 is a transcription factor that specifies mesodermal development in C. elegans. Skn-1 and MafG share a conserved DNA-binding motif; however, Skn-1 lacks the leucine zipper dimerisation domain that is found in all bZIP proteins. Skn-1 acts as a monomer. The DNA-binding domains in MafG and Skn-1 share structural similarity, despite a sequence identity of only 25%. The domain fold consists of three (MafG) to four (Skn-1) helices, where the long C-terminal helix protrudes from the domain and binds to DNA. MafG lacks the N-terminal helix of Skn-1. A basic cluster of residues is present on the surface of the domain, which together with the amino acid sequence motif NXXYAXXCR forms a DNA-binding surface. MafG and Skn-1 may use a common DNA-binding mode. However, the involvement of helix 2 (H2) in DNA recognition differs between MafG and Skn-1, with two residues at the beginning of H2 in MafG contributing to the unique DNA-binding specificity of Mafs.
Protein Domain
Name: Gp5/Type VI secretion system Vgr protein, OB-fold domain
Type: Domain
Description: This domain occurs in a family of phage (and bacteriocin) proteins related to the phage P2 V gene product, which forms the small spike at the tip of the tail. Homologs in general are annotated as baseplate assembly protein V. At least one member is encoded within a region of Pectobacterium carotovorum (Erwinia carotovora) described as a bacteriocin, a phage tail-derived module able to kill bacteria closely related to the host strain. It is also found in Vgr-related proteins. Genes encoding type VI secretion systems (T6SS) are widely distributed in pathogenic Gram-negative bacterial species. In Vibrio cholerae, T6SS have been found to secrete three related proteins extracellularly: VgrG-1, VgrG-2, and VgrG-3. VgrG-1 can covalently cross-link actin in vitro, and this activity was used to demonstrate that V. cholerae can translocate VgrG-1 into macrophages by a T6SS-dependent mechanism. VgrG-related proteins likely assemble into a trimeric complex that is analogous to that formed by the two trimeric proteins gp27 and gp5 that make up the baseplate "tail spike" of Escherichia coli bacteriophage T4. The VgrG components of the T6SS apparatus might assemble a "cell-puncturing device" analogous to phage tail spikes to deliver effector protein domains through membranes of target host cells. Gp5 is an integral component of the virion baseplate of bacteriophage T4. T4 Gp5 consists of 3 domains connected via long linkers: the N-terminal oligosaccharide/oligonucleotide-binding (OB)-fold domain, the middle lysozyme domain, and the C-terminal triple-stranded β-helix. The equivalent of the Gp5 OB-fold domain in the structure of VgrG is the domain of unknown function comprising residues 380-470 and conserved in all known VgrGs. This entry represents the OB-fold domain, which consists of a 5-stranded antiparallel β-barrel with a Greek-key topology.
Protein Domain
Name: TetR transcriptional regulator CgmR-like, C-terminal domain
Type: Domain
Description: TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or trigger the cellular response itself. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain. This entry represents the C-terminal domain present in CgmR (C. glutamicum multidrug-responsive transcriptional repressor), previously called CGL2612 protein. CgmR (CGL2612) from Corynebacterium glutamicum is a multidrug-resistance-related transcription factor belonging to the TetR family. It regulates expression of the immediately upstream gene cgmA (cgl2611) by binding to the operator cgmO in the cgmA promoter. The cgmA gene encodes a permease belonging to the major facilitator superfamily, a protein family that includes bacterial multidrug exporters, and the pair of CgmR and CgmA confers multidrug resistance on C. glutamicum.
Protein Domain
Name: Dihydroorotase
Type: Family
Description: Dihydroorotase (DHOase) belongs to MEROPS peptidase family M38 (clan MJ), and includes proteins classified as non-peptidase homologues. DHOase catalyses the third step in the de novo biosynthesis of pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity. In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene pyrC). In higher eukaryotes, DHOase is part of a large multi-functional protein, known as 'rudimentary' in Drosophila melanogaster and CAD in mammals, which catalyzes the first three steps of pyrimidine biosynthesis. The DHOase domain is located in the central part of this polyprotein. In yeasts, DHOase is a monofunctional protein (gene URA4). However, a defective DHOase domain is found in a multifunctional protein (gene URA2) that catalyzes the first two steps of pyrimidine biosynthesis. The comparison of DHOase sequences from various sources shows that there are two highly conserved regions. The first, located in the N-terminal extremity, contains two histidine residues suggested to be involved in binding the zinc ion. The second is found in the C-terminal part. Members of this family of proteins are predicted to adopt a TIM barrel fold. Dihydroorotase of the 'multifunctional complex type', in contrast to the homodimeric type of dihydroorotase found in Escherichia coli, tends to appear in a large multifunctional complex with aspartate transcarbamoylase. Homologous domains appear in multifunctional proteins of higher eukaryotes. In some species, including Pseudomonas putida and Pseudomonas aeruginosa, this protein is inactive but is required as a non-catalytic subunit of aspartate transcarbamoylase (ATCase). In these species, a second, active dihydroorotase is also present.
Protein Domain
Name: Protein-tyrosine phosphatase, low molecular weight, mammalian
Type: Family
Description: Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies: (1) pTyr-specific phosphatases; (2) dual specificity phosphatases (dTyr and dSer/dThr); (3) Cdc25 phosphatases (dTyr and/or dThr); (4) LMW (low molecular weight) phosphatases. Based on their cellular localisation, PTPases are also classified as receptor-like, which are transmembrane receptors that contain PTPase domains, or non-receptor (intracellular) PTPases. All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits. This entry represents mammalian low molecular weight (LMW) phosphotyrosine protein phosphatase (or acid phosphatase), which acts on tyrosine phosphorylated proteins, low-MW aryl phosphates and natural and synthetic acyl phosphates. The structure of a LMW PTPase has been solved by NMR. It belongs to the alpha/beta class, with 6 α-helices and 4 β-strands forming a 3-layer α-β-α sandwich architecture.
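The C(X)5R signature motif mentioned in this description (a cysteine, any five residues, then an arginine) can be scanned for with a regular expression. The following is a minimal sketch, not part of the database entry: the function name and the toy sequence are hypothetical, and a match on this short pattern alone does not establish that a protein is a PTPase.

```python
import re

# C(X)5R: cysteine, any five residues, then arginine.
PTP_SIGNATURE = re.compile(r"C[A-Z]{5}R")


def find_ptp_signature(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping C(X)5R match."""
    return [(m.start(), m.group()) for m in PTP_SIGNATURE.finditer(sequence.upper())]


# Hypothetical toy sequence for illustration only, not a real phosphatase.
toy_sequence = "MAVHCSAGVGRSGT"
print(find_ptp_signature(toy_sequence))
```

Because the pattern is short, hits in unrelated proteins are expected by chance, so a scan like this is best treated as a first filter before domain-level annotation.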
Protein Domain
Name: CTF transcription factor/nuclear factor 1, N-terminal
Type: Domain
Description: Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) proteins (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5'-TGGCANNNTGCCA-3'. This family was first described for its role in stimulating the initiation of adenovirus DNA replication. In vertebrates there are four members, NFIA, NFIB, NFIC and NFIX, and an orthologue from Caenorhabditis elegans has been described, called Nuclear factor I family protein (NFI-I). The CTF/NF-I proteins are individually capable of activating transcription and DNA replication, and thus function in regulating cell proliferation and differentiation. They are involved in normal development and have been associated with developmental abnormalities and cancer in humans. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors, and it interacts with basal transcription factors and with histone H3. This entry represents the N terminus, of which 200 residues contain the DNA-binding and dimerisation domain, but which also has an 8-47 residue highly conserved region 5' of this, whose function is not known. Deletion of the N-terminal 200 amino acids removes the DNA-binding activity, the dimerisation ability and the stimulation of adenovirus DNA replication.
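The binding site 5'-TGGCANNNTGCCA-3' quoted in this description is palindromic in the DNA sense: its two half-sites, TGGCA and TGCCA, are reverse complements of each other, which fits binding by a dimer. The sketch below checks that property and scans a sequence for full sites. The function names and the toy sequence are hypothetical, and only exact matches on the given strand are reported.

```python
import re

# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# TGGCA, any three bases (NNN), then TGCCA.
NFI_SITE = re.compile(r"TGGCA[ACGT]{3}TGCCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string of A, C, G and T."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_nfi_sites(dna: str) -> list[int]:
    """Return 0-based start positions of exact TGGCANNNTGCCA matches."""
    return [m.start() for m in NFI_SITE.finditer(dna.upper())]


# The two half-sites are reverse complements of each other.
print(reverse_complement("TGGCA") == "TGCCA")

# Hypothetical toy sequence for illustration only.
print(find_nfi_sites("AATGGCAACGTGCCATT"))
```

Because the site is its own reverse complement apart from the three-base spacer, a full-site match on one strand implies one at the same location on the opposite strand.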
Protein Domain
Name: Transcription intermediary factor 1-beta, PHD domain
Type: Domain
Description: TIF1-beta, also known as Kruppel-associated Box (KRAB)-associated protein 1 (KAP-1), belongs to the C-VI subclass of the TRIM (tripartite motif) family of proteins, which are defined by their N-terminal RBCC (RING, Bbox, and coiled coil) domains, including three consecutive zinc-binding domains (a C3HC4-type RING-HC finger, Bbox1 and Bbox2) and a coiled coil region, as well as a plant homeodomain (PHD) and a bromodomain (Bromo) positioned C-terminal to the RBCC domain. It acts as a nuclear co-repressor that plays a role in transcription and in the DNA damage response. Upon DNA damage, the phosphorylation of KAP-1 on serine 824 by the ataxia telangiectasia-mutated (ATM) kinase enhances cell survival and facilitates chromatin relaxation and heterochromatic DNA repair. It also regulates CHD3 nucleosome remodelling during the DNA double-strand break (DSB) response. Meanwhile, KAP-1 can be dephosphorylated by protein phosphatase PP4C in the DNA damage response. Moreover, KAP-1 is a co-activator of the orphan nuclear receptor NGFI-B (or Nur77) and is involved in NGFI-B-dependent transcription. It is also a coiled-coil binding partner, substrate and activator of the c-Fes protein tyrosine kinase. The N-terminal RBCC domains of TIF1-beta are responsible for the interaction with KRAB zinc finger proteins (KRAB-ZFPs), MDM2, MM1 and C/EBPbeta, and for the regulation of homo- and heterodimerization. The C-terminal PHD/Bromo domains are involved in interacting with SETDB1, Mi-2alpha and other proteins to form complexes with histone deacetylase or methyltransferase activity. This entry represents the PHD zinc finger from TIF1-beta.
Protein Domain
Name: rRNA small subunit methyltransferase B
Type: Family
Description: RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota; most are predicted to be nuclear or nucleolar proteins. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB) FMU (FirMicUtes) represents the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA. A classification of RCMTs has been proposed on the basis of sequence similarity. According to this classification, RCMTs are divided into 8 distinct subfamilies. Recently, a new RCMT subfamily, termed RCMT9, was identified. Members of the RCMT family contain a core domain responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence. The rRNA small subunit methyltransferase B (RsmB) protein, often referred to as Fmu, has been demonstrated to methylate only C967 of the 16S ribosomal RNA and to produce only m5C at that position. The structure of the E. coli protein has been determined.
It contains three subdomains which share structural homology with DNA m5C methyltransferases and two RNA binding protein families. The N-terminal sequence shares homology with other (noncatalytic) RNA binding proteins, e.g. the ribosomal RNA antiterminator protein NusB. The catalytic lobe of the N1 domain comprises the conserved core identified in all of the putative RNA m5C MTase sequences. Although the N1 domain is structurally homologous to known RNA binding proteins, there is no clear sequence motif that defines its role in RNA binding and recognition. At the functional centre of the catalytic lobe is the MTase domain of Fmu (residues 232-429), which adopts a fold typical of known AdoMet-dependent methyltransferases. In spite of the lack of a conserved RNA binding motif in the N1 domain, the close association of the N1 and MTase domains suggests that any RNA bound in the active site of the MTase domain is likely to interact with the N1 domain.
Protein Domain
Name: FAM175 family
Type: Family
Description: Members of protein family FAM175 include the BRCA1-A complex subunit Abraxas 1, BRISC complex subunit Abraxas 2 or Abro1 (Abraxas brother protein 1), and uncharacterised plant proteins. It is thought that BRCA1-A complex subunit Abraxas acts as a central scaffold protein responsible for assembling the various components of the BRCA1-A complex, and mediates recruitment of BRCA1. Similarly, Abro1 probably acts as a scaffold facilitating assembly of the various components of BRISC; the protein does not interact with BRCA1, but binds polyubiquitin. The primary sequences of these proteins contain an MPN-like domain.
Protein Domain
Name: Ammonium transporter, conserved site
Type: Conserved_site
Description: A number of evolutionarily related proteins have been found to be involved in the transport of ammonium ions across membranes. Members of this family include: Saccharomyces cerevisiae ammonium transporters MEP1, MEP2 and MEP3; Arabidopsis thaliana high affinity ammonium transporter (gene AMT1); Corynebacterium glutamicum ammonium and methylammonium transport system; Escherichia coli putative ammonium transporter amtB; Bacillus subtilis nrgA; Mycobacterium tuberculosis hypothetical protein MtCY338.09c; Synechocystis sp. (strain PCC 6803) hypothetical proteins sll0108, sll0537 and sll1017; Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical proteins MJ0058 and MJ1343; Caenorhabditis elegans hypothetical proteins C05E11.4, F49E11.3 and M195.3. As expected from their transport function, these proteins are highly hydrophobic and seem to contain from 10 to 12 transmembrane domains.
Protein Domain
Name: PEP-CTERM system TPR-repeat lipoprotein, putative
Type: Family
Description: This protein family occurs strictly within a subset of Gram-negative bacterial species with the proposed PEP-CTERM/exosortase system, analogous to the LPXTG/sortase system common in Gram-positive bacteria. The proteins in this entry occur in a species if, and only if, a transmembrane histidine kinase and a DNA-binding response regulator also occur. The presence of tetratricopeptide repeats (TPR) suggests they may be involved in protein-protein interaction, possibly for the regulation of PEP-CTERM protein expression, since many PEP-CTERM proteins in these genomes are preceded by a proposed DNA binding site for the response regulator.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom