Search results 12101 to 12200 out of 30763 for seed protein

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: MAD homology, MH1
Type: Domain
Description: Smad proteins are signal transducers and transcriptional comodulators of the TGF-beta superfamily of ligands, which play a central role in regulating a broad range of cellular responses, including cell growth, differentiation, and specification of developmental fate, in diverse organisms from Caenorhabditis elegans to humans. Ligand binding to specific transmembrane receptor kinases induces receptor oligomerisation and phosphorylation of the receptor-specific Smad protein (R-Smad) in the cytoplasm. The R-Smad proteins regulate distinct signalling pathways. Smad1, 5 and 8 mediate the signals of bone morphogenetic proteins (BMPs), while Smad2 and 3 mediate the signals of activins and TGF-betas. Upon ligand stimulation, R-Smad proteins are phosphorylated at the conserved C-terminal tail sequence, SS*xS* (where S* denotes a site of phosphorylation). Phosphorylated R-Smad proteins form heteromeric complexes with Smad4 and are translocated into the nucleus. In the nucleus, the heteromeric complexes function as gene-specific transcription activators by binding to promoters and interacting with transcriptional coactivators. Smad6 and Smad7 are inhibitory Smad proteins that inhibit TGF-beta signalling by interfering with either receptor-mediated phosphorylation or hetero-oligomerisation between Smad4 and R-Smad proteins. Smad proteins comprise two conserved MAD homology domains, one in the N terminus (MH1) and one in the C terminus (MH2), separated by a more variable, proline-rich linker region. The MH1 domain has a role in DNA binding and negatively regulates the functions of the MH2 domain, whereas the MH2 domain is responsible for transactivation and mediates phosphorylation-triggered heteromeric assembly between Smad4 and R-Smad. The MH1 domain adopts a compact globular fold, with four alpha helices, six short beta strands, and five loops.
The N-terminal half of the sequence consists of three alpha helices, and the C-terminal half contains all six beta strands, which form two small beta sheets and one beta hairpin. The fourth alpha helix is located in the hydrophobic core of the molecule, surrounded by the N-terminal three alpha helices on one side and by the two small beta sheets and the beta hairpin on the other side. These secondary structural elements are connected with five intervening surface loops. The MH1 domain employs a novel DNA-binding motif, an 11-residue β-hairpin formed by strands B2 and B3, to contact DNA in the major groove. Two residues in the L3 loop and immediately preceding strand B2 also contribute significantly to DNA recognition. The beta hairpin appears to protrude outward from the globular MH1 core.
Protein Domain
Name: Lamin tail domain
Type: Domain
Description: Intermediate filaments (IFs) constitute a major structural element of metazoan cells. They build two distinct systems: one inside the nucleus attached to the inner membrane, and one that is cytoplasmic, which connects intercellular junctional complexes situated at the plasma membrane with the outer nuclear membrane. In both cases, their major function is assumed to be that of a mechanical stress absorber and an integrating device for the entire cytoskeleton. In the nucleus, the IF system is assembled from lamins, which together with an ever increasing number of associated transmembrane and chromatin-binding proteins constitute the nuclear lamina. Despite the large diversity among IF proteins, they all share a similar structural building plan, with a long central α-helical 'rod' domain that is flanked by non-α-helical N- and C-terminal end domains called 'head' and 'tail', respectively. Lamins exhibit a highly conserved globular C-terminal lamin-tail domain (LTD) which has the immunoglobulin (Ig) fold. Invertebrate cytoplasmic IFs share sequence similarity with nuclear lamins and also contain a C-terminal tail domain with homology to the LTD. Domains homologous to the LTD have been detected in several uncharacterised proteins from phylogenetically diverse bacteria and two archaea, Methanosarcina and Halobacterium. In several bacterial proteins, the LTD co-occurs with membrane-associated hydrolases of the metallo-beta-lactamase, synaptojanin, and calcineurin-like phosphoesterase superfamilies. In other secreted or periplasmic bacterial proteins, the LTDs are associated with oligosaccharide-binding domains or are present as multiple tandem repeats in a single protein. These associations suggest a potential role for the prokaryotic LTDs in tethering proteins to the membrane or membrane-associated structures. In contrast to the bacterial homologs, all animal LTDs are closely related and are contained in proteins with a stereotypic architecture.
The precursor of the animal LTD might have been acquired via horizontal gene transfer from bacteria relatively late in the evolution of the eukaryotic crown group. Subsequent to this acquisition, a coiled-coil domain, derived from preexisting intermediate filament coiled coils, might have been fused to the N terminus of the LTD. The LTD could be involved in both protein and DNA binding. The LTD adopts an Ig-like fold of type s. It consists of a 2-layered sandwich of 9 anti-parallel β-strands arranged in two β-sheets with a Greek key topology. One of the sheets has five β-strands while the other has four. Seven of the 9 strands are present in the classical Ig fold topology.
Protein Domain
Name: CRISPR-associated endoribonuclease Cas6/Csy4, subtype I-F/YPEST superfamily
Type: Homologous_superfamily
Description: The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This protein family, typified by YPO2462 of Yersinia pestis, is a CRISPR-associated (Cas) family strictly associated with the Ypest subtype of CRISPR/Cas locus.
It is designated Csy4, for CRISPR/Cas Subtype Ypest protein 4. In Pseudomonas aeruginosa, crRNA biogenesis requires the endoribonuclease Csy4, which binds and cleaves the repetitive sequence of the CRISPR transcript. The crystal structure of Csy4 in complex with its cognate RNA reveals an unexpected recognition mechanism whereby Csy4 makes sequence-specific interactions in the major groove of the CRISPR repeat stem-loop. Together with electrostatic contacts to the phosphate backbone, these enable Csy4 to selectively bind and cleave pre-crRNAs.
Protein Domain
Name: Two partner secretion pathway transporter
Type: Family
Description: Proteins in this family are transporter proteins that are members or probable members of the two partner secretion pathway (TPS). In Gram-negative bacteria, TPS is used for the secretion of large, mostly virulence-related proteins. In this family, filamentous hemagglutinin transporter protein FhaC (TpsB transporter) from Bordetella pertussis mediates the secretion of the major adhesin filamentous hemagglutinin (FHA), while outer membrane transporter protein IbpB (TpsB transporter) from Haemophilus somnus mediates the secretion of protein IbpA (high molecular weight immunoglobulin-binding protein).
Protein Domain
Name: Band 3 cytoplasmic domain
Type: Domain
Description: This entry contains the cytoplasmic domain of the Band 3 anion exchange proteins that exchange Cl-/HCO3-. Band 3 constitutes the most abundant polypeptide in the red blood cell membrane, comprising 25% of the total membrane protein. The cytoplasmic domain of band 3 (cdb3) functions primarily as an anchoring site for other membrane-associated proteins. Included among the protein ligands of cdb3 are ankyrin, protein 4.2, protein 4.1, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphofructokinase, aldolase, hemoglobin, hemichromes, and the protein tyrosine kinase (p72syk).
Protein Domain
Name: Bunyavirus glycoprotein G1
Type: Domain
Description: Bunyavirus has three genomic segments: small (S), middle-sized (M), and large (L). The S segment encodes the nucleocapsid and a non-structural protein. The M segment codes for two glycoproteins, G1 (also known as Gc) and G2 (also known as Gn), and another non-structural protein (NSm). The L segment codes for an RNA polymerase. This entry represents the polyprotein region forming the G1 glycoprotein, which is the viral attachment protein. It interacts with the G2 polyprotein.
Protein Domain
Name: Oligosaccharyl transferase complex, subunit OST3/OST6
Type: Family
Description: During N-linked glycosylation of proteins, oligosaccharide chains are assembled on the carrier molecule dolichyl pyrophosphate in the following order: 2 molecules of N-acetylglucosamine (GlcNAc), 9 molecules of mannose, and 3 molecules of glucose. These 14-residue oligosaccharide cores are then transferred to asparagine residues on nascent polypeptide chains in the endoplasmic reticulum (ER). As proteins progress through the Golgi apparatus, the oligosaccharide cores are modified by trimming and extension to generate a diverse array of glycosylated proteins. The oligosaccharyl transferase complex (OST complex) transfers 14-sugar branched oligosaccharides from dolichyl pyrophosphate to asparagine residues. The complex contains nine protein subunits: Ost1p, Ost2p, Ost3p, Ost4p, Ost5p, Ost6p, Stt3p, Swp1p, and Wbp1p, all of which are integral membrane proteins of the ER. The OST complex interacts with the Sec61p pore complex involved in protein import into the ER. This entry represents subunits OST3 and OST6. OST3 is homologous to OST6, and several lines of evidence indicate that they are alternative members of the OST complex. Disruption of both OST3 and OST6 causes severe underglycosylation of soluble and membrane-bound glycoproteins and a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation. This entry also includes the magnesium transporter protein 1, also known as OST3 homologue B, which might be involved in N-glycosylation through its association with the oligosaccharyl transferase (OST) complex.
Protein Domain
Name: AH/BAR domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain which consists of three α-helices, including the arfaptin homology (AH) domain and the BAR (Bin-Amphiphysin-Rvs) domain. The arfaptin homology (AH) domain is a protein domain found in a range of proteins, including arfaptins, protein kinase C-binding protein PICK1 and mammalian 69kDa islet cell autoantigen (ICA69). The AH domain of arfaptin has been shown to dimerise and to bind Arf and Rho family GTPases, including ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The AH domain consists of three α-helices arranged as an extended antiparallel α-helical bundle. Two arfaptin AH domains associate to form a highly elongated, crescent-shaped dimer. Members of the Amphiphysin protein family are key regulators in the early steps of endocytosis. They are involved in the formation of clathrin-coated vesicles by promoting the assembly of a protein complex at the plasma membrane and directly assisting in the induction of the high curvature of the membrane at the neck of the vesicle. Amphiphysins contain a characteristic domain, known as the BAR (Bin-Amphiphysin-Rvs) domain, which is required for their in vivo function and their ability to tubulate membranes. The crystal structure of these proteins suggests the domain forms a crescent-shaped dimer of a three-helix coiled coil with a characteristic set of conserved hydrophobic, aromatic and hydrophilic amino acids. Proteins containing this domain have been shown to homodimerise, heterodimerise or, in a few cases, interact with small GTPases.
Protein Domain
Name: PPE superfamily
Type: Homologous_superfamily
Description: The human pathogen Mycobacterium tuberculosis harbours a large number of genes that encode proteins whose N-termini contain the characteristic motifs Pro-Glu (PE) or Pro-Pro-Glu (PPE). A subgroup of the PE proteins contains polymorphic GC-rich sequences (PGRS), while a subgroup of the PPE proteins contains major polymorphic tandem repeats (MPTR). The function of most of these proteins remains unknown. However, the PE_PGRS proteins from Mycobacterium marinum are secreted by components of the ESX-5 system that belongs to the recently defined type VII secretion systems. It has also been reported that the PE_PGRS family of proteins contains multiple calcium-binding and glycine-rich sequence motifs GGXGXD/NXUX. This sequence repeat constitutes a calcium-binding parallel β-roll or parallel β-helix structure and is found in RTX toxins secreted by many Gram-negative bacteria. This mycobacterial superfamily is named after a conserved amino-terminal region of about 180 amino acids, the PPE motif. The carboxy termini of proteins belonging to the PPE family are variable, and on the basis of this region at least three groups can be distinguished. The MPTR subgroup is characterised by tandem copies of the motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350. The third group shares similarity only in the amino-terminal region. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis.
Protein Domain
Name: Alpha-synuclein
Type: Family
Description: Synucleins are small, soluble proteins expressed primarily in neural tissue and in certain tumours. The family includes three known proteins: alpha-synuclein, beta-synuclein, and gamma-synuclein. All synucleins have in common a highly conserved α-helical lipid-binding motif with similarity to the class-A2 lipid-binding domains of the exchangeable apolipoproteins. Synuclein family members are not found outside vertebrates, although they have some conserved structural similarity with plant 'late-embryo-abundant' proteins. The alpha- and beta-synuclein proteins are found primarily in brain tissue, where they are seen mainly in presynaptic terminals. The gamma-synuclein protein is found primarily in the peripheral nervous system and retina, but its expression in breast tumors is a marker for tumor progression. Normal cellular functions have not been determined for any of the synuclein proteins, although some data suggest a role in the regulation of membrane stability and/or turnover. Mutations in alpha-synuclein are associated with rare familial cases of early-onset Parkinson's disease, and the protein accumulates abnormally in Parkinson's disease, Alzheimer's disease, and several other neurodegenerative illnesses. This entry represents alpha-synuclein, which regulates synaptic vesicle trafficking and subsequent neurotransmitter release. It also acts as a molecular chaperone in its multimeric membrane-bound state, assisting in the folding of SNAREs. Synelfin from the zebra finch is a homologue of the human alpha-synuclein and may serve a novel function critical to the regulation of vertebrate neural plasticity. It is regulated during the critical period for song learning.
Protein Domain
Name: 3a-like viroporin, cytosolic domain, alpha/betacoronavirus
Type: Domain
Description: All coronaviruses have a similar genomic structure comprising two large open reading frames (ORFs) (ORF1a and ORF1b) encoding the coronavirus replicase. At the 3' end, the genome encodes four structural proteins (S, E, M and N) and a variable number of accessory proteins. Accessory proteins play an important role in virus-host interactions, especially in antagonizing or regulating host immunity and virus adaptation to the host. There are large variations in the number of accessory proteins (1-10) among coronaviruses. Betacoronaviruses (bCoVs) have 3-5 accessory proteins, except for SARS-CoV and SARS-CoV-2, which possess the largest number of accessory proteins among all coronaviruses (10 and 9, respectively). 3a-like accessory proteins are found in multiple alpha- and betacoronavirus lineages that infect bats and humans. They are transmembrane proteins of the viroporin family that form ion channels in the host membrane and have been implicated in inducing apoptosis, pathogenicity, and virus release. The induction of cytokine storms in COVID-19 patients might be linked to ORF3a-mediated activation of the inflammasome. 3a-like viroporins contain a transmembrane domain (TM) and a cytosolic domain (CD). This is the cytosolic domain (CD) of 3a-like viroporins, which consists of two antiparallel β-sheets forming a β-sandwich. The 3a-like viroporin forms a dimer and the six transmembrane helices of the dimer form an ion channel with polar/charged residues in the interior of the channel capable of conducting cations.
Protein Domain
Name: SH3-like domain, bacterial-type
Type: Domain
Description: SH3 (src Homology-3) domains are small protein modules containing approximately 50 amino acid residues. They are found in a great variety of intracellular or membrane-associated proteins; for example, in a variety of proteins with enzymatic activity and in adaptor proteins, such as fodrin and yeast actin binding protein ABP-1. The SH3 domain has a characteristic fold which consists of five or six β-strands arranged as two tightly packed anti-parallel β-sheets. The linker regions may contain short helices. The surface of the SH3 domain bears a flat, hydrophobic ligand-binding pocket which consists of three shallow grooves defined by conserved aromatic residues in which the ligand adopts an extended left-handed helical arrangement. The ligand binds with low affinity but this may be enhanced by multiple interactions. The region bound by the SH3 domain is in all cases proline-rich and contains PXXP as a core-conserved binding motif. The function of the SH3 domain is not well understood, but SH3 domains may mediate many diverse processes such as increasing the local concentration of proteins, altering their subcellular location and mediating the assembly of large multiprotein complexes. SH3 domains are widespread among metazoan intracellular signalling proteins and typically bind proline-rich polypeptides. This SH3 domain is a prokaryotic homologue. It might have two possible functions: (1) promoting survival of a pathogen within the invaded cell by modulating pathways controlled by SH3 domains; or (2) promoting invasion by binding to receptors on eukaryotic cells.
Protein Domain
Name: Laminin G domain
Type: Domain
Description: Laminins are large heterotrimeric glycoproteins involved in basement membrane function. The Laminin G or LNS domain (for Laminin-alpha, Neurexin and Sex hormone-binding globulin) is an around 180 amino acid long domain found in a large and diverse set of extracellular proteins. The laminin globular (G) domain can be found in one to several copies in various laminin family members, including a large number of extracellular proteins. The C terminus of the laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation. The structure of the laminin-G domain has been predicted to resemble that of pentraxin. Laminin G domains can vary in their function, and a variety of binding functions have been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. Proteins with laminin-G domains include: laminin; merosin; agrin; neurexins; vitamin K-dependent protein S; sex steroid binding protein SBP/SHBG; Drosophila proteins Slit, Crumbs and Fat; and several proteoglycan precursors.
Protein Domain
Name: Pilin-like
Type: Homologous_superfamily
Description: This superfamily includes Pilin and related proteins, such as general secretion pathway protein G (GSPG) and autotransporter adhesin YadA-like proteins. Pilin is a component of the type IV pilus (T4P), a polar flexible filament, which consists of a single polypeptide chain arranged in a helical configuration of five subunits per turn, and which is involved in cell adhesion, microcolony formation, twitching motility and transformation. Gram-negative bacteria produce pilin of the NMePhe type, which is characterised by the presence of a very short leader peptide of 6 to 7 residues, followed by a methylated N-terminal phenylalanine residue and by a highly conserved sequence of about 24 hydrophobic residues. GSPG shares several sequence similarities with bacterial fimbrial protein, or pilin, the major structural protein of pili. Pili are polar flexible filamentous adhesions ~2500 nm in length, and diameter ~5.4 nm. Fimbrial and GSPG proteins share the following characteristics: a methylated, hydrophobic N-terminal residue; a hydrophobic leader peptide of 5-10 residues, terminating with glycine; glutamate as the fifth residue of the mature sequence; and a highly hydrophobic N terminus. The general secretion pathway is homologous to the type IV pilus biogenesis system and includes different proteins, termed pseudopilins, which are structurally homologous to the type IV pilins. Autotransporter adhesin YadA-like proteins are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane.
Protein Domain
Name: Intein
Type: Domain
Description: Inteins, or protein introns, are parts of protein sequences that are post-translationally excised, their flanking regions (exteins) being spliced together to yield an additional protein product. This process is believed to be self-catalysed, apparently initiating at the C-terminal splice junction, where a conserved asparagine residue mediates the nucleophilic attack of the peptide bond between it and its neighbouring residue. Most inteins consist of two domains: one is involved in autocatalytic splicing, and the other is an endonuclease that is important in the spread of inteins. Inteins are between 134 and 608 amino acids long, and they are found in members of all three domains of life: eukaryotes, bacteria, and archaea, although most frequently in archaea. Inteins are found in proteins with diverse functions, including metabolic enzymes, DNA and RNA polymerases, proteases, ribonucleotide reductases, and the vacuolar-type ATPase. However, enzymes involved in DNA replication and repair appear to dominate. Inteins are found in conserved regions of conserved proteins and can be regarded as parasitic genetic elements. Inteins are difficult to identify from sequence data because they lie in the same reading frame as the spliced protein and they are characterised by only a few short conserved motifs: two of these are similar to the nonapeptide LAGLIDADG, which is diagnostic of certain homing endonucleases (mutation of one such motif causes loss of endonuclease activity, but not of the protein splicing function); another includes the C' splice site, mutations in which disable protein function.
Protein Domain
Name: Cell cycle, FtsW / RodA / SpoVE, conserved site
Type: Conserved_site
Description: A number of prokaryotic integral membrane proteins involved in cell cycle processes have been found to be structurally related. These proteins include the Escherichia coli and related bacteria cell division protein ftsW and the rod shape-determining protein rodA (or mrdB), the Bacillus subtilis stage V sporulation protein E (spoVE), the B. subtilis hypothetical proteins ywcF and ylaO and the Cyanophora paradoxa cyanelle ftsW homologue. FtsW is an integral membrane protein with ten transmembrane segments. In general, it is one of two paralogues involved in peptidoglycan biosynthesis, the other being RodA, and is essential for cell division. In endospore-forming bacteria (e.g. Bacillus subtilis) three or more RodA/FtsW/SpoVE family paralogues are present. SpoVE acts in spore cortex formation and is dispensable for growth. FtsW belongs to the so-called SEDS (shape, elongation, division and sporulation) family, whose members are thought to be peptidoglycan polymerases. Treponema pallidum RodA and Escherichia coli MrdB are probable peptidoglycan polymerases that are essential for cell wall elongation. RodA is a member of the FtsW/RodA/SpoVE family. It is found only in species with rod (or spiral) shapes. RodA is required for the maintenance of the rod cell shape and is essential for the elongation of the lateral wall of the cell. In Escherichia coli, the rodA and pbpA genes occur in the same operon, and RodA is required for the expression of the enzymatic activity of the penicillin-binding protein 2 (PbpA).
Protein Domain
Name: Peptidase U32
Type: Family
Description: This is a group of peptidases belonging to MEROPS peptidase family U32 (clan U-). They are classified as collagenases as they are present in bacterial collagenases, involved in bacterial infection. For example, Porphyromonas gingivalis (Bacteroides gingivalis) PrtC is an enzyme that degrades type I collagen and that seems to require a metal cofactor, and Helicobacter pylori HP0169, another peptidase U32 protein, is required for colonisation of the mouse gut. Novel functions of peptidase U32 proteins have been found, suggesting that these peptidases are involved in diverse cellular processes. The conservation of a CX6CX15CX3/4C motif in several U32 proteases indicates that these proteins bind [Fe-S] clusters. RlhA, a member of the U32 protease family involved in the C-hydroxylation of a cytidine on E. coli 23S rRNA, is an Fe-S protein related to iron metabolism. TrhP (tRNA hydroxylation protein P), a member of this family, is involved in prephenate-dependent formation of the 5-hydroxyuridine (ho5U) modification at position 34 in tRNAs, the first step in 5-methoxyuridine (mo5U) biosynthesis, which is important for tRNA hydroxylation ensuring efficient decoding during protein synthesis. UbiU-UbiV are involved in ubiquinone biosynthesis, in which the 4Fe-4S clusters bound to each protein have a role in electron transfer chains in hydroxylation reactions. The peptidase families associated with clan U- have an unknown catalytic mechanism as the protein fold of the active site domain and the active site residues have not been reported.
Protein Domain
Name: Sigma1/sigma2, reoviral
Type: Family
Description: Reoviruses are double-stranded RNA viruses that lack a membrane envelope. Their capsid is organised in two concentric icosahedral layers: an inner core and an outer capsid layer. The sigma1 protein is found in the outer capsid, and the sigma2 protein is found in the core. There are four other kinds of protein (besides sigma2) in the core, termed lambda 1-3 and mu2. Interactions between sigma2 and lambda 1 and lambda 3 are thought to initiate core formation, followed by mu2 and lambda2. Sigma1 is a trimeric protein, and is positioned at the 12 vertices of the icosahedral outer capsid layer. Its N-terminal fibrous tail, arranged as a triple coiled coil, anchors it in the virion, and a C-terminal globular head interacts with the cellular receptor. These two parts form by separate trimerization events. The N-terminal fibrous tail forms on the polysome, without the involvement of ATP or chaperones. The post-translational assembly of the C-terminal globular head involves the chaperone activity of Hsp90, which is associated with phosphorylation of Hsp90 during the process. Sigma1 protein acts as a cell attachment protein, and determines viral virulence, pathways of spread, and tropism. Junctional adhesion molecule has been identified as a receptor for sigma1. In type 3 reoviruses, a small region, predicted to form a beta sheet, in the N-terminal tail was found to bind target cell surface sialic acid (i.e. sialic acid acts as a co-receptor) and promote apoptosis. The sigma1 protein also binds to the lambda2 core protein.
Protein Domain
Name: Leupaxin/Paxillin/TGFB1I1
Type: Family
Description: This entry includes the transforming growth factor beta-1-induced transcript 1 (TGFB1I1, also known as Hic-5) protein, paxillin and leupaxin. Hic-5 functions as a molecular adapter coordinating multiple protein-protein interactions at the focal adhesion complex and in the nucleus. Leupaxin is a transcriptional coactivator for androgen receptor (AR) and serum response factor (SRF). Paxillin is a cytoskeletal protein involved in actin-membrane attachment at sites of cell adhesion to the extracellular matrix (focal adhesion). Extensive tyrosine phosphorylation occurs during integrin-mediated cell adhesion, embryonic development, fibroblast transformation and following stimulation of cells by mitogens that operate through the 7TM family of G-protein-coupled receptors. Paxillin binds in vitro to the focal adhesion protein vinculin, as well as to the SH3 domain of c-Src, and, when tyrosine phosphorylated, to the SH2 domain of v-Crk. An N-terminal region has been identified that supports the binding of both vinculin and the focal adhesion tyrosine kinase, pp125Fak. Paxillin is a 68kDa protein containing multiple domains, including four tandem C-terminal LIM domains (each of which binds 2 zinc ions); an N-terminal proline-rich domain, which contains a consensus SH3 binding site; and three potential Crk-SH2 binding sites. The predicted structure of paxillin suggests that it is a unique cytoskeletal protein capable of interaction with a variety of intracellular signalling and structural molecules important in growth control and the regulation of cytoskeletal organisation.
Protein Domain
Name: Acin1, RNPS1-SAP18 binding (RSB) motif
Type: Conserved_site
Description: The RSB motif on the Acinus (also known as Acin1) protein is the core around which the ASAP complex is built. The apoptosis and splicing-associated protein complex, ASAP, is made up of three proteins: SAP18 (Sin3-associated protein of 18 kDa), RNA-binding protein S1 (RNPS1) and apoptotic chromatin inducer in the nucleus (Acinus). The ASAP complex appears to be an assembly of proteins at the interface between transcription, splicing and nonsense-mediated mRNA decay (NMD), acting as a hub in the network of protein interactions that regulate gene expression.
Protein Domain
Name: Dynamin-binding protein, first C-terminal SH3 domain
Type: Domain
Description: Dynamin-binding protein (DNMBP, also known as Tuba) is a scaffold protein that links dynamin with actin-regulating proteins. It binds a variety of actin regulatory proteins, including N-WASP, CR16, WAVE1, WIRE, PIR121, NAP1, and Ena/VASP proteins, via a C-terminal SH3 domain. It plays a critical role in ciliogenesis and nephrogenesis, most likely via Cdc42 activation. The C-terminal SH3 domains of DNMBP bind to N-WASP and Ena/VASP proteins, which are key regulatory proteins of the actin cytoskeleton. This entry represents the first C-terminal SH3 domain.
Protein Domain
Name: Capsid/spike protein, ssDNA virus
Type: Homologous_superfamily
Description: This entry represents a coat protein found in ssDNA viruses, such as the capsid protein F and the spike protein G in Microviridae, the Parvovirus capsid protein, the Dependovirus capsid protein and the Densovirus capsid protein. These proteins share a β-sandwich structure consisting of 8 β-strands in two sheets with a jelly-roll fold, although some members can have an additional 1-2 strands; characteristic interaction between the domains of this fold allows the formation of five-fold and pseudo six-fold assemblies.
Protein Domain
Name: tRNA(Ile)-lysidine synthase
Type: Family
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. This entry represents lysidine-tRNA(Ile) synthetase, which ligates lysine onto the cytidine present at position 34 of the AUA codon-specific tRNA(Ile) that contains the anticodon CAU, in an ATP-dependent manner. Cytidine is converted to lysidine, thus changing the amino acid specificity of the tRNA from methionine to isoleucine.
The N-terminal region contains the highly conserved SGGXDS motif, predicted to be a PP-loop motif involved in ATP binding. The only examples in which the wobble position of a tRNA must discriminate between G and A of mRNA are AUA (Ile) versus AUG (Met) and UGA (stop) versus UGG (Trp). In all bacteria, the wobble position of the tRNA(Ile) recognizing AUA is lysidine, a lysine derivative of cytidine. This domain is found, apparently, in all bacteria in a single copy. Eukaryotic sequences appear to be organellar. The domain architecture of this protein is variable; some members, including characterised proteins of Escherichia coli and Bacillus subtilis known to be tRNA(Ile)-lysidine synthetase, include a conserved 50-residue domain that many other members lack. This protein belongs to the ATP-binding PP-loop family. It appears in the literature and protein databases as TilS, YacA, and putative cell cycle protein MesJ (a misnomer). The PP-loop motif appears to be a modified version of the P-loop of nucleotide binding domain that is involved in phosphate binding. It is named PP-motif, since it appears to be a part of a previously uncharacterised ATP pyrophosphatase domain. ATP sulfurylases, E. coli NtrL, and B. subtilis OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain. The HUP domain class (after HIGH-signature proteins, UspA, and PP-ATPase) groups together PP-loop ATPases, the nucleotide-binding domains of class I aminoacyl-tRNA synthetases, UspA protein (USPA domains), photolyases, and electron transport flavoproteins (ETFP). The HUP domain is a distinct class of alpha/beta domain.
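As a rough illustration of the SGGXDS motif described in this entry, the following Python sketch scans a protein sequence for the pattern, treating X as any single residue. The function name and the example sequence are invented for illustration and are not taken from any database record; a real motif search would normally rely on curated signatures rather than a bare regular expression.

```python
import re

# SGGXDS, with X standing for any single residue (assumption: one-letter amino acid codes).
PP_LOOP_MOTIF = re.compile(r"SGG[A-Z]DS")

def find_pp_loop_motifs(sequence):
    """Return (1-based start position, matched text) for each SGGXDS occurrence."""
    return [(m.start() + 1, m.group())
            for m in PP_LOOP_MOTIF.finditer(sequence.upper())]

# Hypothetical sequence fragment, made up for this example.
example = "MKLVAVSGGVDSMVLLH"
print(find_pp_loop_motifs(example))
```

A match only indicates that the six-residue pattern is present; it does not by itself show that a protein belongs to the PP-loop family.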
Protein Domain
Name: Zinc finger, TFIIS-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes the initiation of transcription of protein-encoding genes by polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box).
At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre. TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon, characterised by a three-stranded antiparallel β-sheet and two β-hairpins. A backbone model for the Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta hairpin protrudes from the zinc finger and complements the Pol II active site. Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor.
Protein Domain
Name: Lysidine-tRNA(Ile) synthetase, C-terminal
Type: Domain
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. This entry represents the C-terminal domain of lysidine-tRNA(Ile) synthetase, which ligates lysine onto the cytidine present at position 34 of the AUA codon-specific tRNA(Ile) that contains the anticodon CAU, in an ATP-dependent manner. Cytidine is converted to lysidine, thus changing the amino acid specificity of the tRNA from methionine to isoleucine.
The N-terminal region contains the highly conserved SGGXDS motif, predicted to be a PP-loop motif involved in ATP binding. The only examples in which the wobble position of a tRNA must discriminate between G and A of mRNA are AUA (Ile) versus AUG (Met) and UGA (stop) versus UGG (Trp). In all bacteria, the wobble position of the tRNA(Ile) recognizing AUA is lysidine, a lysine derivative of cytidine. This domain is found, apparently, in all bacteria in a single copy. Eukaryotic sequences appear to be organellar. The domain architecture of this protein is variable; some members, including characterised proteins of Escherichia coli and Bacillus subtilis known to be tRNA(Ile)-lysidine synthetase, include a conserved 50-residue domain that many other members lack. This protein belongs to the ATP-binding PP-loop family. It appears in the literature and protein databases as TilS, YacA, and putative cell cycle protein MesJ (a misnomer). The PP-loop motif appears to be a modified version of the P-loop of nucleotide binding domain that is involved in phosphate binding. It is named PP-motif, since it appears to be a part of a previously uncharacterised ATP pyrophosphatase domain. ATP sulfurylases, E. coli NtrL, and B. subtilis OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain. The HUP domain class (after HIGH-signature proteins, UspA, and PP-ATPase) groups together PP-loop ATPases, the nucleotide-binding domains of class I aminoacyl-tRNA synthetases, UspA protein (USPA domains), photolyases, and electron transport flavoproteins (ETFP). The HUP domain is a distinct class of alpha/beta domain.
Protein Domain
Name: GPCR, family 2, ADGRE2/ADGRE5
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs. The secretin-like GPCRs include secretin, calcitonin, parathyroid hormone/parathyroid hormone-related peptides and vasoactive intestinal peptide, all of which activate adenylyl cyclase and the phosphatidyl-inositol-calcium pathway. These receptors contain seven transmembrane regions, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins (however, there is no significant sequence identity between these families; the secretin-like receptors thus bear their own unique '7TM' signature). Their N terminus is probably located on the extracellular side of the membrane and potentially glycosylated.
This N-terminal region contains a long conserved region which allows the binding of large peptidic ligands such as glucagon, secretin, VIP and PACAP; this region contains five conserved cysteine residues which could be involved in disulphide bonds. The C-terminal region of these receptors is probably cytoplasmic. Every receptor gene in this family is encoded on multiple exons, and several of these genes are alternatively spliced to yield functionally distinct products. The Adhesion G Protein-Coupled Receptors (aGPCRs) constitute an evolutionarily ancient membrane protein family. The receptors contain a 7-TM domain with phylogeny suggesting ancestry to the Family B/2 (secretin receptor family, Class B/2) G-Protein-Coupled Receptors. aGPCRs are distinguished by their large amino-terminal regions that typically contain multiple modular motifs such as EGF (Epidermal Growth Factor-like), cadherin and immunoglobulin domains as well as novel lineage-specific structures. A defining feature of aGPCRs is the GPCR Autoproteolysis-Inducing (GAIN) domain linking the N-terminal structure to the 7-TM region. Most aGPCRs undergo autocatalytic cleavage here, at the GPCR proteolysis site (GPS), into N-terminal and C-terminal fragments. Adhesion G protein-coupled receptor E2 (ADGRE2) protein is a member of the EGF-7TM subclass of aGPCRs and has an N-terminal extracellular region that consists of 5 tandem EGF-like adhesion domains, an internal mucin-like stalk domain containing a short G-protein proteolytic site and a C-terminal seven-pass transmembrane domain. ADGRE2 undergoes autocatalytic cleavage within its G-protein proteolytic site motif. It is expressed predominantly in myeloid leukocytes but also on the surface of lung mast cells and the HMC1 human mast-cell line. The endogenous ligand is dermatan sulfate. The most closely related paralogue of ADGRE2 is ADGRE5 (also called CD97).
Ligand binding of ADGRE5 mediates cell-cell adhesion of leukocytes and plays an essential role in leukocyte migration.
Protein Domain
Name: tRNA(Ile)-lysidine synthase, substrate-binding domain
Type: Domain
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. This entry represents the substrate-binding domain of lysidine-tRNA(Ile) synthetase, which ligates lysine onto the cytidine present at position 34 of the AUA codon-specific tRNA(Ile) that contains the anticodon CAU, in an ATP-dependent manner. Cytidine is converted to lysidine, thus changing the amino acid specificity of the tRNA from methionine to isoleucine.
The N-terminal region contains the highly conserved SGGXDS motif, predicted to be a PP-loop motif involved in ATP binding. The only examples in which the wobble position of a tRNA must discriminate between G and A of mRNA are AUA (Ile) versus AUG (Met) and UGA (stop) versus UGG (Trp). In all bacteria, the wobble position of the tRNA(Ile) recognizing AUA is lysidine, a lysine derivative of cytidine. This domain is found, apparently, in all bacteria in a single copy. Eukaryotic sequences appear to be organellar. The domain architecture of this protein is variable; some members, including characterised proteins of Escherichia coli and Bacillus subtilis known to be tRNA(Ile)-lysidine synthetase, include a conserved 50-residue domain that many other members lack. This protein belongs to the ATP-binding PP-loop family. It appears in the literature and protein databases as TilS, YacA, and putative cell cycle protein MesJ (a misnomer). The PP-loop motif appears to be a modified version of the P-loop of nucleotide binding domain that is involved in phosphate binding. It is named PP-motif, since it appears to be a part of a previously uncharacterised ATP pyrophosphatase domain. ATP sulfurylases, E. coli NtrL, and B. subtilis OutB consist of this domain alone. In other proteins, the pyrophosphatase domain is associated with amidotransferase domains (type I or type II), a putative citrulline-aspartate ligase domain or a nitrilase/amidase domain. The HUP domain class (after HIGH-signature proteins, UspA, and PP-ATPase) groups together PP-loop ATPases, the nucleotide-binding domains of class I aminoacyl-tRNA synthetases, UspA protein (USPA domains), photolyases, and electron transport flavoproteins (ETFP). The HUP domain is a distinct class of alpha/beta domain.
Protein Domain
Name: mRNA splicing factor Cwf21 domain
Type: Domain
Description: The cwf21 domain is found in proteins involved in mRNA splicing. Proteins containing this domain have been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe (Fission yeast). In yeast, this domain binds the protein Prp8p, a large and highly conserved U5 snRNP protein which has been proposed as a protein cofactor at the spliceosomal catalytic centre. The cwf21 domain is found in, amongst others, the small Cwc21p protein in yeast as well as in the much larger human ortholog SRm300 (serine/arginine repetitive matrix protein).
Protein Domain
Name: Pilin accessory predicted
Type: Family
Description: This family consists of several enterobacterial PilO proteins. The function of PilO is unknown, although it has been suggested that it is a cytoplasmic protein in the absence of other Pil proteins and is translocated to the outer membrane in their presence. Alternatively, PilO protein may form a complex with other Pil protein(s). PilO has been predicted to function as a component of the pilin transport apparatus and thin-pilus basal body. This family does not seem to be related to .
Protein Domain
Name: BAG family molecular chaperone regulator 2
Type: Family
Description: This entry represents the BAG2 protein, which belongs to the BAG family. BAG2 acts as a nucleotide-exchange factor (NEF) promoting the release of ADP from the HSP70 and HSC70 proteins, thereby triggering client/substrate protein release. BAG-family proteins contain a single BAG domain, except for human BAG-5, which has four BAG repeats. The BAG domain is a conserved region located at the C terminus of the BAG-family proteins that binds the ATPase domain of Hsc70/Hsp70. BAG family proteins regulate chaperone protein activities through their interaction with Hsc70/Hsp70.
Protein Domain
Name: Polysaccharide export EpsE
Type: Family
Description: Sequences in this family of proteins belong to the polysaccharide export protein family, which includes Wza from Escherichia coli. This family of proteins is homologous to the EpsE protein of the methanolan biosynthesis operon of Methylobacillus sp. 12S. The distribution of this protein appears to be restricted to a subset of exopolysaccharide operons containing a syntenic grouping of genes including a variant of the EpsH exosortase protein. Exosortase has been proposed to be involved in the targeting and processing of proteins containing the PEP-CTERM domain to the exopolysaccharide layer.
Protein Domain
Name: Selenoprotein T
Type: Family
Description: This entry represents selenoprotein T (SelT), which is conserved from plants to humans. SelT is localized to the endoplasmic reticulum through a hydrophobic domain. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), an endoplasmic reticulum (ER)-resident protein known to be involved in the quality control of protein folding. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins, including selenoprotein T. The function of SelT is unknown, although it may have a role in PACAP signaling during PC12 cell differentiation.
Protein Domain
Name: KaiC domain
Type: Domain
Description: This entry represents a domain found in bacterial proteins related to Circadian clock protein kinase KaiC and archaeal uncharacterised sequences belonging to the UPF0273 family. More than one copy is sometimes found in each protein in this group. KaiC is one of the Kai proteins among which direct protein-protein association may be a critical process in the generation of circadian rhythms in cyanobacteria. The circadian clock protein KaiC is encoded in the kaiABC operon that controls circadian rhythms and may be universal in Cyanobacteria. KaiC performs autophosphorylation and acts as its own transcriptional repressor.
Protein Domain
Name: Acidobacterial duplicated orphan permease
Type: Family
Description: These proteins are found mostly in three species of Acidobacteria, namely Acidobacteria bacterium (strain Ellin345), Acidobacterium capsulatum ATCC 51196, and Solibacter usitatus (strain Ellin6076), where they form large paralogous families. Each protein contains two copies of a predicted ABC transporter permease domain. However, unlike other proteins containing this domain, these proteins are essentially never encoded fused or adjacent to ABC transporter ATP-binding protein genes. This entry is termed ADOP, for Acidobacterial Duplicated Orphan Permease, to reflect the restricted lineage, internal duplication, lack of associated ATP-binding cassette proteins, and permease homology. The function of these proteins is unknown.
Protein Domain
Name: Probable peptidoglycan glycosyltransferase FtsW/RodA
Type: Family
Description: A number of prokaryotic integral membrane proteins involved in cell cycle processes have been found to be structurally related. These proteins include the cell division protein ftsW and the rod shape-determining protein rodA (or mrdB) of Escherichia coli and related bacteria, the Bacillus subtilis stage V sporulation protein E (spoVE), the B. subtilis hypothetical proteins ywcF and ylaO and the Cyanophora paradoxa cyanelle ftsW homologue. They constitute the SEDS (shape, elongation, division and sporulation) family. SEDS proteins are thought to be peptidoglycan polymerases, functioning as cell wall synthases of the cell elongation and division machinery.
Protein Domain
Name: E3 ubiquitin-protein ligase UBR4, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of E3 ubiquitin-protein ligases UBR4 (POE or PUSHOVER in Drosophila) from animals. UBR4 is a component of the N-end rule pathway, identified as a noncanonical N-recognin implicated in bulk lysosomal degradation and autophagy. Proteins containing this domain are extraordinarily large proteins that recognise and bind to proteins with N-terminal destabilising residues according to the N-end rule, leading to their ubiquitination and degradation. These proteins share a UBR box with other UBR box E3 ligases, but UBR4 lacks the ubiquitylation domain, suggesting that it may interact with other catalytic proteins.
Protein Domain
Name: Type-F conjugative transfer system secretin TraK
Type: Family
Description: The TraK protein is predicted to interact with the TraV and TraB proteins as part of the scaffold, which extends from the inner membrane, through the periplasm to the cell envelope and through which the F-type conjugative pilus passes. TraK is homologous to the P-type IV secretion system protein TrbG, the Ti-type protein VirB9 and the I-type TraN protein. The protein is related to the secretin family, especially the HrcC subgroup of the type III secretion system. The protein is hypothesized to oligomerize to form a ring structure akin to other secretins.
Protein Domain
Name: Helix hairpin bin domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily is found in uncharacterised proteins whose genomic location within nitrogen fixation clusters suggests that they may play a role in this process. They consist of two α-helices. A four-residue linker between the helices allows them to form an anti-parallel bundle and cross over each other towards their termini. A similar domain is found in Sulfolobus spindle-shaped virus 1 (SSV1) protein D-63. This domain is also present in eukaryotic proteins such as Golgi to ER traffic protein 1 (GET1), also known as tail-anchored protein insertion receptor WRB, Pre-mRNA-splicing factor ISY1 and Vacuolar protein sorting-associated protein 37.
Protein Domain
Name: Putative 2-aminoethylphosphonate binding protein, ABC transporter
Type: Family
Description: Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. Most of the bacterial ABC (ATP-binding cassette) importers are composed of one or two transmembrane permease proteins, one or two nucleotide-binding proteins and a highly specific periplasmic solute-binding protein. In Gram-negative bacteria the solute-binding proteins are dissolved in the periplasm, while in archaea and Gram-positive bacteria the solute-binding proteins are membrane-anchored lipoproteins. This ABC transporter periplasmic substrate binding protein component is found in a region of the Salmonella typhimurium LT2 genome responsible for the catabolism of 2-aminoethylphosphonate via the phnWX pathway.
Protein Domain
Name: Sterile alpha motif/pointed domain superfamily
Type: Homologous_superfamily
Description: Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing proteins. SAM domains exhibit a conserved structure, consisting of a 4-5-helical bundle of two orthogonally packed alpha-hairpins. However, SAM domains display a diversity of function, being involved in interactions with proteins, DNA and RNA. The name sterile alpha motif arose from its presence in proteins that are essential for yeast sexual differentiation. The SAM domain has had various names, including SPM, PTN (pointed), SEP (yeast sterility, Ets-related, PcG proteins), NCR (N-terminal conserved region) and HLH (helix-loop-helix) domain, all of which are related and can be classified as SAM domains. SAM domains occur in eukaryotic and in some bacterial proteins. Structures have been determined for several proteins that contain SAM domains, including Ets-1 transcription factor, which plays a role in the development and invasion of tumour cells by regulating the expression of matrix-degrading proteases; Etv6 transcription factor, gene rearrangements of which have been demonstrated in several malignancies; EphA4 receptor tyrosine kinase, which is believed to be important for the correct localization of a motoneuron pool to a specific position in the spinal cord; EphB2 receptor, which is involved in spine morphogenesis via intersectin, Cdc42 and N-Wasp; p73, a p53 homologue involved in neuronal development; and polyhomeotic, which is a member of the Polycomb group of genes (Pc-G) required for the maintenance of the spatial expression pattern of homeotic genes.
Protein Domain
Name: SAND domain
Type: Domain
Description: The SAND domain (named after Sp100, AIRE-1, NucP41/75, DEAF-1) is a conserved ~80 residue region found in a number of nuclear proteins, many of which function in chromatin-dependent transcriptional control. These include proteins linked to various human diseases, such as the Sp100 (Speckled protein 100 kDa), NUDR (Nuclear DEAF-1 related), GMEB (Glucocorticoid Modulatory Element Binding) and AIRE-1 (Autoimmune regulator 1) proteins. Proteins containing the SAND domain have a modular structure; the SAND domain can be associated with a number of other modules, including the bromodomain, the PHD finger and the MYND finger. Because no SAND domain has been found in yeast, it is thought that the SAND domain could be restricted to animal phyla. Many SAND domain-containing proteins, including NUDR, DEAF-1 (Deformed epidermal autoregulatory factor-1) and GMEB, have been shown to bind DNA sequences specifically. The SAND domain has been proposed to mediate the DNA binding activity of these proteins. The resolution of the 3D structure of the SAND domain from Sp100b has revealed that it consists of a novel alpha/beta fold. The SAND domain adopts a compact fold consisting of a strongly twisted, five-stranded antiparallel β-sheet with four α-helices packing against one side of the β-sheet. The opposite side of the β-sheet is solvent exposed. The β-sheet and α-helical parts of the structure form two distinct regions. Multiple hydrophobic residues pack between these regions to form a structural core. A conserved KDWK sequence motif is found within the α-helical, positively charged surface patch. The DNA binding surface has been mapped to the α-helical region encompassing the KDWK motif.
Protein Domain
Name: Protein-tyrosine phosphatase, active site
Type: Active_site
Description: This entry includes proteins of two subfamilies: Ser/Thr ( ) and Tyr dual specificity protein phosphatase and tyrosine specific protein phosphatase ( ). Both of these subfamilies may also have inactive phosphatase domains, and depending on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates and protect them against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class has accumulated several variable amino acid substitutions and has completely lost its tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a regulatory centre [ ]. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr ( ) and tyrosine specific protein phosphatase () activity, able to remove either the serine/threonine- or the tyrosine-bound phosphate group from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. Tyrosine specific protein phosphatases catalyze the removal of a phosphate group attached to a tyrosine residue. They are also very important in the control of cell growth, proliferation, differentiation and transformation.
Protein Domain
Name: Chemotaxis response regulator CheV
Type: Family
Description: This group contains chemotaxis response regulator protein CheV. CheV is a two-domain protein with an N-terminal CheW-like (SH3-like) domain and a C-terminal CheY-like receiver domain. It is often regarded as a version of CheW, where the CheW-like domain is fused to the receiver domain. In bacterial chemotaxis, cellular movement is directed in response to chemical gradients. Transmembrane chemoreceptors that sense the stimuli are coupled (via a coupling protein, CheW) with a signal transduction histidine kinase (CheA) [ , ]. CheA phosphorylates response regulators CheB and CheY. The two cytoplasmic proteins, CheW and CheA, both contain homologous SH3-like domains that interact with transmembrane chemoreceptors, or methyl accepting chemotaxis proteins (MCPs). In CheA, a histidine protein kinase domain is fused to the amino-terminus of the SH3 region [, ]. CheV is a third type of protein with a CheW-like domain. In Bacillus subtilis, CheW and CheV may be partially redundant in coupling the receptors to CheA; however, they are both necessary for efficient chemotaxis [ ]. CheV is phosphorylated in vitro on a conserved aspartate as a result of phosphoryl group transfer from phosphorylated CheA (CheA-P). This reaction is slower than the phospho-transfer reaction between CheA-P and one other response regulator of the system, CheB. It is part of a signal transduction pathway to facilitate adaptation to attractants []. In Helicobacter pylori, CheV paralogues and CheW are not redundant and seem to have separate roles in chemotaxis [], and the CheV proteins play a role in bacterial motility []. For additional information please see [ ].
Protein Domain
Name: PPE family
Type: Domain
Description: The human pathogen Mycobacterium tuberculosis harbours a large number of genes that encode proteins whose N-termini contain the characteristic motifs Pro-Glu (PE) or Pro-Pro-Glu (PPE). A subgroup of the PE proteins contains polymorphic GC-rich sequences (PGRS), while a subgroup of the PPE proteins contains major polymorphic tandem repeats (MPTR). The function of most of these proteins remains unknown [ ]. However, the PE_PGRS proteins from Mycobacterium marinum are secreted by components of the ESX-5 system that belongs to the recently defined type VII secretion systems []. It has also been reported that the PE_PGRS family of proteins contains multiple calcium-binding and glycine-rich sequence motifs GGXGXD/NXUX. This sequence repeat constitutes a calcium-binding parallel β-roll or parallel β-helix structure and is found in RTX toxins secreted by many Gram-negative bacteria []. This mycobacterial family is named after a conserved amino-terminal region of about 180 amino acids, the PPE motif. The carboxy termini of proteins belonging to the PPE family are variable, and on the basis of this region at least three groups can be distinguished. The MPTR subgroup is characterised by tandem copies of a motif NXGXGNXG. The second subgroup contains a conserved motif at about position 350. The third group shares only similarity in the amino terminal region. The function of these proteins is uncertain but it has been suggested that they may be related to antigenic variation of Mycobacterium tuberculosis [ ]. Some members of this family are associated with virulence and evasion of host immune response as they have a role in preventing ROS generation [].
Protein Domain
Name: Proteinase K-like catalytic domain
Type: Domain
Description: This domain is found in some members of peptidase family S8 (subtilisins) [ ], such as PCSK9 (proprotein convertase subtilisin/kexin type 9; MEROPS identifier S08.039), proteinase K (S08.054), proteinase T (S08.061) from the fungus Tritirachium album Limber [ ], and other subtilisin-like serine endopeptidases. PCSK9 post-translationally regulates hepatic low-density lipoprotein receptors (LDLRs) by binding to LDLRs on the cell surface, leading to their degradation, and is a target for drugs for hypercholesterolaemia. The binding site of PCSK9 has been localized to the epidermal growth factor-like repeat A (EGF-A) domain of the LDLR []. Characterized proteinases K are secreted endopeptidases that are not substrate-specific and function in a wide variety of species in different pathways. They can hydrolyze keratin and other proteins with subtilisin-like specificity []. The number of calcium-binding motifs found in these proteinases differs []. The subtilisin family is one of the largest serine peptidase families characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences [ ]. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses []. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase [, ]. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity [, ]. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein [].
Protein Domain
Name: Thrombospondin type-1 (TSP1) repeat
Type: Repeat
Description: Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers [ ]. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin. This repeat was first described in 1986 by Lawler and Hynes [ ]. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9) [], as well as extracellular matrix proteins like mindin, F-spondin [], SCO-spondin and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium [, ], are now known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis [] and apoptosis []. The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling []. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat [].
Protein Domain
Name: PX domain superfamily
Type: Homologous_superfamily
Description: The PX (phox) domain [ ] occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification [, , , ]. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities []. The PX domain is approximately 120 residues long [], and folds into a three-stranded β-sheet followed by three α-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of neutrophil cytosol factor 1 (p47phox) binds to the SH3 domain in the same protein []. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain [ ]. The PX domain is conserved from yeast to human. A multiple alignment of representative PX domain sequences from eukaryotic proteins [ ] shows relatively little sequence conservation, although their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX-domain is also a protein-protein interaction domain [].
Protein Domain
Name: Hemocyanin, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Crustacean and cheliceratan hemocyanins (oxygen-transport proteins) and insect hexamerins (storage proteins) are homologous gene products, although the latter do not bind oxygen [ ]. Haemocyanins are found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit, whereas the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit [ ]. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases. Hexamerins are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. They do not possess the copper-binding histidines present in hemocyanins [ ]. Homologues are also present in other kinds of organism, for example, Cyclopenase asqI from the fungus Emericella nidulans and Cyclopenase penL from Penicillium thymicola. AsqI is a tyrosinase involved in biosynthesis of the aspoquinolone mycotoxins, though its exact function is unknown [ ]. PenL is part of the gene cluster that mediates the biosynthesis of penigequinolones, potent insecticidal alkaloids that contain a highly modified 10-carbon prenyl group []. This entry represents the C-terminal domain superfamily of hemocyanin and hexamerin proteins.
Protein Domain
Name: Kringle-like fold
Type: Homologous_superfamily
Description: This entry represents proteins displaying a Kringle-like structure, which consists of a nearly all-beta, disulphide-rich fold. Proteins displaying this fold include both Kringle modules as well as fibronectin type II modules, the latter displaying a shorter two-disulphide version of the Kringle module. Kringle modules occur in blood clotting and fibrinolytic proteins, such as plasminogen, prothrombin, meizothrombin, and urokinase-type plasminogen activator, as well as in apolipoprotein and hepatocyte growth factor. Kringle domains are believed to play a role in binding mediators (e.g., membranes, other proteins or phospholipids), and in the regulation of proteolytic activity [ , ]. Fibronectin type II modules occur in fibronectin, as well as in gelatinase A (MMP-2), gelatinase B (MMP-9), and the collagen-binding domain of PDC-109. Fibronectin is a multi-domain glycoprotein, found in a soluble form in plasma, and in an insoluble form in loose connective tissue and basement membranes, that binds cell surfaces and various compounds including collagen, fibrin, heparin, DNA, and actin [ ]. Fibronectins are involved in a number of important functions e.g., wound healing; cell adhesion; blood coagulation; cell differentiation and migration; maintenance of the cellular cytoskeleton; and tumour metastasis. Gelatinases A and B are members of the matrix metalloproteinase family that act as neutral proteinases in the breakdown and remodelling of the extracellular matrix. These gelatinases play important roles in the pathogenesis of inflammation, infection and in neoplastic diseases []. In gelatinase A, the three fibronectin-like modules are inserted within a catalytic domain, these modules acting to target the enzyme to matrix macromolecules [].
Protein Domain
Name: Phox homology
Type: Domain
Description: The PX (phox) domain [ ] occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification [, , , ]. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities []. The PX domain is approximately 120 residues long [], and folds into a three-stranded β-sheet followed by three α-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of neutrophil cytosol factor 1 (p47phox) binds to the SH3 domain in the same protein []. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain []. The PX domain is conserved from yeast to human. A multiple alignment of representative PX domain sequences from eukaryotic proteins [ ] shows relatively little sequence conservation, although their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX-domain is also a protein-protein interaction domain [].
Protein Domain
Name: Herpesvirus UL31
Type: Family
Description: During primary envelopment, Human herpesvirus 1 (HHV-1, HSV-1) nucleocapsids translocate from the nucleus to the cytoplasm. Lining the inside of the inner nuclear membrane (INM) is the nuclear lamina, which is composed of a meshwork of proteins with spaces too small for the capsid to move through without some disruption of the lamina. The lamina is mainly made up of lamin A/C and lamin B proteins, with smaller amounts of other proteins also present; this lamina must be disrupted before the nucleocapsids can egress. UL31, nuclear egress protein 2 (also known as UL34) and US3 proteins of herpes simplex virus type 1 form a complex that accumulates at the nuclear rim and is required for envelopment of nucleocapsids and successful egress of the nucleocapsids [ ]. Although UL34 has been shown to interact directly with lamin A, it cannot disrupt lamin structure by itself. Its interaction with UL31 and US3 appears to be crucial for lamin disruption, though the mechanism is not yet clear [, ]. This entry includes Herpesvirus UL31 protein, also known as nuclear egress protein 1 (NEC1). Within the host nucleus, NEC1 interacts with the newly formed capsid and directs it to the inner nuclear membrane by associating with nuclear egress protein 2 (UL34 or NEC2) [ , ]. The NEC1/NEC2 complex, known as the nuclear egress complex (NEC), induces the budding of the capsid at the inner nuclear membrane and its envelopment into the perinuclear space, where the complex promotes the fusion of the enveloped capsid with the outer nuclear membrane and the subsequent release of the viral capsid into the cytoplasm [].
Protein Domain
Name: CAP superfamily
Type: Homologous_superfamily
Description: The cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) superfamily proteins are found in a wide range of organisms, including prokaryotes [ ] and non-vertebrate eukaryotes []. The nine subfamilies of the mammalian CAP superfamily include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumour suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP superfamily results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences [ ]. The Ca2+-chelating function [] would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how blocks the Ca2+-transporting ryanodine receptors. The CAP domain forms a unique 3 layer α-β-α fold with some, though not all, of the structural elements found in proteases [ ].
Protein Domain
Name: Zinc finger, ZZ-type superfamily
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the ZZ domain, which belongs to the family of 'cross-brace' zinc finger proteins with interleaved zinc binding sites. These zinc fingers are thought to be involved in protein-protein interactions [ ].
Protein Domain
Name: Clusterin, N-terminal
Type: Domain
Description: Clusterin is a vertebrate glycoprotein [ ], the exact function of which is not yet clear. Clusterin expression is complex, appearing as different forms in different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial leader peptide. This ~60kDa pre-sCLU protein is further glycosylated and proteolytically cleaved into alpha- and beta-subunits, held together by disulphide bonds. External sCLU is an 80kDa protein and may act as a molecular chaperone, scavenging denatured proteins outside cells following specific stress-induced injury such as heat shock. sCLU possesses nonspecific binding activity to hydrophobic domains of various proteins in vitro [ ]. A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and survival. The nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate with Ku70 and is minimally required for cell death. Clusterin is synthesized as a precursor polypeptide of about 400 amino acids which is post-translationally cleaved to form two subunits of about 200 amino acids each. The two subunits are linked by five disulphide bonds to form an antiparallel ladder-like structure []. In each of the mature subunits the five cysteines that are involved in disulphide bonds are clustered in domains of about 30 amino acids located in the central part of the subunits. This entry represents the N-terminal domain of the clusterin precursor.
Protein Domain
Name: Clusterin, C-terminal
Type: Domain
Description: Clusterin is a vertebrate glycoprotein [ ], the exact function of which is not yet clear. Clusterin expression is complex, appearing as different forms in different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial leader peptide. This ~60kDa pre-sCLU protein is further glycosylated and proteolytically cleaved into alpha- and beta-subunits, held together by disulphide bonds. External sCLU is an 80kDa protein and may act as a molecular chaperone, scavenging denatured proteins outside cells following specific stress-induced injury such as heat shock. sCLU possesses nonspecific binding activity to hydrophobic domains of various proteins in vitro [ ]. A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and survival. The nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate with Ku70 and is minimally required for cell death. Clusterin is synthesized as a precursor polypeptide of about 400 amino acids which is post-translationally cleaved to form two subunits of about 200 amino acids each. The two subunits are linked by five disulphide bonds to form an antiparallel ladder-like structure []. In each of the mature subunits the five cysteines that are involved in disulphide bonds are clustered in domains of about 30 amino acids located in the central part of the subunits. This entry represents the C-terminal domain of the clusterin precursor.
Protein Domain
Name: Type IV pilus inner membrane component PilN
Type: Family
Description: Bacterial type IV pili are surface filaments critical for diverse biological processes including surface and host cell adhesion, colonisation, biofilm formation, twitching motility, DNA uptake during natural transformation and virulence [ , ]. The proteins necessary to form the type IV pili inner-membrane complex are encoded by the pilMNOPQ operon, which encodes the cytoplasmic actin-like protein PilM, PilN, PilO, the periplasmic lipoprotein PilP and the outer-membrane secretin PilQ. The inner-membrane PilM/N/O/P complex is required for the optimal function of the outer-membrane secretin PilQ. This cluster is highly conserved across the type IV pilus-producing bacterial species, and all of these proteins have been shown to be essential for twitching motility [, ]. PilN forms a stable heterodimer with PilO through interaction of their periplasmic domains, which is essential for the assembly of a functional complex. PilN requires PilO for proper folding, and the two proteins depend on each other for their stability [ , , ]. PilN also interacts with PilM, causing conformational changes that allow PilM monomerisation. This entry also includes Type II secretion system protein L. Type II secretion (T2S) and type IV pilus systems share evolutionary origins, being structurally and functionally related [ ]. DNA utilization protein HofN from E. coli is also included in this entry. It is involved in DNA uptake and DNA utilisation as a carbon and energy source, conferring competitive advantage. This protein is encoded by a gene homologous to com genes in Haemophilus influenzae, which are involved in pilus biogenesis, protein secretion, competence-transformation, and twitching motility [ ].
Protein Domain
Name: Hemocyanin, N-terminal
Type: Domain
Description: Crustacean and cheliceratan hemocyanins (oxygen-transport proteins) and insect hexamerins (storage proteins) are homologous gene products, although the latter do not bind oxygen [ ]. Haemocyanins are found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit, whereas the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit [ ]. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases. Hexamerins are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. They do not possess the copper-binding histidines present in hemocyanins [ ]. Homologues are also present in other kinds of organism, for example, Cyclopenase asqI from the fungus Emericella nidulans and Cyclopenase penL from Penicillium thymicola. AsqI is a tyrosinase involved in biosynthesis of the aspoquinolone mycotoxins, though its exact function is unknown [ ]. PenL is part of the gene cluster that mediates the biosynthesis of penigequinolones, potent insecticidal alkaloids that contain a highly modified 10-carbon prenyl group []. This entry represents the N-terminal domain of hemocyanin and hexamerin proteins.
Protein Domain
Name: Hemocyanin, C-terminal
Type: Domain
Description: Crustacean and cheliceratan hemocyanins (oxygen-transport proteins) and insect hexamerins (storage proteins) are homologous gene products, although the latter do not bind oxygen [ ]. Haemocyanins are found in the haemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit, whereas the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit [ ]. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases. Hexamerins are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. They do not possess the copper-binding histidines present in hemocyanins [ ]. Homologues are also present in other kinds of organism, for example, Cyclopenase asqI from the fungus Emericella nidulans and Cyclopenase penL from Penicillium thymicola. AsqI is a tyrosinase involved in biosynthesis of the aspoquinolone mycotoxins, though its exact function is unknown [ ]. PenL is part of the gene cluster that mediates the biosynthesis of penigequinolones, potent insecticidal alkaloids that contain a highly modified 10-carbon prenyl group [ ]. This entry represents the C-terminal domain of hemocyanin and hexamerin proteins.
Protein Domain
Name: LDLR class B repeat
Type: Repeat
Description: The low-density lipoprotein receptor (LDLR) binds LDL, the major cholesterol-carrying lipoprotein of plasma, and acts to regulate cholesterol homeostasis in mammalian cells. The LDL receptor binds LDL and transports it into cells by acidic endocytosis. In order to be internalized, the receptor-ligand complex must first cluster into clathrin-coated pits. Once inside the cell, the LDLR separates from its ligand, which is degraded in the lysosomes, while the receptor returns to the cell surface []. The internal dissociation of the LDLR from its ligand is mediated by proton pumps within the walls of the endosome that lower the pH. The LDLR is a multi-domain protein. The ligand-binding domain contains seven or eight 40-amino acid LDLR class A (cysteine-rich) repeats, each of which contains a coordinated calcium ion and six cysteine residues involved in disulphide bond formation [ ]. Similar domains have been found in other extracellular and membrane proteins []. The second conserved region contains two EGF repeats, followed by six LDLR class B (YWTD) repeats, and another EGF repeat. The LDLR class B repeats each contain a conserved YWTD motif, and are predicted to form a β-propeller structure [ ]. 
This region is critical for ligand release and recycling of the receptor []. The third domain is rich in serine and threonine residues and contains clustered O-linked carbohydrate chains. The fourth domain is the hydrophobic transmembrane region. The fifth domain is the cytoplasmic tail that directs the receptor to clathrin-coated pits. LDLR is closely related in structure to several other receptors, including LRP1, LRP1b, megalin/LRP2, VLDL receptor, lipoprotein receptor, MEGF7/LRP4, and LRP8 (apolipoprotein E receptor 2); these proteins participate in a wide range of physiological processes, including the regulation of lipid metabolism, protection against atherosclerosis, neurodevelopment, and transport of nutrients and vitamins [ ]. This entry represents the LDLR class B (YWTD) repeat, the structure of which has been solved [ ]. The six YWTD repeats together fold into a six-bladed β-propeller. Each blade of the propeller consists of four antiparallel β-strands; the innermost strand of each blade is labeled 1 and the outermost strand, 4. The sequence repeats are offset with respect to the blades of the propeller, such that any given 40-residue YWTD repeat spans strands 2-4 of one propeller blade and strand 1 of the subsequent blade. This offset ensures circularization of the propeller because the last strand of the final sequence repeat acts as an innermost strand 1 of the blade that harbors strands 2-4 from the first sequence repeat. The repeat is found in a variety of proteins that include vitellogenin receptor from Drosophila melanogaster, low-density lipoprotein (LDL) receptor [], preproepidermal growth factor, and nidogen (entactin).
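The offset between sequence repeats and propeller blades described above can be made concrete with a small sketch. It assumes a numbering convention that is not stated in the entry: repeat i contributes strands 2 to 4 of blade i and strand 1 of the next blade, with the sixth repeat wrapping around to blade 1. The function names are hypothetical and purely illustrative.

```python
N_BLADES = 6  # six YWTD repeats fold into a six-bladed beta-propeller


def strands_for_repeat(repeat: int, n_blades: int = N_BLADES) -> list[tuple[int, int]]:
    """Return the (blade, strand) pairs spanned by a 1-based sequence repeat.

    Assumed convention: repeat i supplies strands 2-4 of blade i and
    strand 1 of the following blade (wrapping from the last blade to the first).
    """
    if not 1 <= repeat <= n_blades:
        raise ValueError(f"repeat must be in 1..{n_blades}")
    next_blade = repeat % n_blades + 1
    return [(repeat, 2), (repeat, 3), (repeat, 4), (next_blade, 1)]


def blade_composition(n_blades: int = N_BLADES) -> dict[int, dict[int, int]]:
    """Map each blade to {strand number: repeat that contributes that strand}."""
    blades: dict[int, dict[int, int]] = {b: {} for b in range(1, n_blades + 1)}
    for repeat in range(1, n_blades + 1):
        for blade, strand in strands_for_repeat(repeat, n_blades):
            blades[blade][strand] = repeat
    return blades
```

Under this convention, blade 1 takes its innermost strand from the final repeat and its remaining three strands from the first repeat, which is the circularization the description refers to.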
Protein Domain
Name: Zinc finger, PHD-type, conserved site
Type: Conserved_site
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the PHD (homeodomain) zinc finger domain [ ], which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. 
The PHD finger motif is reminiscent of, but distinct from, the C3HC4 type RING finger. The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions. The signature of this entry starts at the first cysteine of the zinc finger region and ends at the last one. The spacing between cysteines in the PHD finger is closely related to that in the RING finger. Discrimination between these two domains with either a pattern or a profile is therefore difficult, and some rare domains are recognised by both the RING and PHD patterns and profiles.
Protein Domain
Name: Zinc finger, C2H2C-type
Type: Repeat
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisHisCys (CCHHC) type zinc finger domain found in eukaryotes. The CCHHC-type zinc finger contains five absolutely conserved cysteine and histidine residues (rather than the more usual four) with the sequence C-P-x-P-G-C-x-G-x-G-H-x(7)-H-R-x(4)-C. The second histidine has been shown to coordinate Zn(II) along with the three cysteine residues. 
The first His plays a different role in stabilizing the structure, stacking between the metal-binding core and an aromatic residue that is relatively conserved. CCHHC-type zinc fingers form small compact structures that can sit entirely within the major groove of DNA [ , , , , ]. Some proteins known to contain a CCHHC-type zinc finger are listed below:
  • Animal myelin transcription factor 1 (MyT1), or neural zinc finger 2 (NZF2), a transcription factor that contains seven copies of the CCHHC-type zinc finger. It binds to sites in the proteolipid protein promoter.
  • Vertebrate MyT1-like (MyT1L/NZF1), which appears to be involved in neural development.
  • Vertebrate Suppressor of Tumorigenicity 18 (ST18/NZF3), a breast cancer tumour suppressor.
  • Vertebrate L3MBTL, a member of the Polycomb group of proteins, which function as transcriptional repressors in large protein complexes.
  • Vertebrate L3MBTL3, a possible tumour suppressor.
  • Vertebrate L3MBTL4.
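The CCHHC consensus quoted above, C-P-x-P-G-C-x-G-x-G-H-x(7)-H-R-x(4)-C, is written in a PROSITE-like notation, where "x" stands for any residue and "x(n)" for n consecutive residues. As a minimal sketch, assuming only those two notational features are needed, the pattern can be turned into a regular expression and used to scan a protein sequence. The helper names are hypothetical, and the sequence in the usage note is synthetic, not a real protein.

```python
import re

# Consensus copied from the entry description.
CCHHC_PATTERN = "C-P-x-P-G-C-x-G-x-G-H-x(7)-H-R-x(4)-C"


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple dash-separated pattern into a regular expression.

    Supports only single residues, 'x' (any residue) and 'x(n)' style repeats;
    richer PROSITE syntax such as [AB] or {C} is deliberately not handled.
    """
    parts = []
    for element in pattern.split("-"):
        match = re.fullmatch(r"([A-Zx])(?:\((\d+)\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = match.groups()
        token = "[A-Z]" if residue == "x" else residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


CCHHC_REGEX = re.compile(prosite_to_regex(CCHHC_PATTERN))


def find_cchhc(sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping consensus matches."""
    return [m.start() for m in CCHHC_REGEX.finditer(sequence.upper())]
```

For example, a made-up sequence built as two alanines followed by "CPAPGCAGAGH", seven alanines, "HR", four alanines and a final "C" gives one match starting at position 2. A plain consensus match of this kind is only a rough filter; the database entries themselves are defined by curated signatures rather than by this regular expression.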
Protein Domain
Name: Zinc finger, C2H2C-type superfamily
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the CysCysHisHisCys (CCHHC) type zinc finger domain found in eukaryotes. The CCHHC-type zinc finger contains five absolutely conserved cysteine and histidine residues (rather than the more usual four) with the sequence C-P-x-P-G-C-x-G-x-G-H-x(7)-H-R-x(4)-C. The second histidine has been shown to coordinate Zn(II) along with the three cysteine residues. 
The first His plays a different role in stabilizing the structure, stacking between the metal-binding core and an aromatic residue that is relatively conserved. CCHHC-type zinc fingers form small compact structures that can sit entirely within the major groove of DNA [ , , , , ]. Some proteins known to contain a CCHHC-type zinc finger are listed below:
  • Animal myelin transcription factor 1 (MyT1), or neural zinc finger 2 (NZF2), a transcription factor that contains seven copies of the CCHHC-type zinc finger. It binds to sites in the proteolipid protein promoter.
  • Vertebrate MyT1-like (MyT1L/NZF1), which appears to be involved in neural development.
  • Vertebrate Suppressor of Tumorigenicity 18 (ST18/NZF3), a breast cancer tumour suppressor.
  • Vertebrate L3MBTL, a member of the Polycomb group of proteins, which function as transcriptional repressors in large protein complexes.
  • Vertebrate L3MBTL3, a possible tumour suppressor.
  • Vertebrate L3MBTL4.
Protein Domain
Name: Bacterial microcompartment domain
Type: Domain
Description: Bacterial microcompartments (BMCs) are large proteinaceous structures comprised of a roughly icosahedral shell and a series of encapsulated enzymes. The shells of BMCs are made primarily of a family of proteins whose structural core is the BMC domain, and variations upon this core provide functional diversity. This domain is found in a variety of polyhedral organelle shell proteins (CcmK), including CsoS1A, CsoS1B and CsoS1C of Thiobacillus neapolitanus (Halothiobacillus neapolitanus) and their orthologues from other bacteria [ , , ]. Some autotrophic and non-autotrophic organisms form polyhedral organelles, carboxysomes/enterosomes [ ]. The best studied is the carboxysome of Halothiobacillus neapolitanus, which is composed of at least 9 proteins: six shell proteins, CsoS1A, CsoS1B, CsoS1C, CsoS2A, CsoS2B and CsoS3 (carbonic anhydrase) [, ], one protein of unknown function and the large and small subunits of RuBisCo (CbbL and CbbS). Carboxysomes appear to be approximately 120 nm in diameter, most often observed as regular hexagons, with a solid interior bounded by a unilamellar protein shell. The interior is filled with type I RuBisCo, which is composed of 8 large subunits and 8 small subunits; it accounts for 60% of the carboxysomal protein, which amounts to approximately 300 molecules of enzyme per carboxysome. Carboxysomes are required for autotrophic growth at low CO2 concentrations and are thought to function as part of a CO2-concentrating mechanism [ , ]. Polyhedral organelles, enterosomes, from non-autotrophic organisms are involved in coenzyme B12-dependent 1,2-propanediol utilisation (e.g., in Salmonella enterica [ ]) and ethanolamine utilisation (e.g., in Salmonella typhimurium []). Genes needed for enterosome formation are located in the 1,2-propanediol utilisation pdu [ , ] or ethanolamine utilisation eut [ , ] operons, respectively. 
Although enterosomes of non-autotrophic organisms are apparently related to carboxysomes structurally, a functional relationship is uncertain. A role in CO2 concentration, similar to that of the carboxysome, is unlikely since there is no known association between CO2 and coenzyme B12-dependent 1,2-propanediol or ethanolamine utilisation [ ]. It seems probable that enterosomes help protect the cells from reactive aldehyde species in the degradation pathways of 1,2-propanediol and ethanolamine []. The BMC domain fold consists of three α-helices (designated A, B, and C) and four β-strands ( ). Some instances of the BMC shell protein reveal a circular permutation in which a highly similar tertiary structure is built from secondary structure elements occurring in a different order. The secondary structure elements contributed by the C-terminal region of the typical BMC fold are instead contributed by the N-terminal region of the BMC circularly permuted domain ( ) [ , , ].
Protein Domain
Name: Thrombin receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups [ ]. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [ , , ]. Thrombin is a serine protease with a central role in blood clotting. It cleaves various substrates involved in coagulation, and activates cell surface receptors via a novel proteolytic action. 
Thrombin stimulates aggregation and secretion in blood platelets at the site of vascular injury, and also has inflammatory and reparative actions, stimulating chemotaxis in monocytes, proliferation of fibroblasts and lymphocytes, and inducing endothelium-dependent relaxation of blood vessels. The protein activates a number of substrates involved in coagulation: it cleaves fibrinogen to fibrin and activates coagulation factor XIII; it also activates factors V and VIII. When bound to thrombomodulin, it activates plasma protein C, which, in concert with protein S, inactivates factors Va and VIIIa, leading to a decrease in thrombin formation. The thrombin receptor is expressed at high levels in platelets, vascular endothelial cells, and various cell lines. The receptor activates phosphoinositide metabolism via a pertussis-toxin-insensitive G-protein, and inhibits adenylyl cyclase via a pertussis-toxin-sensitive G-protein.
Protein Domain
Name: CcmK-like superfamily
Type: Homologous_superfamily
Description: Bacterial microcompartments (BMCs) are large proteinaceous structures comprised of a roughly icosahedral shell and a series of encapsulated enzymes. The shells of BMCs are made primarily of a family of proteins whose structural core is the BMC domain, and variations upon this core provide functional diversity. This domain is found in a variety of polyhedral organelle shell proteins (CcmK), including CsoS1A, CsoS1B and CsoS1C of Thiobacillus neapolitanus (Halothiobacillus neapolitanus) and their orthologues from other bacteria [ , , ]. Some autotrophic and non-autotrophic organisms form polyhedral organelles, carboxysomes/enterosomes [ ]. The best studied is the carboxysome of Halothiobacillus neapolitanus, which is composed of at least 9 proteins: six shell proteins, CsoS1A, CsoS1B, CsoS1C, CsoS2A, CsoS2B and CsoS3 (carbonic anhydrase) [, ], one protein of unknown function and the large and small subunits of RuBisCo (CbbL and CbbS). Carboxysomes appear to be approximately 120 nm in diameter, most often observed as regular hexagons, with a solid interior bounded by a unilamellar protein shell. The interior is filled with type I RuBisCo, which is composed of 8 large subunits and 8 small subunits; it accounts for 60% of the carboxysomal protein, which amounts to approximately 300 molecules of enzyme per carboxysome. Carboxysomes are required for autotrophic growth at low CO2 concentrations and are thought to function as part of a CO2-concentrating mechanism [ , ]. Polyhedral organelles, enterosomes, from non-autotrophic organisms are involved in coenzyme B12-dependent 1,2-propanediol utilisation (e.g., in Salmonella enterica [ ]) and ethanolamine utilisation (e.g., in Salmonella typhimurium []). Genes needed for enterosome formation are located in the 1,2-propanediol utilisation pdu [ , ] or ethanolamine utilisation eut [ , ] operons, respectively. 
Although enterosomes of non-autotrophic organisms are apparently related to carboxysomes structurally, a functional relationship is uncertain. A role in CO2 concentration, similar to that of the carboxysome, is unlikely since there is no known association between CO2 and coenzyme B12-dependent 1,2-propanediol or ethanolamine utilisation [ ]. It seems probable that enterosomes help protect the cells from reactive aldehyde species in the degradation pathways of 1,2-propanediol and ethanolamine []. The BMC domain fold consists of three α-helices (designated A, B, and C) and four β-strands ( ). Some instances of the BMC shell protein reveal a circular permutation in which a highly similar tertiary structure is built from secondary structure elements occurring in a different order. The secondary structure elements contributed by the C-terminal region of the typical BMC fold are instead contributed by the N-terminal region of the BMC circularly permuted domain ( ) [ , , ]. CcmK has a ferredoxin-like fold with an extra C-terminal helix and forms compact hexameric 'tiles' of hexagonal shape.
Protein Domain
Name: RNA-binding, CRM domain
Type: Domain
Description: The CRM domain is an ~100-amino acid RNA-binding domain. The name chloroplast RNA splicing and ribosome maturation (CRM) has been suggested to reflect the functions established for the four characterised members of the family: Zea mays (Maize) CRS1 ( ), CAF1 ( ) and CAF2 ( ) proteins and the Escherichia coli protein YhbY ( ). The CRM domain is found in eubacteria, archaea, and plants. The CRM domain is represented as a stand-alone protein in archaea and bacteria, and in single- and multi-domain proteins in plants. It has been suggested that prokaryotic CRM proteins existed as ribosome-associated proteins prior to the divergence of archaea and bacteria, and that they were co-opted in the plant lineage as RNA binding modules by incorporation into diverse protein contexts. Plant CRM domains are predicted to reside not only in the chloroplast, but also in the mitochondrion and the nucleo/cytoplasmic compartment. The diversity of the CRM domain family in plants suggests a diverse set of RNA targets [, ]. The CRM domain is a compact alpha/beta domain consisting of a four-stranded beta sheet and three alpha helices with an α-β-α-β-α-β-β topology. The beta sheet face is basic, consistent with a role in RNA binding. Proximal to the basic beta sheet face is another moiety that could contribute to nucleic acid recognition. Connecting strand beta1 and helix alpha2 is a loop with a six amino acid motif, GxxG flanked by large aliphatic residues, within which one 'x' is typically a basic residue [ ]. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants [ ]. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing []. 
In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes []. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome [].
Protein Domain
Name: Zinc finger, ClpX C4-type superfamily
Type: Homologous_superfamily
Description: The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type [ ]. This presumed zinc binding domain (ZBD) is found at the N terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. ZBD is a member of the treble clef zinc finger family, a motif known to facilitate protein-ligand, protein-DNA, and protein-protein interactions and forms a constitutive dimer that is essential for the degradation of some, but not all, ClpX substrates [].
Protein Domain
Name: Sulphite reductase [NADPH] flavoprotein, alpha chain, Proteobacteria
Type: Family
Description: Escherichia coli NADPH-sulphite reductase (SiR) is a multimeric hemoflavoprotein composed of eight alpha-subunits (SiR-FP) and four beta-subunits (SiR-HP) that catalyses the six electron reduction of sulphite to sulphide. This is one of several activities required for the biosynthesis of L-cysteine from sulphate. The alpha component of NADPH-sulphite reductase is a flavoprotein; the beta component is a hemoprotein [ ]. The flavoprotein component catalyses the electron flow from NADPH to FAD to FMN to the hemoprotein component. This entry describes a clade of NADPH-dependent sulphite reductase flavoprotein subunit alpha from Proteobacteria. The proteins bind one FAD and one FMN as prosthetic groups and contain an NADPH-binding domain [ ].
Protein Domain
Name: Thiamin/thiamin pyrophosphate ABC transporter, substrate-binding protein, Proteobacteria
Type: Family
Description: This entry represents thiB, which is a thiamine-binding protein and part of the ABC transporter complex ThiBPQ involved in thiamine import. ThiBPQ is required for transport of thiamine and thiamine pyrophosphate in Salmonella typhimurium [ ]. Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. Most of the bacterial ABC (ATP-binding cassette) importers are composed of one or two transmembrane permease proteins, one or two nucleotide-binding proteins and a highly specific periplasmic solute-binding protein. In Gram-negative bacteria the solute-binding proteins are dissolved in the periplasm, while in archaea and Gram-positive bacteria, their solute-binding proteins are membrane-anchored lipoproteins [, ].
Protein Domain
Name: IgG-blocking virulence domain
Type: Domain
Description: This entry represents a domain found in Mycoplasma and Ureaplasma proteins. Proteins containing this domain include protein M of Mycoplasma genitalium, MG_281, a virulence protein that binds the IgG light chain to block the binding of antibody to antigen. The crystal structure of the protein M antibody-binding region that includes this domain is solved (PDB|4NZR). Full-length homologues to MG_281 are known in a few other Mycoplasma species, but this domain shows distant homology to many additional proteins with a much wider distribution across the Mollicutes. Proteins in this entry also include paralogous families in some species, such as MCAP_0345, MCAP_0347, MCAP_0349, and MCAP_0351 in Mycoplasma capricolum [ ].
Protein Domain
Name: CHIP, U-box domain
Type: Domain
Description: This entry represents the RING-like U-box domain found in E3 ubiquitin-protein ligase CHIP. CHIP is a multifunctional protein that functions both as a co-chaperone and an E3 ubiquitin-protein ligase. It couples protein folding and proteasome mediated degradation by interacting with heat shock proteins (e.g. HSC70) and ubiquitinating their misfolded client proteins thereby targeting them for proteasomal degradation [ , ]. It is also important for cellular differentiation and survival (apoptosis), as well as susceptibility to stress. It targets a wide range of proteins, such as expanded ataxin-1, ataxin-3, huntingtin, and androgen receptor, which play roles in glucocorticoid response, tau degradation, and both p53 and cAMP signaling [, ].
Protein Domain
Name: FAM175 family, BRISC complex, Abro1 subunit
Type: Family
Description: Members of protein family FAM175 include the BRCA1-A complex subunit Abraxas 1 [ , ], BRISC complex subunit Abraxas 2 or Abro1 (Abraxas brother protein 1) [, ], and uncharacterised plant proteins. It is thought that BRCA1-A complex subunit Abraxas acts as a central scaffold protein responsible for assembling the various components of the BRCA1-A complex, and mediates recruitment of BRCA1 [ , ]. Similarly, Abro1 probably acts as a scaffold facilitating assembly of the various components of BRISC [] - the protein does not interact with BRCA1, but binds polyubiquitin []. The primary sequences of these proteins contain an MPN-like domain []. This entry represents the Abro1 protein.
Protein Domain
Name: FAM175 family, BRCA1-A complex, Abraxas 1 subunit
Type: Family
Description: Members of protein family FAM175 include the BRCA1-A complex subunit Abraxas 1 [ , ], BRISC complex subunit Abraxas 2 or Abro1 (Abraxas brother protein 1) [, ], and uncharacterised plant proteins. It is thought that BRCA1-A complex subunit Abraxas acts as a central scaffold protein responsible for assembling the various components of the BRCA1-A complex, and mediates recruitment of BRCA1 [ , ]. Similarly, Abro1 probably acts as a scaffold facilitating assembly of the various components of BRISC [] - the protein does not interact with BRCA1, but binds polyubiquitin []. The primary sequences of these proteins contain an MPN-like domain []. This entry represents the Abraxas 1 protein.
Protein Domain
Name: Bunyavirus nucleocapsid (N), C-terminal domain
Type: Homologous_superfamily
Description: Orthobunyaviruses are enveloped viruses with a genome consisting of 3 ssRNA segments (called L, M and S). The nucleocapsid protein (also known as nucleoprotein) is encoded on the small (S) genomic RNA. The N protein is the major component of the nucleocapsids. This protein is thought to interact with the L protein, virus RNA and/or other N proteins [ ]. This superfamily represents the C-terminal domain found in Bunyavirus nucleocapsid (N) protein. It is made up of 6 alpha helices. Some residues of this domain make up the 'C-arm', which makes intimate contact with core domains of neighbouring molecules to form a tightly-bound network throughout the crystal.
Protein Domain
Name: Ras-binding domain of Byr2
Type: Domain
Description: This entry represents the Ras binding/interacting domain of the Byr2 protein from Schizosaccharomyces pombe. The Ras binding domain (RBD) of Byr2 is necessary and sufficient for the protein to be translocated by Ras to the plasma membrane [ ]. This domain can also be found in Ste11 protein from Saccharomyces cerevisiae. Byr2 is a mitogen-activated protein/ERK kinase kinase (MEKK) that responds to pheromone signalling and controls mating through a mitogen-activated protein kinase (MAPK) pathway [ ]. The small GTP binding protein Ras binds and activates Byr2 []. Ste11 is a mitogen-activated protein kinase kinase kinase (MAPKKK) involved in the MAPK pathways governing mating, osmosensing, and filamentous growth [ ].
Protein Domain
Name: CAP domain
Type: Domain
Description: The cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) superfamily proteins are found in a wide range of organisms, including prokaryotes [ ] and non-vertebrate eukaryotes []. The nine subfamilies of the mammalian CAP superfamily include: the human glioma pathogenesis-related 1 (GLIPR1), Golgi associated pathogenesis related-1 (GAPR1) proteins, peptidase inhibitor 15 (PI15), peptidase inhibitor 16 (PI16), cysteine-rich secretory proteins (CRISPs), CRISP LCCL domain containing 1 (CRISPLD1), CRISP LCCL domain containing 2 (CRISPLD2), mannose receptor like and the R3H domain containing like proteins. Members are most often secreted and have an extracellular endocrine or paracrine function and are involved in processes including the regulation of extracellular matrix and branching morphogenesis, potentially as either proteases or protease inhibitors; in ion channel regulation in fertility; as tumour suppressor or pro-oncogenic genes in tissues including the prostate; and in cell-cell adhesion during fertilisation. The overall protein structural conservation within the CAP superfamily results in fundamentally similar functions for the CAP domain in all members, yet the diversity outside of this core region dramatically alters the target specificity and, thus, the biological consequences []. The Ca2+-chelating function [] would fit with the various signalling processes (e.g. the CRISP proteins) that members of this family are involved in, and also the sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It may also explain how a CAP family member blocks the Ca2+-transporting ryanodine receptors. This entry represents the CAP domain common to all members of the CAP superfamily. The CAP domain forms a unique 3 layer α-β-α fold with some, though not all, of the structural elements found in proteases [ ].
Protein Domain
Name: AIG1-type guanine nucleotide-binding (G) domain
Type: Domain
Description: This entry represents the AIG1-type G domain. The P-loop guanosine triphosphatases (GTPases) control a multitude of biological processes, ranging from cell division, cell cycling, and signal transduction, to ribosome assembly and protein synthesis. GTPases exert their control by interchanging between an inactive GDP-bound state and an active GTP-bound state, thereby acting as molecular switches. The common denominator of GTPases is the highly conserved guanine nucleotide-binding (G) domain that is responsible for binding and hydrolysis of guanine nucleotides. The TRAFAC (translation factor related) class AIG1/Toc34/Toc159-like paraseptin GTPase family contains the following subfamilies []:
  • The GTPases of immunity-associated protein (GIMAP)/immune-associated nucleotide-binding protein (IAN) subfamily is conserved among vertebrates and angiosperm plants and has been postulated to regulate apoptosis, particularly in context with diseases such as cancer, diabetes, and infections. The function of GIMAP/IAN GTPases has been linked to self defense in plants and to the development of T cells in vertebrates [, ].
  • Plant-specific Toc (translocon at the outer envelope membrane of chloroplasts) proteins. Toc proteins function as integral components of the chloroplast protein import machinery. The Toc translocon contains the two membrane-bound GTPases Toc33/34 and Toc159, which expose their G domains to the cytosol and recognise and then deliver precursor proteins through the translocation pore Toc75 [, ].
The GIMAP/IAN GTPases contain an avrRpt2 induced gene 1 (AIG1)-type G domain that exhibits the five motifs G1-G5 characteristic for GTP/GDP-binding proteins. In addition, the AIG1-type G domain contains a unique, highly conserved, hydrophobic motif between G3 and G4. It has a divergent version of the guanine recognition motif (G4) at the end of the core strand 5 and an additional helix alpha6 at the C terminus. The AIG1-type G domain contains a central β-sheet sandwiched by two layers of α-helices.
Protein Domain
Name: Transcription factor LuxR-like, autoinducer-binding domain
Type: Domain
Description: This domain binds N-acyl homoserine lactones (AHLs), which are also known as autoinducers. These are small, diffusible molecules used as communication signals in a large variety of proteobacteria. It is almost always found in association with the DNA-binding LuxR domain ( ). The autoinducer binding domain forms the N-terminal region of the protein, while the DNA-binding domain forms the C-terminal region. In most cases, binding of AHL by this N-terminal domain leads to unmasking of the DNA-binding domain, allowing it to bind DNA and activate transcription [ ]. In rare cases, LuxR proteins such as EsaR act as repressors []. In these proteins, binding of AHL to this domain leads to inactivation of the protein as a transcriptional regulator. A large number of processes have been shown to be regulated by LuxR proteins, including bioluminescence, production of virulence factors in plant and animal pathogens, antibiotic production and plasmid transfer. Structural studies of TraR from Agrobacterium tumefaciens [ , ] show that the functional protein is a homodimer. Binding of the cognate AHL is required for protein folding, resistance to proteases and dimerisation. The autoinducer binding domain binds its cognate AHL in an alpha/beta/alpha sandwich and provides an extensive dimerisation surface, though residues from the C-terminal region also make some contribution to dimerisation. The autoinducer binding domain is also required for interaction with RpoA, allowing transcription to occur []. Some proteins consist solely of the autoinducer binding domain. The function of these is not known, but TrlR from Agrobacterium has been shown to inhibit the activity of TraR by the formation of inactive heterodimers [ ].
Protein Domain
Name: ADP-ribosylation factor 1-5
Type: Family
Description: Arf GTPases are involved in the formation of coated carrier vesicles by recruiting coat proteins. This entry includes Arf1, Arf2, Arf3, Arf4, Arf5, and related proteins. Each contains an N-terminal myristoylated amphipathic helix that is folded into the protein in the GDP-bound state. GDP/GTP exchange exposes the helix, which anchors to the membrane. Following GTP hydrolysis, the helix dissociates from the membrane and folds back into the protein. A general feature of Arf1-5 signaling may be the cooperation of two Arfs at the same site. Arfs1-5 are generally considered to be interchangeable in function and location, but some specific functions have been assigned [ ]. Arf1 localizes to the early/cis-Golgi, where it is activated by GBF1 and recruits the coat protein COPI. It also localizes to the trans-Golgi network (TGN), where it is activated by BIG1/BIG2 and recruits the AP1, AP3, AP4, and GGA proteins [ ]. Humans, but not rodents and other lower eukaryotes, lack Arf2. Human Arf3 shares 96% sequence identity with Arf1 and is believed to generally function interchangeably with Arf1. Human Arf4 in the activated (GTP-bound) state has been shown to interact with the cytoplasmic domain of epidermal growth factor receptor (EGFR) and mediate the EGF-dependent activation of phospholipase D2 (PLD2), leading to activation of the activator protein 1 (AP-1) transcription factor [ ]. Arf4 has also been shown to recognise the C-terminal sorting signal of rhodopsin and regulate its incorporation into specialised post-Golgi rhodopsin transport carriers (RTCs) []. There is some evidence that Arf5 functions at the early-Golgi and the trans-Golgi to affect Golgi-associated alpha-adaptin homology Arf-binding proteins (GGAs) [].
Protein Domain
Name: Thrombospondin type-1 (TSP1) repeat superfamily
Type: Homologous_superfamily
Description: Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers [ ]. EGF-like domains, TSP3 repeats and the C-terminal domain are thus the hallmarks of a thrombospondin. This repeat was first described in 1986 by Lawler and Hynes [ ]. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9) [], as well as extracellular matrix proteins such as mindin, F-spondin [], SCO-spondin and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium [, ], are now known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis [] and apoptosis []. The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling [ ]. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat []. The TSP1 repeat structure has a disulfide-rich fold with all-beta sheets, each with three antiparallel strands.
Protein Domain
Name: Transcription factor LuxR-like, autoinducer-binding domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily binds N-acyl homoserine lactones (AHLs), which are also known as autoinducers. These are small, diffusible molecules used as communication signals in a large variety of proteobacteria. It is almost always found in association with the DNA-binding LuxR domain ( ). The autoinducer binding domain forms the N-terminal region of the protein, while the DNA-binding domain forms the C-terminal region. In most cases, binding of AHL by this N-terminal domain leads to unmasking of the DNA-binding domain, allowing it to bind DNA and activate transcription [ ]. In rare cases, LuxR proteins such as EsaR act as repressors []. In these proteins, binding of AHL to this domain leads to inactivation of the protein as a transcriptional regulator. A large number of processes have been shown to be regulated by LuxR proteins, including bioluminescence, production of virulence factors in plant and animal pathogens, antibiotic production and plasmid transfer. Structural studies of TraR from Agrobacterium tumefaciens [ , ] show that the functional protein is a homodimer. Binding of the cognate AHL is required for protein folding, resistance to proteases and dimerisation. The autoinducer binding domain binds its cognate AHL in an alpha/beta/alpha sandwich and provides an extensive dimerisation surface, though residues from the C-terminal region also make some contribution to dimerisation. The autoinducer binding domain is also required for interaction with RpoA, allowing transcription to occur []. Some proteins consist solely of the autoinducer binding domain. The function of these is not known, but TrlR from Agrobacterium has been shown to inhibit the activity of TraR by the formation of inactive heterodimers [ ].
Protein Domain
Name: Hemocyanin/hexamerin middle domain
Type: Domain
Description: Crustacean and cheliceratan hemocyanins (oxygen-transport proteins) and insect hexamerins (storage proteins) are homologous gene products, although the latter do not bind oxygen [ ]. Hemocyanins are found in the hemolymph of many invertebrates. They are divided into 2 main groups, arthropodan and molluscan. These have structurally similar oxygen-binding centres, which are similar to the oxygen-binding centre of tyrosinases, but their quaternary structures are arranged differently. The arthropodan proteins exist as hexamers comprising 3 heterogeneous subunits (a, b and c) and possess 1 oxygen-binding centre per subunit, whereas the molluscan proteins exist as cylindrical oligomers of 10 to 20 subunits and possess 7 or 8 oxygen-binding centres per subunit [ ]. Although the proteins have similar amino acid compositions, the only real similarity in their primary sequences is in the region corresponding to the second copper-binding domain, which also shows similarity to the copper-binding domain of tyrosinases. Hexamerins are proteins from the hemolymph of insects, which may serve as a store of amino acids for synthesis of adult proteins. They do not possess the copper-binding histidines present in hemocyanins [ ]. Homologues are also present in other kinds of organism, for example, Cyclopenase asqI from the fungus Emericella nidulans and Cyclopenase penL from Penicillium thymicola. AsqI is a tyrosinase involved in biosynthesis of the aspoquinolone mycotoxins, though its exact function is unknown [ ]. PenL is part of the gene cluster that mediates the biosynthesis of penigequinolones, potent insecticidal alkaloids that contain a highly modified 10-carbon prenyl group []. This entry represents the middle domain of hemocyanin and hexamerin proteins, which is involved in copper binding in hemocyanins.
Protein Domain
Name: Vitellinogen, beta-sheet shell
Type: Domain
Description: Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellinogen precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellinogen precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved [, ]. In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1, phosvitin, lipovitellin-2, and a von Willebrand factor type D domain (YGP40). Vitellinogens are post-translationally glycosylated and phosphorylated in the endoplasmic reticulum and Golgi complex of hepatocytes, before being secreted into the circulatory system to be taken up by oocytes. In the ovary, vitellinogens bind to specific Vtgr receptors on oocyte membranes to become internalised by endocytosis, where they are cleaved into yolk proteins by cathepsin D. YGP40 is released into the yolk plasma before or during compartmentation of the lipovitellin-phosvitin complex into the yolk granule. The different yolk proteins have distinct roles. Phosvitins are important in sequestering calcium, iron and other cations for the developing embryo. Phosvitins are among the most phosphorylated (10%) proteins in nature, the high concentration of phosphate groups providing efficient metal-binding sites in clusters [ , ]. Lipovitellins are involved in lipid and metal storage, and contain a heterogeneous mixture of about 16% (w/w) noncovalently bound lipid, most being phospholipid. Lipovitellin-1 contains two chains, LV1N and LV1C [, ]. This entry represents the β-sheet shell domain found in vitellinogen, which generally corresponds to the lipovitellin-2 peptide product. This domain consists of several large open β-sheets [ ]. It is often found C-terminal to and .
Protein Domain
Name: Vitellinogen, open beta-sheet
Type: Domain
Description: Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellinogen precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellinogen precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved [ , ]. In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1, phosvitin, lipovitellin-2, and a von Willebrand factor type D domain (YGP40). Vitellinogens are post-translationally glycosylated and phosphorylated in the endoplasmic reticulum and Golgi complex of hepatocytes, before being secreted into the circulatory system to be taken up by oocytes. In the ovary, vitellinogens bind to specific Vtgr receptors on oocyte membranes to become internalised by endocytosis, where they are cleaved into yolk proteins by cathepsin D. YGP40 is released into the yolk plasma before or during compartmentation of the lipovitellin-phosvitin complex into the yolk granule. The different yolk proteins have distinct roles. Phosvitins are important in sequestering calcium, iron and other cations for the developing embryo. Phosvitins are among the most phosphorylated (10%) proteins in nature, the high concentration of phosphate groups providing efficient metal-binding sites in clusters [ , ]. Lipovitellins are involved in lipid and metal storage, and contain a heterogeneous mixture of about 16% (w/w) noncovalently bound lipid, most being phospholipid. Lipovitellin-1 contains two chains, LV1N and LV1C [, ]. This entry represents the open β-sheet domain found in vitellinogen, which generally corresponds to a domain within the lipovitellin-1 peptide product. This domain adopts a structure consisting of several large open β-sheets [ ].
Protein Domain
Name: DM10 domain
Type: Domain
Description: This entry represents the DM10 domain, which consists of approximately 105 residues whose function is unknown. The DM10 domain has been identified in only two types of proteins: nucleoside diphosphate kinases (which contain a single copy of the DM10 domain) and an uncharacterised class of proteins (which contain multiple copies of DM10 domains). The nm23-H7 class of nucleoside diphosphate kinase (NDK7; ) consists of an N-terminal DM10 domain and two functional catalytic NDK domains [ , ]. In Chlamydomonas, this protein (termed p40) is tightly associated with the flagellar axoneme. The DM10 domain exists in multiple copies in an uncharacterised class of proteins that includes Chlamydomonas Rib72 and mammalian EF-hand domain-containing protein 1 and EF-hand domain-containing family member C2, EFHC1/2. Rib72 and EFHC1 each contain three repeated copies of the DM10 domain followed by a C-terminal domain containing two EF-hands that are predicted to bind calcium ions. Orthologous proteins are present in many other organisms that contain motile cilia, including mouse, rat, Ciona, sea urchin, Leishmania, and trypanosomes. However, no obvious Rib72/EFHC1 relative has been identified in higher plants, or in the Caenorhabditis elegans genome, which encodes only one DM10 domain-containing protein. In Chlamydomonas, and possibly mammals, DM10 domain-containing proteins are tightly bound to the flagellar doublet microtubules. This suggests that DM10 domains might act as flagellar NDK regulatory modules or as units specifically involved in axonemal targeting or assembly [, ]. DM10 domains have the same organisation of secondary structural elements, a pleckstrin homology (PH)-like fold, which includes seven beta strands, with a short 3-4 residue helix after the first strand, and a more extended alpha helical region at the C terminus [ , ].
Protein Domain
Name: CTF transcription factor/nuclear factor 1, conserved site
Type: Conserved_site
Description: Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) [ , , ] (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5'-TGGCANNNTGCCA-3'. This family was first described for its role in stimulating the initiation of adenovirus DNA replication []. In vertebrates there are four members (NFIA, NFIB, NFIC, and NFIX), and an orthologue from Caenorhabditis elegans has been described, called Nuclear factor I family protein (NFI-I) []. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication, and thus function in regulating cell proliferation and differentiation. They are involved in normal development and have been associated with developmental abnormalities and cancer in humans []. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors and it interacts with basal transcription factors and with histone H3 []. This entry represents a specific signature for this family of proteins, which includes the four vertebrate members NFIA, NFIB, NFIC and NFIX. The signature is a perfectly conserved, highly charged 12-residue peptide located in the DNA-binding domain of CTF/NF-I. It does not contain the four conserved Cys residues, which are required for its DNA-binding activity [ ].
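The palindromic consensus quoted in this description, 5'-TGGCANNNTGCCA-3', can be located in a nucleotide sequence with a simple pattern match. The following is only a minimal sketch under stated assumptions: the function names (`reverse_complement`, `find_nfi_sites`) are hypothetical, N is taken to mean any of A, C, G or T, and only exact matches to the consensus are reported, which is stricter than real binding-site prediction.

```python
import re

# Consensus from the description: 5'-TGGCANNNTGCCA-3' (N assumed to be any base).
# The lookahead lets overlapping matches be reported as well.
NFI_SITE = re.compile(r"(?=(TGGCA[ACGT]{3}TGCCA))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_nfi_sites(seq):
    """Return (0-based start, matched site) for each exact consensus match."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in NFI_SITE.finditer(seq)]
```

The two half-sites, TGGCA and TGCCA, are reverse complements of each other, which is what makes the site palindromic and is consistent with the proteins binding as dimers; `reverse_complement("TGGCA")` illustrates this.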
Protein Domain
Name: Paraneoplastic encephalomyelitis antigen
Type: Family
Description: Many eukaryotic proteins that are either known or thought to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids [, ]. This region has been found in, for example, heterogeneous nuclear ribonucleoproteins, small nuclear ribonucleoproteins, pre-RNA and mRNA associated proteins, Drosophila sex determination and elav proteins, human paraneoplastic encephalomyelitis antigen HuD, and many others. The structure of an RNA-binding domain of Drosophila Sex-lethal (Sxl) protein has been determined using multi-dimensional hetero-nuclear NMR []. Sxl contains two RNP consensus-type RNA-binding domains (RBDs); the determined structure represents the second of these (RBD-2) []. The calculated intermediate-resolution family of structures exhibits the β-α-β/β-α-β tertiary fold found in other RBD-containing proteins [ ].
Protein Domain
Name: RNA recognition motif domain, eukaryote
Type: Domain
Description: Many eukaryotic proteins that are known or supposed to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the eukaryotic putative RNA-binding region RNP-1 motif [ , ], or RNA recognition motif (RRM). RRMs are found in a variety of RNA binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an α/β sandwich, with a third helix present during RNA binding in some cases [].
Protein Domain
Name: Sulphate anion transporter, conserved site
Type: Conserved_site
Description: A number of proteins involved in the transport of sulphate across a membrane, as well as some as yet uncharacterised proteins, have been shown [, ] to be evolutionarily related. These proteins are:
  • Neurospora crassa sulphate permease II (gene cys-14).
  • Yeast sulphate permeases (genes SUL1 and SUL2).
  • Rat sulphate anion transporter 1 (SAT-1).
  • Mammalian DTDST, a probable sulphate transporter which, in humans, is involved in the genetic disease diastrophic dysplasia (DTD).
  • Sulphate transporters 1, 2 and 3 from the legume Stylosanthes hamata.
  • Human pendrin (gene PDS), which is involved in a number of hearing loss genetic diseases.
  • Human protein DRA (Down-Regulated in Adenoma).
  • Soybean early nodulin 70.
  • Escherichia coli hypothetical protein ychM.
  • Caenorhabditis elegans hypothetical protein F41D9.5.
These proteins are highly hydrophobic and seem to contain about 12 transmembrane domains.
Protein Domain
Name: SUI1 domain
Type: Domain
Description: In budding yeast (Saccharomyces cerevisiae), SUI1 is a translation initiation factor that functions in concert with eIF-2 and the initiator tRNA-Met in directing the ribosome to the proper start site of translation [ ]. SUI1 is a protein of 108 residues. Close homologues of SUI1 have been found [] in mammals, insects and plants. SUI1 is also evolutionarily related to:
  • Hypothetical proteins from bacteria such as Escherichia coli (yciH) or Haemophilus influenzae (HI1225).
  • Hypothetical proteins from archaea such as Methanococcus jannaschii (MJ0463).
Two eukaryotic proteins also seem to contain a C-terminal SUI1-like domain. These are:
  • Density-regulated protein (gene: DENR). This protein is found in mammals, insects, nematodes, plants and fungi.
  • Ligatin (gene: LGTN). This protein is found in mammals and insects.
Protein Domain
Name: Periplasmic solute binding protein, ZnuA-like
Type: Family
Description: Members of this family constitute the solute-binding protein component of cluster-9 ABC transporters. This family includes periplasmic solute binding protein TroA from Treponema pallidum, which binds Zn2+ and Mn2+ [ ]. In Streptococcus suis, TroA is required for manganese acquisition []. Related proteins are found in both Gram-positive and Gram-negative bacteria, including manganese-binding lipoprotein MntA [] and high-affinity zinc uptake system protein ZnuA []. ZnuA is part of the bacterial zinc-uptake complex ZnuABC. ZinT acts as a Zn2+-buffering protein that delivers Zn2+ to ZnuA, a high-affinity zinc-uptake protein. In Gram-negative bacteria the ZnuABC transporter system ensures an adequate import of zinc in Zn2+-poor environments, such as those encountered by pathogens within the infected host [ , ].
Protein Domain
Name: Intracellular chloride channel
Type: Family
Description: The chloride intracellular channel (CLIC) family represents a subgroup of the glutathione-S-transferase (GST) superfamily. CLIC proteins can exist as both soluble globular proteins and integral membrane proteins with ion channel function [ ]. Membrane insertion seems to be redox-regulated [] and has a strong pH dependence []. CLIC proteins are enigmatic, as they do not fit the patterns of other, well-characterised ion channel proteins that show clear and often multiple TM domains []. They share a conserved C-terminal CLIC module, with several members containing additional, unrelated N-terminal domains. CLIC proteins are highly conserved in vertebrates, which usually possess six distinct paralogues (CLIC1-CLIC6) []. They are multifunctional proteins and participate in a wide variety of signaling activities.
Protein Domain
Name: NBR1, PB1 domain
Type: Domain
Description: The PB1 domain is an essential part of the NBR1 (next to BRCA1) protein, a scaffold protein mediating specific protein-protein interactions with both titin protein kinase and another scaffold protein, p62 [ ]. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes, ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I, which contains an OPCA motif (an acidic amino acid cluster); type II, which contains a basic cluster; and type I/II, which contains both an OPCA motif and a basic cluster [ ]. The NBR1 protein contains a type I PB1 domain.
Protein Domain
Name: Calponin repeat
Type: Repeat
Description: Calponin [ , ] is a thin filament-associated protein that is implicated in the regulation and modulation of smooth muscle contraction. It is capable of binding to actin, calmodulin, troponin C and tropomyosin. The interaction of calponin with actin inhibits the actomyosin MgATPase activity. Calponin is a basic protein of approximately 34 kDa. Multiple isoforms are found in smooth muscles. Calponin contains three repeats of a well conserved 26 amino acid domain. Such a domain is also found in vertebrate smooth muscle protein (SM22 or transgelin), and a number of other proteins whose physiological role is not yet established, including Drosophila synchronous flight muscle protein SM20, Caenorhabditis elegans unc-87 protein [], rat neuronal protein NP25 [ ], and an Onchocerca volvulus antigen [].
Protein Domain
Name: SUI1 domain superfamily
Type: Homologous_superfamily
Description: In budding yeast (Saccharomyces cerevisiae), SUI1 is a translation initiation factor that functions in concert with eIF-2 and the initiator tRNA-Met in directing the ribosome to the proper start site of translation [ ]. SUI1 is a protein of 108 residues. Close homologues of SUI1 have been found [] in mammals, insects and plants. SUI1 is also evolutionarily related to:
  • Hypothetical proteins from bacteria such as Escherichia coli (yciH) or Haemophilus influenzae (HI1225).
  • Hypothetical proteins from archaea such as Methanococcus jannaschii (MJ0463).
Two eukaryotic proteins also seem to contain a C-terminal SUI1-like domain. These are:
  • Density-regulated protein (gene: DENR). This protein is found in mammals, insects, nematodes, plants and fungi.
  • Ligatin (gene: LGTN). This protein is found in mammals and insects.
Protein Domain
Name: FNBP1, SH3 domain
Type: Domain
Description: This entry represents the SH3 domain of FNBP1. Formin-binding protein 1 (FNBP1, also known as formin-binding protein 17) contains an N-terminal FER-CIP4 homology (FCH) domain and a C-terminal SH3 domain. It belongs to the CIP4 (Cdc42 interacting protein-4) subfamily of the F-BAR protein family. F-BAR proteins (F for FCH, Fer-CIP4 homology domain) are proteins with an extended CIP4-Fer domain. The F-BAR proteins have been implicated in cell membrane processes such as membrane invagination, tubulation and endocytosis [ ]. FNBP1 was originally isolated as a molecule that binds to the proline-rich region of formin []. It induces tubular membrane invaginations and participates in endocytosis []. It interacts with the sorting nexin SNX2 and is linked to acute myelogenous leukemia [].
Protein Domain
Name: BamE-like
Type: Homologous_superfamily
Description: The fold of this superfamily, tandemly repeated in beta-lactamase-inhibitor BLIP, has an alpha(2)-beta(4) structure [ ]. It is also found in osmotically-inducible lipoprotein OsmE, which has no known function [], and in outer membrane protein assembly factor BamE. BamE is part of the outer membrane protein (OMP) assembly Bam complex (composed of the outer membrane protein BamA and four lipoproteins, BamB, BamC, BamD and BamE), which is involved in assembly and insertion of β-barrel proteins into the outer membrane [ , , , , ]. E. coli BamE is a nonessential member of the complex that stabilizes the interaction between the essential proteins BamA and BamD. It may modulate the conformation of BamA, likely through interactions with BamD [].
Protein Domain
Name: Peptidase C19, ubiquitin carboxyl-terminal hydrolase
Type: Domain
Description: Ubiquitin carboxyl-terminal hydrolases (UCH) ( ) [ ] are thiol proteases that recognise and hydrolyse the peptide bond at the C-terminal glycine of ubiquitin. These enzymes are involved in the processing of poly-ubiquitin precursors as well as that of ubiquitinated proteins. The deubiquitinating proteases can be split into 2 size ranges, 20-30 kDa ( ) and 100-200 kDa [ ]: the second class consists of large proteins (800 to 2000 residues) that belong to the peptidase family C19, and this group is currently represented by yeast UBP1 []. UCH thiol proteases contain an N-terminal catalytic domain sometimes followed by C-terminal extensions that mediate protein-protein interactions [ ]. This entry represents the catalytic domain of UCH proteins of the UBP1 group. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole.
Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [ ]. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys.
This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices, with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Aconitase B, swivel
Type: Domain
Description: Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins, which function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also two forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential, function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA, have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, and a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Eukaryotic cAcn balances the amounts of citrate and isocitrate in the cytoplasm, which in turn balances the amount of NADPH generated from isocitrate by isocitrate dehydrogenase against the amount of acetyl-CoA generated from citrate by citrate lyase. Fatty acid synthesis requires both NADPH and acetyl-CoA, as do other metabolic processes, including the need for NADPH to combat oxidative stress. 
The enzymatic form of cAcn predominates when iron levels are normal, but if they drop sufficiently to cause the disassembly of the [4Fe-4S]-cluster, then cAcn undergoes a conformational change from a compact enzyme to a more open L-shaped protein known as iron regulatory protein 1 (IRP1; or IRE-binding protein 1, IREBP1). As IRP1, the catalytic site and the [4Fe-4S]-cluster are lost, and two new RNA-binding sites appear. IRP1 functions in the post-transcriptional regulation of genes involved in iron metabolism: it binds to mRNA iron-responsive elements (IRE), 30-nucleotide stem-loop structures at the 3' or 5' end of specific transcripts. Transcripts containing an IRE include ferritin L and H subunits (iron storage), transferrin (iron plasma chaperone), transferrin receptor (iron uptake into cells), ferroportin (iron exporter), mAcn, succinate dehydrogenase, erythroid aminolevulinic acid synthetase (tetrapyrrole biosynthesis), among others. If the IRE is in the 5'-UTR of the transcript (e.g. in ferritin mRNA), then IRP1-binding prevents its translation by blocking the transcript from binding to the ribosome. If the IRE is in the 3'-UTR of the transcript (e.g. transferrin receptor), then IRP1-binding protects it from endonuclease degradation, thereby prolonging the half-life of the transcript and enabling it to be translated. IRP2 is another IRE-binding protein that binds to the same transcripts as IRP1. However, since IRP1 is predominantly in the enzymatic cAcn form, it is IRP2 that acts as the major metabolic regulator that maintains iron homeostasis. Although IRP2 is homologous to IRP1, IRP2 lacks aconitase activity, and is known only to have a single function in the post-transcriptional regulation of iron metabolism genes. In iron-replete cells, IRP2 activity is regulated primarily by iron-dependent degradation through the ubiquitin-proteasomal system. Bacterial AcnB is also known to be multi-functional. 
In addition to its role in the TCA cycle, AcnB was shown to be a post-transcriptional regulator of gene expression in Escherichia coli and Salmonella enterica. In S. enterica, AcnB initiates a regulatory cascade controlling flagella biosynthesis through an interaction with the ftsH transcript, which encodes the FtsH protease. This binding lowers the intracellular concentration of FtsH protease, which in turn increases the amount of the alternative RNA polymerase sigma factor sigma32 (normally degraded by FtsH protease); sigma32 then increases the synthesis of the chaperone DnaK, which in turn promotes the synthesis of the flagellar protein FliC. AcnB regulates the synthesis of other proteins as well, such as superoxide dismutase (SodA) and other enzymes involved in oxidative stress. This entry represents the 'swivel' domain of bacterial aconitase B (AcnB) that is located in the N-terminal region following a HEAT-like domain. HEAT-like domains are usually implicated in protein-protein interactions, while the 'swivel' domain is usually a mobile unit in proteins that carry it. In AcnB, this N-terminal region was shown to be sufficient for dimerisation and for AcnB binding to mRNA. An iron-mediated dimerisation mechanism may be responsible for switching AcnB between its catalytic and regulatory roles, as dimerisation requires iron while mRNA binding is inhibited by iron.
Protein Domain
Name: Calycin
Type: Homologous_superfamily
Description: Calycins form a large protein superfamily whose members share similar β-barrel structures. Calycins can be divided into families that include lipocalins, fatty acid binding proteins, triabin, and thrombin inhibitor. Of these families, the lipocalin family is the largest and functionally the most diverse. Lipocalins are extracellular proteins that share several common recognition properties such as ligand binding, receptor binding and the formation of complexes with other macromolecules. Lipocalins include the retinol binding protein, lipocalin allergen, aphrodisin (a sex hormone), alpha-2U-globulin, prostaglandin D synthase, beta-lactoglobulin, bilin-binding protein, and the nitrophorins. Bacterial hypothetical proteins YodA from Escherichia coli and YwiB from Bacillus subtilis share a similar calycin β-barrel structure. Part of the YodA hypothetical protein has a calycin-like structure.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom