Search results 13701 to 13800 out of 30763 for seed protein

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: STAM2, SH3 domain
Type: Domain
Description: This entry represents the SH3 domain of STAM2 (signal transducing adapter molecule 2). STAM2, a subunit of the ESCRT-0 complex, is an endosomal protein acting as a regulator of receptor signaling and trafficking. It contains an N-terminal VHS domain (Vps-27/Hrs/Stam), UIM (ubiquitin-interacting motif), SH3 (Src homology 3), CC ("coiled coil"), SSM (STAM-specific motif) and C-terminal ITAM (immunoreceptor tyrosine-based activation motif) domain. It may play a regulatory role in the endosomal sorting of ubiquitinated membrane proteins [ ]. STAMs were discovered as proteins that are highly phosphorylated following cytokine and growth factor stimulation [ ]. They function in cytokine signaling and surface receptor degradation, as well as regulate Golgi morphology. They associate with many proteins including Jak2 and Jak3 tyrosine kinases [], Hrs, AMSH, and UBPY. STAM adaptor proteins contain VHS (Vps27, Hrs, STAM homology), ubiquitin interacting (UIM), and SH3 domains [].
Protein Domain
Name: Claudin-18
Type: Family
Description: Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin 1-24) have been identified. Their ability to polymerise and form strands is affected by the cell types [ , , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as the scaffolding proteins ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as the cell adhesion proteins EpCam and tetraspanins and the signaling proteins ephrin A and B and their receptors, EphA and EphB []. Human and mouse isoforms of claudin-18 have been cloned. Claudin-18 shares ~22-40% overall similarity with other claudin family members at the amino acid level, displaying highest similarity to claudin-1.
Protein Domain
Name: OX-2 membrane glycoprotein-like
Type: Family
Description: This entry represents OX-2 membrane glycoprotein (OX2G) from humans and other vertebrates and its viral homologues, including OX-2 membrane glycoprotein homologue (OX2V) from Human herpesvirus 8 (HHV-8). Members of this protein family have two immunoglobulin domains. OX2G, also known as CD200, co-stimulates T-cell proliferation. It may regulate myeloid cell activity in a variety of tissues [ , , ]. It is involved in regulation of macrophage function []. Interaction of CD200 with its receptor acts to suppress inflammatory responses [, , , ]. This protein has been shown to play important roles in controlling autoimmunity, inflammation, the development and spread of cancer, hypersensitivity, and spontaneous fetal loss [ ]. OX2V is encoded by herpesviruses and has significant homology with cellular OX2. Viral OX2 encodes a glycosylated cell surface protein which has been shown to target myeloid-lineage cells [ ].
Protein Domain
Name: BRCA1-associated
Type: Family
Description: This entry includes animal BRCA1 (BREAST CANCER SUSCEPTIBILITY 1) proteins and their homologues from plants [ ]. The animal BRCA1 protein is an E3 ubiquitin-protein ligase that specifically mediates the formation of 'Lys-6'-linked polyubiquitin chains and plays a central role in DNA repair by facilitating cellular responses to DNA damage [ , ]. It contains an N-terminal zinc-finger domain. It also contains a BRCT C-terminal domain, an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain []. Arabidopsis BRCA1-related proteins include AtBRCA1 (AT4G21070) and AtBARD1 (AT1G04020). Both are involved in DNA repair in plants [ , ]. AtBRCA1 can be induced by gamma-rays []. AtBARD1, also known as Row1, functions mainly as a REPRESSOR OF WUSCHEL1, which is a transcription repressor required to regulate the maintenance of stem cell populations in shoot meristems [, ].
Protein Domain
Name: Claudin-15
Type: Family
Description: Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin 1-24) have been identified. Their ability to polymerise and form strands is affected by the cell types [ , , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as the scaffolding proteins ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as the cell adhesion proteins EpCam and tetraspanins and the signaling proteins ephrin A and B and their receptors, EphA and EphB []. Human and mouse isoforms of claudin-15 have been cloned. Claudin-15 shares ~25-45% overall similarity with other claudin family members at the amino acid level, displaying highest similarity to claudin-10.
Protein Domain
Name: UPF0234, N-terminal
Type: Homologous_superfamily
Description: This superfamily represents the N-terminal domain of UPF0234 uncharacterised proteins, which includes YajQ. In Pseudomonas syringae, YajQ functions as a host protein involved in the temporal control of bacteriophage Phi6 gene transcription. It has been shown to bind to the phage's major structural core protein P1, most likely activating transcription by acting indirectly on the RNA polymerase. YajQ may remain bound to the phage particles throughout the infection period [ , ]. Earlier, YajQ was characterized as a putative nucleic acid-binding protein based on the similarity of its (ferredoxin-like) three-dimensional topology with that of RNP-like RNA-binding domains [, ]. The polypeptide chain of YajQ is folded into two domains with identical folding topology. Each domain has a four-stranded antiparallel β-sheet flanked on one side by two α-helices. This structural motif is a characteristic feature of many RNA-binding proteins [ ].
Protein Domain
Name: Effector-associated domain 10
Type: Domain
Description: This entry represents the effector-associated domain 10 (EAD10) found in cyanobacteria with a predicted mixed alpha+beta character [ ]. Effector-associated domains (EADs) are predicted to function as adaptor domains mediating protein-protein interactions. The EADs show a characteristic architectural pattern. One copy is always fused, typically at the N- or C-terminus, to a core component of a biological conflict system; examples include VMAP (vWA-MoxR associated protein), iSTAND (inactive STAND NTPase system), or GAP1 (GTPase-associated protein 1). Further copies of the same EAD are fused to either effector or signal-transducing domains, or additional EADs. EAD pairs are frequently observed together on the genome in conserved gene neighborhoods, but can also be severed from such neighborhoods and located in distant regions, indicating EAD-EAD protein domain coupling approximates the advantages of collinear transcription [ , ]. EADs are all small domains with no enzymatic features.
Protein Domain
Name: ARMET, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of ARMET. ARMET, also known as mesencephalic astrocyte-derived neurotrophic factor (MANF) or arginine-rich protein, is a small protein of approximately 170 residues which contains four disulphide bridges that are highly conserved from nematodes to humans. It is a soluble protein resident in the endoplasmic reticulum and induced by ER stress. It appears to be involved in dealing with misfolded proteins in the ER, and thus in quality control during ER stress [ ]. ARMET from Rattus norvegicus (Rat) selectively promotes the survival of dopaminergic neurons of the ventral mid-brain. It modulates GABAergic transmission to the dopaminergic neurons of the substantia nigra, and enhances spontaneous, as well as evoked, GABAergic inhibitory postsynaptic currents in dopaminergic neurons []. Proteins containing this domain include the related neurotrophic factor CDNF (cerebral dopamine neurotrophic factor) [ ].
Protein Domain
Name: Fimbrial membrane usher, conserved site
Type: Conserved_site
Description: In Gram-negative bacteria the biogenesis of fimbriae (or pili) requires a two-component assembly and transport system which is composed of a periplasmic chaperone and an outer membrane protein which has been termed a molecular 'usher' [, , ]. The usher protein is rather large (from 86 to 100 kDa) and seems to be mainly composed of membrane-spanning β-sheets, a structure reminiscent of porins. Although the degree of sequence similarity of these proteins is not very high, they share a number of characteristics. One of these is the presence of two pairs of cysteines, the first one located in the N-terminal part and the second at the C-terminal extremity, that are probably involved in disulphide bonds. The best conserved region is located in the central part of these proteins. This entry represents the conserved site of the fimbrial biogenesis outer membrane usher protein.
Protein Domain
Name: Uncharacterized cupredoxin-like protein, Cyanobacteria-type
Type: Domain
Description: This entry represents a group of uncharacterised proteins from Cyanobacteria that share protein sequence similarity with cupredoxin. Cupredoxins are called blue copper proteins because they have an intense blue color due to the presence of a mononuclear type 1 (T1) copper site. Structurally, the cupredoxin-like fold consists of a β-sandwich with 7 strands in 2 β-sheets, which is arranged in a Greek-key β-barrel. Some of these proteins have lost the ability to bind copper. The majority of family members contain multiple cupredoxin domain repeats: ceruloplasmin and coagulation factors V/VIII have six repeats; laccase, ascorbate oxidase, spore coat protein A and multicopper oxidase CueO contain three repeats; and nitrite reductase has two repeats. Others are mono-domain cupredoxins, such as plastocyanin, pseudoazurin, plantacyanin, azurin, rusticyanin, stellacyanin, quinol oxidase and the periplasmic domain of cytochrome c oxidase subunit II [ , ].
Protein Domain
Name: JAB domain, prokaryotic
Type: Domain
Description: This entry represents the JAB domain in prokaryotes. The domain is widely found in bacteria, archaea and phages. Its function is still not clear. However, in eukaryotes, the JAB domain has been found in metalloenzymes that function as the ubiquitin isopeptidase/deubiquitinase in the ubiquitin-based signaling and protein turnover pathways [ ]. Prokaryotic proteins containing JAB domains are predicted to have a similar role in their cognates of the ubiquitin modification pathway []. A distinct family, the RadC-like JAB domains, is widespread in bacteria and is predicted to function as nucleases []. In halophilic archaea, the JAB domain shows strong gene-neighbourhood associations with a nucleotidyltransferase, suggesting a role in nucleotide metabolism []. The archaeal (H. volcanii) JAB (also known as JAMM) domain-containing protein, HvJAMM1, has been characterised [ ]. It cleaves ubiquitin-like small archaeal modifier proteins (SAMP1/2) from protein conjugates [].
Protein Domain
Name: CFAP53/TCHP
Type: Family
Description: This entry includes CFAP53 and TCHP. Cilia- and flagella-associated protein 53 (CFAP53), also known as coiled-coil domain-containing protein 11 (CCDC11), is a novel centriolar satellite protein essential for the assembly and function of motile cilia and establishment of left-right asymmetry [ , ]. In zebrafish, it is required for cilia rotation specifically in Kupffer's vesicle, the zebrafish laterality organ []. Trichoplein keratin filament-binding protein (TCHP) may act as a 'capping' or 'branching' protein for keratin filaments in the cell periphery. It may regulate K8/K18 filament and desmosome organisation mainly at the apical or peripheral regions of simple epithelial cells [ ]. In humans, it acts as a tumor suppressor which has the ability to inhibit cell growth and be pro-apoptotic during cell stress. It inhibits cell growth in bladder and prostate cancer cells by down-regulating HSPB1 through inhibition of its phosphorylation [].
Protein Domain
Name: Niban-like
Type: Family
Description: The Niban-like family is also known as family with sequence similarity 129 (FAM129). This family consists of Niban (FAM129A), Niban-like protein 1 (FAM129B or MINERVA) and Niban-like protein 2 (FAM129C) [ ]. Overexpression of Niban (FAM129A) has been detected in patients with many types of cancer, including thyroid, head and neck, renal, and liver cancer. Niban is highly expressed in the early stages of cancer development and remains overexpressed throughout the cancer progression [ , , , ]. It has been suggested that Niban might be involved in the ER stress response and can modulate cell death signaling by regulating translation []. Niban-like protein 1 was suggested to play a role in apoptosis suppression in cancer cells [ ]. Niban-like protein 2 is a B-cell membrane protein that is overexpressed in chronic lymphocytic leukemia [ ].
Protein Domain
Name: Epithelial sodium channel, chordates
Type: Family
Description: The epithelial Na+ channel (ENaC) proteins consist of sodium channels from animals and have no recognizable homologues in other eukaryotes or bacteria. The vertebrate ENaC proteins from epithelial cells cluster tightly together on the phylogenetic tree; voltage-insensitive ENaC homologues are also found in the brain. Eleven sequenced Caenorhabditis elegans proteins, including the degenerins, are distantly related to the vertebrate proteins as well as to each other. At least some of the proteins in this group form part of a mechano-transducing complex for touch sensitivity. Others include the acid-sensing ion channels, ASIC1-3, which are homo- or hetero-oligomeric neuronal H+-gated channels that mediate pain sensation in response to tissue acidosis. Mammalian ENaC is important for the maintenance of Na+ balance and the regulation of blood pressure. Three homologous ENaC subunits, α, β and γ, have been shown to assemble to form the highly Na+-selective channel.
Protein Domain
Name: Transcription regulator HTH, HdfR
Type: Family
Description: Numerous bacterial transcription regulatory proteins bind DNA via a helix-turn-helix (HTH) motif. These proteins are very diverse, but for convenience may be grouped into subfamilies on the basis of sequence similarity. One such family, the lysR family, groups together a range of proteins, including AmpR, CatM, CatR, CynR, CysB, GltC, IlvY, IrgB, LysR, MetR, NhaR, SyrM, TcbR, TfdS and TrpI [ , , , , ]. The majority of these proteins appear to be transcription activators and most are known to negatively regulate their own expression. All possess a potential HTH DNA-binding motif towards their N-terminal end. The hdfR gene encodes a LysR family protein. HdfR is able to bind to the flhDC promoter, indicating that HdfR is a transcriptional regulator for the flagellar master operon. Furthermore, the expression of the hdfR gene was shown to be negatively regulated by H-NS [ ].
Protein Domain
Name: Histone H3-K79 methyltransferase
Type: Family
Description: The enzymes that catalyse histone methylation are classified into 3 families: the PRMT (protein arginine N-methyltransferase) family, the SET-domain-containing family, and the non-SET domain proteins. Lysine 79 (K79) of H3 is methylated by a methyltransferase represented in this entry that lacks a SET domain [ ]. H3K79 methylation occurs in a variety of organisms ranging from yeast to human. In budding yeast, K79 methylation is mediated by the silencing protein Dot1 [ ]. Dot1 homologues can be found in a variety of eukaryotic organisms; in mammals the homologous protein is called Dot1-like protein (DOT1L) [, ]. Dot1/DOT1L catalyses the sequential mono-, di-, and tri-methylation of H3K79, while SET domain-containing methyltransferases mediate methylation in a processive manner []. K79 methylation level is regulated throughout the cell cycle and plays a critical role in the progression of G1 phase, S phase, mitosis, and meiosis [].
Protein Domain
Name: Derlin
Type: Family
Description: The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae (Baker's yeast) contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process, and the classes were called der for degradation in the ER. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins, suggesting that the function of the Der1 protein may be specifically required for the degradation process associated with the ER [ ]. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases. This family may also mediate degradation of misfolded proteins.
Protein Domain
Name: Trm112-like
Type: Family
Description: Trm112 is required for tRNA methylation in Saccharomyces cerevisiae (Baker's yeast) and is found in complexes with 2 tRNA methylases (Trm9 and Trm11) and also with the putative methyltransferase Ydr140w [ ]. Trm112 from S. cerevisiae (Ynr046w) is plurifunctional and a component of the eRF1 methyltransferase []. The crystal structure of Ynr046w has been determined to 1.7 Å resolution. It comprises a zinc-binding domain built from both the N- and C-terminal sequences and an inserted domain, absent from bacterial and archaeal orthologs of the protein, composed of three α-helices []. Trm112 has also been described in archaea (UPF0434 protein from Haloferax volcanii). UPF0434 proteins are found both in bacteria and archaea, and the study of interacting partners from the H. volcanii member appears to indicate that Trm112 is a general partner for methyltransferases in all organisms [ ]. This entry also includes mitochondrial protein preY, an uncharacterized protein from vertebrates.
Protein Domain
Name: Endonuclease MutS2
Type: Family
Description: The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts [ ]. This entry represents endonuclease MutS2. MutS2 is a paralogue of MutS and is not involved in DNA mismatch repair but in the suppression of homologous recombination [ , , , ]. It may therefore have a key role in the control of bacterial genetic diversity.
Protein Domain
Name: rRNA 2'-O-methyltransferase fibrillarin-like
Type: Family
Description: rRNA 2'-O-methyltransferase fibrillarin (Fibrillarin) is a component of a nucleolar small nuclear ribonucleoprotein (SnRNP) [ , ]. It is an S-adenosyl-L-methionine-dependent methyltransferase that has the ability to methylate both RNAs and proteins. Site specificity is provided by a guide RNA that base pairs with the substrate, and methylation occurs at a characteristic distance from the sequence involved in base pairing with the guide RNA [, , , , ]. Fibrillarin is associated with U3, U8 and U13 small nuclear RNAs in mammals [] and is similar to the yeast NOP1 protein []. It has a well conserved sequence of around 320 amino acids, and contains 3 domains: an N-terminal Gly/Arg-rich region; a central domain resembling other RNA-binding proteins and containing an RNP-2-like consensus sequence; and a C-terminal α-helical domain. An evolutionarily related pre-rRNA processing protein, which lacks the Gly/Arg-rich domain, has been found in various archaebacteria.
Protein Domain
Name: PLAT/LH2 domain
Type: Domain
Description: This entry represents a domain found in a variety of membrane- or lipid-associated proteins. It is known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain. Structurally, this domain forms a β-sandwich composed of two sheets of four strands each [ , , ]. The most highly conserved regions coincide with the β-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth β-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.
Protein Domain
Name: CXC domain
Type: Domain
Description: Polycomb group (Pc-G) proteins ensure the stable inheritance of expression patterns through cell division and regulate the control of cell proliferation. The Enhancer of zeste (E(z))-type of Pc-G proteins includes: Drosophila melanogaster Enhancer of zeste E(z); mammalian ENX-1 and ENX-2; Arabidopsis thaliana CURLY LEAF (CLF), a transcriptional repressor of floral homeotic gene AGAMOUS; Arabidopsis thaliana CLF-like; Arabidopsis thaliana MEDEA (MEA), a suppressor of endosperm development; and Arabidopsis thaliana EZA1. These proteins contain a SET domain at the C terminus. Unique to them is the presence of a CXC domain, an ~65-residue cys-rich region preceding the SET domain. The spacing of 17 cysteines is conserved. The CXC domain contains three units of the C-X(6)-C-X(3)-C-X-C motif, although the most C-terminal unit is reverse-oriented. Because of its evolutionary conservation, the CXC domain is likely to be involved in an important function of E(z)-related proteins [, , ]. The CXC domain shows some similarity to the CRC domain found in the tesmin/TSO1 protein family [].
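Note: The C-X(6)-C-X(3)-C-X-C unit described in this entry can be read as a simple sequence pattern. The following is a minimal sketch, assuming X stands for any single residue and X(n) for exactly n residues. The function name and the peptide are made up for illustration; the peptide is not a real E(z) sequence.

```python
import re

# One CXC unit as written in the entry: C-X(6)-C-X(3)-C-X-C.
# Assumption: X is any single residue and X(n) means exactly n residues.
CXC_UNIT = re.compile(r"C.{6}C.{3}C.C")


def find_cxc_units(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping unit."""
    return [(m.start(), m.group()) for m in CXC_UNIT.finditer(sequence)]


# Hypothetical peptide, invented for this example only.
toy = "MKCAAAAAACGGGCSCLL"
hits = find_cxc_units(toy)
print(hits)
```

This sketch only searches for forward-oriented units. If "reverse-oriented" means the mirrored spacing, the most C-terminal unit would need a separate pattern such as C.C.{3}C.{6}C, which is also an assumption.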
Protein Domain
Name: Tubulin/FtsZ, GTPase domain
Type: Domain
Description: This entry represents a GTPase domain found in all tubulin chains, such as tubulin alpha, beta and gamma chains, plant ARC3 and prokaryotic FtsZ and CetZ proteins [ , ]. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ (homologue of eukaryotic tubulin) is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells [, ]. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. CetZ co-exists with FtsZ in many archaea. CetZ does not affect cell division; instead, it is involved in cell shape control []. Arabidopsis chloroplast protein ARC3 (At1g75010) is a Z-ring accessory protein involved in the initiation of plastid division and division site placement [, ].
Protein Domain
Name: Dynamin
Type: Family
Description: Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily [] that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins [, , , ], and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance. The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains: the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.
Protein Domain
Name: Harbinger transposase-derived nuclease, animal
Type: Family
Description: This entry represents the Harbinger transposase-derived proteins mostly from animals. Proteins in this family may have nuclease activity, but do not appear to have transposase activity [ , ]. Harbinger DNA transposons have been identified in protists, plants, insects, worms, and vertebrates. However, mammals do not have Harbinger transposons. In humans, no recognisable members of the Harbinger transposase superfamily are found. Instead, a widely expressed HARBI1 gene encoding a 350-amino acid protein derived from a Harbinger transposase has been identified []. The HARBI1 protein is conserved in humans, rats, mice, cows, pigs, chickens, frogs and various bony fish []. Conserved motifs found in the Harbinger transposases, which are expected to be catalytic centres of nuclease/ligase reactions necessary for transposition, are also well preserved in the HARBI1 proteins []. It was also proposed that these hypothetical HARBI1 nucleases are characterised by a strong DNA-target specificity [].
Protein Domain
Name: Outer membrane protein, OmpA
Type: Family
Description: The OmpA outer membrane proteins in this group all contain the OmpA-like transmembrane domain at the N terminus and the conserved bacterial outer membrane protein domain at the C terminus. The outer membrane protein A of Escherichia coli (OmpA) is one of the most studied proteins in this group [ ]. It has a multifunctional role. OmpA is required for the action of colicins K and L and for the stabilisation of mating aggregates in conjugation. OmpA may be involved in the maintenance of the position of the peptidoglycan cell wall in the periplasm by non-covalent interaction with TolR []. These proteins are implicated as a secondary receptor for a number of T-even like phages; for example, phage Sf6 infection requires both lipopolysaccharide and OmpA []. In addition, OmpA can act as a porin with low permeability that allows slow penetration of small solutes [].
Protein Domain
Name: Acyl-protein synthetase, LuxE, bacterial
Type: Family
Description: LuxE is an acyl-protein synthetase found in bioluminescent bacteria. LuxE catalyses the formation of an acyl-protein thiolester from a fatty acid and a protein. This is the second step in the bioluminescent fatty acid reduction system, which converts tetradecanoic acid to the aldehyde substrate of the luciferase-catalysed bioluminescence reaction [ ]. A conserved cysteine found at position 364 in Photobacterium phosphoreum LuxE is thought to be acylated during the transfer of the acyl group from the synthetase subunit to the reductase. The C-terminal region of the synthetase is thought to act as a flexible arm to transfer acyl groups between the sites of activation and reduction [ ]. A LuxE domain is also found in the Vibrio cholerae RBFN protein, which is involved in the biosynthesis of the O-antigen component 3-deoxy-L-glycero-tetronic acid. This group represents an acyl-protein synthetase, LuxE type, found in bacteria.
Protein Domain
Name: SASP, alpha/beta-type superfamily
Type: Homologous_superfamily
Description: Small, acid-soluble spore proteins (SASP or ASSP) are proteins bound to the spore DNA of bacteria of the genera Bacillus, Thermoactinomyces, and Clostridium [ , ]. They are double-stranded DNA-binding proteins that cause DNA to change to an A-like conformation. They protect the DNA backbone from chemical and enzymatic cleavage and are thus involved in the dormant spore's high resistance to UV light. SASP are degraded in the first minutes of spore germination and provide amino acids for both new protein synthesis and metabolism. There are two distinct families of SASP: the alpha/beta type and the gamma type. Alpha/beta SASP are small proteins of about sixty to seventy amino acid residues that are generally coded by a multigene family. The N terminus of alpha/beta SASP contains the site which is cleaved by a SASP-specific protease that acts during germination, while the C terminus is probably involved in DNA-binding.
Protein Domain
Name: Floricaula/leafy, C-terminal domain superfamily
Type: Homologous_superfamily
Description: This domain is found in various plant development proteins which are homologues of floricaula (FLO) and Leafy (LFY) proteins, which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development. LFY is a plant-specific transcription factor (TF) essential for flower development. It is one of the few master regulators of flower development, as it integrates environmental and endogenous signals to orchestrate the whole floral network. Transcription factors such as LFY recognize short DNA motifs primarily through their DNA-binding domain. Upon binding to short stretches of DNA called cis-elements or TF binding sites (TFBS), they regulate gene expression. This entry represents the DNA-binding domain (DBD) found at the C terminus of LFY proteins in plants. Structure-function studies have demonstrated that LFY binds semi-palindromic 19-bp DNA elements through its highly conserved C-terminal DBD, a unique helix-turn-helix fold that by itself dimerizes on DNA [ ].
Protein Domain
Name: Arterivirus nucleocapsid
Type: Family
Description: Arteriviruses are ssRNA positive-strand viruses with no DNA stage in their replication cycle. This family contains the viral nucleocapsid protein, which encapsidates the viral ssRNA. Porcine reproductive and respiratory syndrome virus (PRRSV) is the causative agent of both severe and persistent respiratory disease and reproductive failure in pigs worldwide. The PRRSV virion contains a core made of the 123 amino acid nucleocapsid (N or VP1) protein, a product of the ORF7 gene. The crystal structure of the capsid-forming domain of the nucleocapsid protein has been determined to 2.6 Å resolution. The protein exists as a tight dimer forming a four-stranded beta sheet floor superposed by two long alpha helices and flanked by two N- and two C-terminal alpha helices. The structure represents a new class of viral capsid-forming domains, distinctly different from those of other known enveloped viruses, but reminiscent of the coat protein of bacteriophage MS2 [ ].
Protein Domain
Name: Atrophin-1
Type: Family
Description: Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins [ , ]. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity [ ].
Protein Domain
Name: PLAT/LH2 domain superfamily
Type: Homologous_superfamily
Description: This entry represents a domain superfamily found in a variety of membrane or lipid associated proteins. It is known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain. Structurally, this domain forms a β-sandwich composed of two sheets of four strands each [ , , ]. The most highly conserved regions coincide with the β-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth β-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.
Protein Domain
Name: Ist3-like, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of the Ist3 family that includes fungal Ist3 (also known as SNU17), X-linked 2 RNA-binding motif proteins (RBMX2) found in Metazoa and plants, and similar proteins. Ist3 is a novel yeast Saccharomyces cerevisiae protein required for the first catalytic step of splicing and for progression of spliceosome assembly [ ]. It binds specifically to the U2 snRNP and is an intrinsic component of prespliceosomes and spliceosomes [, , ]. Yeast Ist3 contains an atypical RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). In the yeast pre-mRNA retention and splicing complex, the atypical RRM of Ist3 functions as a scaffold that organizes the other two constituents, Bud13 (bud site selection 13) and Pml1 (pre-mRNA leakage 1) []. The biological function of RBMX2 remains unclear. It shows high sequence similarity to the yeast Ist3 protein and also harbours one RRM.
Protein Domain
Name: ABC-2 transporter
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily which uses the hydrolysis of ATP to energise diverse biological import and export systems. ABC transporters are minimally constituted of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These regions can be found on the same protein (mostly in eukaryotes and bacterial exporters) or on two different ones (mostly bacterial importers) [ ]. But in a subgroup of exporters, the transmembrane region is encoded by a separate polypeptide, the ABC-2 type transport system integral membrane protein. The molecular size of this transmembrane protein is around 30 kDa. It is thought to contain six transmembrane regions, and it either forms homooligomeric channels or associates with another type of transmembrane protein to form heterooligomers. The integral inner-membrane protein translocates the substrate across the membrane and seems to play an important role in substrate recognition [].
Protein Domain
Name: Lsr2
Type: Family
Description: This entry represents Lsr2, which is a small, basic DNA-binding protein present in Mycobacterium and related actinomycetes that regulates gene expression and influences the organization of bacterial chromatin. Lsr2 is a dimer that binds to AT-rich regions of chromosomal DNA and physically protects DNA from damage by reactive oxygen intermediates (ROI). It is a functional homologue of the H-NS-like proteins [ ]. H-NS proteins play a role in nucleoid organisation and also function as a pleiotropic regulator of gene expression [, ]. The Lsr2 protein is composed of two domains. The N-terminal domain mediates the dimerization of Lsr2 [ ]. The C-terminal DNA-binding domain of Lsr2 interacts with the minor groove of DNA, and shares similarity to other bacterial nucleoid-associated DNA-binding domains []. The C-terminal domain is found associated with a variety of other protein domains (such as Ku and Ribonuclease T) where it presumably provides a DNA-binding activity.
Protein Domain
Name: Pelle, death domain
Type: Domain
Description: This entry represents the death domain (DD) of the protein kinase Pelle from Drosophila melanogaster and similar proteins [ ]. In Drosophila, interaction between the DDs of Tube and Pelle is an important component of the Toll pathway, which functions in establishing dorsoventral polarity in embryos and in mediating innate immune responses to pathogens. Tube and Pelle transmit the signal from the Toll receptor to the Dorsal/Cactus complex [, ]. Pelle also functions in photoreceptor axon targeting []. DDs (Death domains) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes [ , ].
Protein Domain
Name: DAPK3, catalytic domain
Type: Domain
Description: This entry represents the catalytic domain of DAPK3, which is a member of the Ca(2+)/calmodulin-regulated family of serine/threonine protein kinases [ , ]. STKs (serine/threonine kinases) catalyse the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. DAPKs mediate cell death and act as tumour suppressors. They are necessary to induce cell death and their overexpression leads to death-associated changes including membrane blebbing, cell rounding, and formation of autophagic vesicles []. Vertebrates contain three subfamily members with different domain architecture, localization, and function. DAPK1 is the prototypical member of the DAPK family and is also simply referred to as DAPK. DAPK2 is also called DAPK-related protein 1 (DRP-1), while DAPK3 has also been named DAP-like kinase (DLK) and zipper-interacting protein kinase (ZIPk). These proteins are ubiquitously expressed in adult tissues, are capable of cross talk with each other, and may act synergistically in regulating cell death [ ].
Protein Domain
Name: ATPase, type III secretion system, FliI/YscN
Type: Family
Description: Proteins in this entry show extensive homology to the ATP synthase F1 beta subunit, and are involved in type III protein secretion. They fall into the two separate functional groups outlined below. The first group, exemplified by the Salmonella typhimurium FliI protein ( ), is needed for flagellar assembly. Most structural components of the bacterial flagellum are translocated through the central channel of the growing flagellar structure by the type III flagellar protein-export apparatus in an ATPase-driven manner, to be assembled at the growing end. FliI is the ATPase that couples ATP hydrolysis to the translocation reaction [ , ]. The second group couples ATP hydrolysis to protein translocation in non-flagellar type III secretion systems. Often these systems are involved in virulence and pathogenicity. YscN ( ) from pathogenic Yersinia species, for example, energises the injection of anti-host factors directly into eukaryotic cells, thus overcoming host defences [ , ].
Protein Domain
Name: Sorting nexin-4
Type: Family
Description: Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds phosphoinositides (PIs) and targets the protein to PI-enriched membranes [ , ]. SNXs differ from each other in PI-binding specificity and affinity, and the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway [, , ]. SNX4 is involved in recycling traffic from the sorting endosome (post-Golgi endosome) back to the late Golgi. Its domain architecture is similar to that of SNX1-2, among others, containing a Bin/Amphiphysin/Rvs (BAR) domain, which detects membrane curvature, C-terminal to the PX domain [ ]. SNX4 is implicated in the regulation of plasma membrane receptor trafficking and interacts with receptors for EGF, insulin, platelet-derived growth factor and the long form of the leptin receptor [].
Protein Domain
Name: Multivesicular body sorting factor 12
Type: Domain
Description: Mvb12 is a subunit of the ESCRT-I core complex involved in ubiquitin-dependent sorting of proteins into the endosome. Mvb12 may regulate the interactions of ESCRT-I with cargo and other proteins of the ESCRT machinery to efficiently coordinate cargo sorting and release of ESCRT-I from the multivesicular body [ ]. The multivesicular body (MVB) protein-sorting pathway targets transmembrane proteins either for degradation or for function in the vacuole/lysosomes. The signal for entry into this pathway is monoubiquitination of protein cargo, which results in incorporation of cargo into luminal vesicles at late endosomes. Another crucial player is phosphatidylinositol 3-phosphate (PtdIns(3)P), which is enriched on early endosomes and on the luminal vesicles of MVBs. ESCRT (endosomal sorting complex required for transport)-I, -II and -III complexes are critical for MVB budding and sorting of monoubiquitinated cargo into the luminal vesicles []. Various Ub-binding domains (UBDs), such as UIM, UEV and NZF, are found in such machineries [, ].
Protein Domain
Name: Metabotropic glutamate receptor, Homer-binding domain
Type: Domain
Description: This entry represents the proline-rich region of metabotropic glutamate receptor proteins that bind Homer-related synaptic proteins. Metabotropic glutamate receptors function as receptors for glutamate. The activity of this receptor is mediated by a G-protein that activates a phosphatidylinositol-calcium second messenger system. The Homer proteins form a physical tether linking mGluRs with the inositol trisphosphate receptors (IP3R) that appears to be due to the proline-rich Homer ligand (PPXXFr). Activation of PI turnover triggers intracellular calcium release [ ]. Metabotropic glutamate receptor (mGluR) function is altered in the mouse model of human Fragile X syndrome, a disorder caused by loss of function mutations in the Fragile X messenger ribonucleoprotein 1 gene Fmr1. Homer 3 (and to a lesser extent Homer 1b/c) has been shown to form a multimeric complex with mGlu1a and the IP3 receptor, indicating that Homers may play a role in the localisation of receptors to their signalling partners [].
Protein Domain
Name: Lactococcin-A immunity protein-like
Type: Family
Description: Gram-positive lactobacilli produce bacteriocins to kill closely-related competitor species [ ]. To protect themselves from the bactericidal activity of this molecule they co-express an immunity protein. This entry represents Lactococcin-A immunity protein from Lactococcus lactis and similar proteins predominantly found in the bacterial class Bacilli. The structure of this protein is a soluble, cytoplasmic, antiparallel four α-helical globular bundle with a fifth, more flexible and more divergent C-terminal helical hairpin [ , ]. The C-terminal hairpin recognises the C terminus of the producer bacteriocin and this interaction is sufficient to disorient the bacteriocin within the membrane and close up the permeabilising pore that the bacteriocin creates on its own []. These immunity proteins interact in the same way with other bacteriocins. Since many enterococci can produce more than one bacteriocin it seems likely that the whole operon can be carried on transferable plasmids [].
Protein Domain
Name: Laforin
Type: Family
Description: Laforin (encoded by the EPM2A gene) is a protein phosphatase and one of the two proteins (the other is malin, an E3 ubiquitin ligase) that are defective in Lafora disease (LD), a progressive form of inherited epilepsy associated with widespread neurodegeneration and the formation of polyglucosan bodies in the neurons [ ]. Laforin and malin form a functional complex, in which laforin could recognize and recruit putative substrates to be ubiquitinated by malin for degradation. The laforin-malin complex regulates glycogen synthesis through targeting R5/PTG (Protein Targeting to Glycogen) for inactivation, most probably by proteolysis. Moreover, the laforin-malin complex is also involved in different pathways, such as intracellular protein degradation, oxidative stress, and the endoplasmic reticulum unfolded protein response []. Laforin contains a carbohydrate binding module (CBM) at the N terminus and a dual specificity phosphatase domain (DSP) at the C terminus. The structure of laforin has been determined [].
Protein Domain
Name: Bacterial HORMA domain
Type: Domain
Description: This entry represents the HORMA domain found in bacterial proteins from the cyclic-oligonucleotide-based anti-phage signalling system (CBASS), such as CD-NTase-associated protein 7 from Pseudomonas aeruginosa (Cap7, also known as Bacterial HORMA2 sensor protein), which is part of the type III-CBASS. CBASSs are a family of defence systems against bacteriophages that are ancestrally related to the cGAS-STING innate immune pathway in animals. CBASSs are composed of an oligonucleotide cyclase, which generates signalling cyclic oligonucleotides in response to phage infection, and an effector that is activated by the cyclic oligonucleotides and promotes cell death [ , ]. Cap7 is essential for phage defence and it is the sensor protein in the type III-CBASS. It binds to a closure peptide (consensus Glu-Val-Met-Glu-Phe-Asn-Pro), and forms the CdnD:Cap7:Cap8 complex, which allows it to activate the oligonucleotide cyclase CdnD for second messenger synthesis. The oligonucleotide cyclase becomes active only when physically bound by the HORMA-domain protein [].
Protein Domain
Name: Claudin-10
Type: Family
Description: Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin 1-24) have been identified. Their ability to polymerise and form strands is affected by the cell types [ , , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as scaffolding protein, ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as cell adhesion proteins EpCam and tetraspanins and the signaling proteins, ephrin A and B and their receptors, EphA and EphB []. Claudin-10 was identified through cDNA database searching, pursuing sequences similar to other claudin family members []. Human and mouse isoforms have been cloned. Claudin-10 shares ~20-45% overall similarity with other claudin family members at the amino acid level, displaying highest similarity to claudin-15.
Protein Domain
Name: Claudin-4
Type: Family
Description: Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin 1-24) have been identified. Their ability to polymerise and form strands is affected by the cell types [ , , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as scaffolding protein, ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as cell adhesion proteins EpCam and tetraspanins and the signaling proteins, ephrin A and B and their receptors, EphA and EphB []. Claudin-4 was originally termed Clostridium perfringens enterotoxin receptor (CPE-R). It was reclassified as claudin-4 on the basis of cDNA sequence similarity with claudins-1 and -2, and antibody studies that showed it to be expressed at tight junctions [].
Protein Domain
Name: Claudin-9
Type: Family
Description: Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin 1-24) have been identified. Their ability to polymerise and form strands is affected by the cell types [ , , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as scaffolding protein, ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as cell adhesion proteins EpCam and tetraspanins and the signaling proteins, ephrin A and B and their receptors, EphA and EphB []. Claudin-9 was identified through cDNA database searching, pursuing sequences similar to other claudin family members []. Human and mouse isoforms have been cloned. Claudin-9 shares ~25-70% overall similarity with other claudin family members at the amino acid level, displaying highest similarity to claudin-6.
Protein Domain
Name: Claudin-7
Type: Family
Description: Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin 1-24) have been identified. Their ability to polymerise and form strands is affected by the cell types [ , , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as scaffolding protein, ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as cell adhesion proteins EpCam and tetraspanins and the signaling proteins, ephrin A and B and their receptors, EphA and EphB []. Claudin-7 was identified through searching expressed sequence tag (EST) databases for sequences similar to claudin-1 and -2. It was subsequently cloned and expressed in cells, where it was shown to concentrate at tight junctions [].
Protein Domain
Name: RapA, N-terminal Tudor-like domain 2
Type: Domain
Description: This is the second of two Tudor-like domains found in the N-terminal region of RapA proteins. RapA is an abundant RNAP-associated protein of 110 kDa molecular weight with ATPase activity. It forms a stable complex with the RNAP core enzyme, but not with the holoenzyme. The ATPase activity of RapA increases upon its binding to RNAP [ ]. The N-terminal region of RapA contains two copies of a Tudor-like domain, both folded as a highly bent antiparallel β-sheet. This fold is also found in transcription factor NusG, ribosomal protein L24, human SMN (survival of motor neuron) protein, mammalian DNA repair factor 53BP1, putative fission yeast DNA repair factor Crb2 and bacterial transcription-repair coupling factor known as Mfd. The functional roles of the N-terminal region homologs in these proteins suggest that the Tudor-like domains of RapA may interact with both nucleic acids and RNAP [].
Protein Domain
Name: Nck2, SH3 domain 2
Type: Domain
Description: This entry represents the second SH3 domain of Nck2. The second SH3 domain of Nck appears to prefer ligands containing the APxxPxR motif [ ]. Nck2 (also known as Grb4) is a member of the Nck family. It plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling [ ]. It binds neuronal signaling proteins such as ephrinB []. Nck proteins are cytoplasmic, non-enzymatic adaptor proteins composed of three SH3 (Src homology 3) domains and a C-terminal SH2 domain [ ]. They regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates []. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics []. They associate with tyrosine-phosphorylated growth factor receptors or their cellular substrates [, ]. There are two vertebrate Nck proteins, Nck1 and Nck2.
Protein Domain
Name: Nck2, SH3 domain 3
Type: Domain
Description: This entry represents the third SH3 domain of Nck2. The third SH3 domain of Nck appears to prefer ligands with a PxAPxR motif [ ]. Nck2 (also known as Grb4) is a member of the Nck family. It plays a crucial role in connecting signaling pathways of tyrosine kinase receptors and important effectors in actin dynamics and cytoskeletal remodeling [ ]. It binds neuronal signaling proteins such as ephrinB []. Nck proteins are cytoplasmic, non-enzymatic adaptor proteins composed of three SH3 (Src homology 3) domains and a C-terminal SH2 domain [ ]. They regulate actin cytoskeleton dynamics by linking proline-rich effector molecules to protein tyrosine kinases and phosphorylated signaling intermediates []. They function downstream of the PDGFbeta receptor and are involved in Rho GTPase signaling and actin dynamics []. They associate with tyrosine-phosphorylated growth factor receptors or their cellular substrates [, ]. There are two vertebrate Nck proteins, Nck1 and Nck2.
Protein Domain
Name: Small acid-soluble spore protein, alpha/beta-type
Type: Family
Description: Small, acid-soluble spore proteins (SASP or ASSP) are proteins bound to the spore DNA of bacteria of the genera Bacillus, Thermoactinomyces, and Clostridium [ , ]. They are double-stranded DNA-binding proteins that cause DNA to change to an A-like conformation. They protect the DNA backbone from chemical and enzymatic cleavage and are thus involved in the dormant spore's high resistance to UV light. SASP are degraded in the first minutes of spore germination and provide amino acids for both new protein synthesis and metabolism. There are two distinct families of SASP: the alpha/beta type and the gamma-type. Alpha/beta SASP are small proteins of about sixty to seventy amino acid residues that are generally coded by a multigene family. The N terminus of alpha/beta SASP contains the site which is cleaved by a SASP-specific protease that acts during germination, while the C terminus is probably involved in DNA-binding.
Protein Domain
Name: Floricaula/leafy, DNA-binding C-terminal domain
Type: Domain
Description: This domain is found in various plant development proteins which are homologues of floricaula (FLO) and Leafy (LFY) proteins, which are floral meristem identity proteins. Mutations in the sequences of these proteins affect flower and leaf development. LFY is a plant-specific transcription factor (TF) essential for flower development. It is one of the few master regulators of flower development, as it integrates environmental and endogenous signals to orchestrate the whole floral network. Transcription factors such as LFY recognize short DNA motifs primarily through their DNA-binding domain. Upon binding to short stretches of DNA called cis-elements or TF binding sites (TFBS), they regulate gene expression. This entry represents the DNA-binding domain (DBD) found at the C terminus of LFY proteins in plants. Structure-function studies have demonstrated that LFY binds semi-palindromic 19-bp DNA elements through its highly conserved C-terminal DBD, a unique helix-turn-helix fold that by itself dimerizes on DNA [ ].
Protein Domain
Name: NAF domain
Type: Domain
Description: The NAF domain is a 24 amino acid domain that is found in a plant-specific subgroup of serine-threonine protein kinases (CIPKs), that interact with calcineurin B-like calcium sensor proteins (CBLs). Whereas the N-terminal part of CIPKs comprises a conserved catalytic domain typical of Ser-Thr kinases, the much less conserved C-terminal domain appears to be unique to this subgroup of kinases. The only exception is the NAF domain that forms an 'island of conservation' in this otherwise variable region. The NAF domain has been named after the prominent conserved amino acids Asn-Ala-Phe. It represents a minimum protein interaction module that is both necessary and sufficient to mediate the interaction with the CBL calcium sensor proteins [ ]. The secondary structure of the NAF domain is currently not known, but secondary structure computation of the C-terminal region of Arabidopsis thaliana CBL-interacting protein kinase 1 revealed a long helical structure [ ].
Protein Domain
Name: Fasciclin-like arabinogalactan protein, group A
Type: Family
Description: Fasciclin-like arabinogalactan proteins (FLAs) from Arabidopsis, a subclass of arabinogalactan proteins (AGPs), can be classified in four groups (A-D) based on pair-wise sequence similarity, domain structure, and phylogenetic analysis [ ]. Group A FLAs are characterised by a single fasciclin domain that is flanked by AGP regions and a C-terminal GPI-anchor signal; group B contain two fasciclin domains flanking an AGP region and lack a C-terminal GPI anchor signal sequence; group C have a variable domain structure with one or two fasciclin domains, one or two AGP regions, and a C-terminal GPI anchor signal; FLAs 4 and 19 to 21 formed group D as they have low similarity to each other or to any of the other FLAs []. This family represents Group A Fasciclin-like arabinogalactan proteins (FLAs) from plants, which includes FLAs 6, 7, 9, 11, 12 and 13 from Arabidopsis [ ]. They are probably cell surface adhesion proteins.
Protein Domain
Name: Phosphoprotein P, oligomerisation domain 1
Type: Homologous_superfamily
Description: Phosphoprotein P, an indispensable subunit of the viral polymerase complex, is a modular protein organised into two moieties that are both functionally and structurally distinct: a well-conserved C-terminal moiety that contains all the regions required for transcription, and a poorly conserved, intrinsically unstructured N-terminal moiety that provides several additional functions required for replication. The N-terminal moiety is responsible for binding to newly synthesised free N(0) (nucleoprotein that has not yet bound RNA), in order to prevent the binding of N(0) to cellular RNA. The C-terminal moiety consists of an oligomerisation domain, an N-RNA (nucleoprotein-RNA)-binding domain and an L polymerase-binding domain [ , ]. The oligomerisation domain reveals a homotetrameric coiled coil structure with many details that are different from classic coiled coils with canonical hydrophobic heptad repeats []. This superfamily represents domain 1 of the phosphoprotein P oligomerisation domain from Sendai virus as well as from close family members.
Protein Domain
Name: Small acid-soluble spore protein, alpha/beta-type, conserved site
Type: Conserved_site
Description: Small, acid-soluble spore proteins (SASP or ASSP) are proteins bound to the spore DNA of bacteria of the genera Bacillus, Thermoactinomyces, and Clostridium [ , ]. They are double-stranded DNA-binding proteins that cause DNA to change to an A-like conformation. They protect the DNA backbone from chemical and enzymatic cleavage and are thus involved in the dormant spore's high resistance to UV light. SASP are degraded in the first minutes of spore germination and provide amino acids for both new protein synthesis and metabolism. There are two distinct families of SASP: the alpha/beta type and the gamma-type. Alpha/beta SASP are small proteins of about sixty to seventy amino acid residues that are generally coded by a multigene family. The N terminus of alpha/beta SASP contains the site which is cleaved by a SASP-specific protease that acts during germination, while the C terminus is probably involved in DNA-binding.
Protein Domain
Name: BZW1/2, W2 domain
Type: Domain
Description: The eIF5-mimic proteins 1 and 2 (also known as basic leucine zipper and W2 domain-containing proteins 2 and 1 (BZW2 and BZW1), respectively) are paralogous human proteins containing C-terminal HEAT domains that resemble the HEAT domain of eIF5 [ ]. BZW1 plays an important role in the cell cycle and transcriptionally controls the histone H4 gene during G1/S phase []. The Drosophila ortholog, kra (krasavietz) or exba (extra bases), may be involved in translational inhibition in neural development. The structure of this C-terminal W2 domain resembles that of a set of concatenated HEAT repeats [, , ]. The W2 domain has a globular fold and is composed exclusively of α-helices [ , , ]. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.
Protein Domain
Name: E3 ubiquitin-protein ligase RING1/RING2
Type: Family
Description: E3 ubiquitin-protein ligase RING1 ( ) is one of the three E3 ubiquitin-protein ligases that mediate monoubiquitination of 'Lys-119' of histone H2A, which is a specific tag for repression of epigenetic transcription and particularly for inactivation of chromosome X of female mammals [ , ]. RING1 is a component of the Polycomb group (PcG) multiprotein PRC1-like complex, the repressive BCOR complex [], and the E2F6.com-1 complex in G0 phase []. RING1 has a RING-type zinc finger. E3 ubiquitin-protein ligase RING2 ( ) mediates monoubiquitination of 'Lys-119' of histone H2A (H2AK119Ub) [ ], which is a specific tag for epigenetic transcriptional repression and participates in X chromosome inactivation of female mammals []. RING2 is a component of the PRC1 complex [], and other chromatin-associated Polycomb (PcG) complexes such as BCOR []. It is also a component of the MLL1/MLL complex []. This entry includes E3 ubiquitin-protein ligase RING1/RING2.
Protein Domain
Name: Acyl-protein synthetase, LuxE
Type: Domain
Description: LuxE is an acyl-protein synthetase found in bioluminescent bacteria. LuxE catalyses the formation of an acyl-protein thiolester from a fatty acid and a protein. This is the second step in the bioluminescent fatty acid reduction system, which converts tetradecanoic acid to the aldehyde substrate of the luciferase-catalysed bioluminescence reaction [ ]. A conserved cysteine found at position 364 in Photobacterium phosphoreum LuxE () is thought to be acylated during the transfer of the acyl group from the synthetase subunit to the reductase. The C terminus of the synthetase is thought to act as a flexible arm to transfer acyl groups between the sites of activation and reduction [ ]. A LuxE domain is also found in the Vibrio cholerae RBFN protein (), which is involved in the biosynthesis of the O-antigen component 3-deoxy-L-glycero-tetronic acid. This entry represents the LuxE domain, which is found in archaeal and bacterial proteins.
Protein Domain
Name: F-actin capping protein, beta subunit, conserved site
Type: Conserved_site
Description: The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis [ ], as well as muscle contraction. The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin (see ) and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins [ ]. The beta-subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species [ ]. The signature pattern in this entry is a conserved hexapeptide in the N-terminal region of the beta-subunit.
Protein Domain
Name: Histidine biosynthesis, HisF
Type: Family
Description: Histidine is formed by several complex and distinct biochemical reactions catalysed by eight enzymes. Proteins involved in steps 4 and 6 of the histidine biosynthesis pathway are contained in one family. These enzymes are called His6 and His7 in eukaryotes and HisA and HisF in prokaryotes [, , ]. HisA is a phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide isomerase (), involved in the fourth step of histidine biosynthesis. The bacterial HisF protein is a cyclase which catalyzes the cyclization reaction that produces D-erythro-imidazole glycerol phosphate during the sixth step of histidine biosynthesis. HisF is also known as imidazole glycerol phosphate synthase [ , , ]. The yeast His7 protein is a bifunctional protein which catalyzes an amido-transferase reaction that generates imidazole-glycerol phosphate and 5-aminoimidazol-4-carboxamide. The latter is the ribonucleotide used for purine biosynthesis. The enzyme also catalyzes the cyclization reaction that produces D-erythro-imidazole glycerol phosphate, and is involved in the fifth and sixth steps in histidine biosynthesis. This family describes the histidine biosynthesis protein, HisF.
Protein Domain
Name: Photosystem I reaction center subunit V
Type: Family
Description: Photosystem I (PSI) is an integral membrane protein complex that uses light energy to mediate electron transfer from plastocyanin to ferredoxin. It is found in the chloroplasts of plants and in cyanobacteria. PSI is composed of at least 14 different subunits, two of which, PsaG (also known as PSI-G) and PsaK (also known as PSI-K), are small, hydrophobic, evolutionarily related integral membrane proteins of about 7 to 9 kDa. Cyanobacteria contain only PsaK. While cyanobacterial PSI has phycobilisomes to harvest light, eukaryotic PSI has a membrane-embedded peripheral antenna. This protein family represents Photosystem I reaction center subunit V (PsaG) found in plants, predominantly in Streptophytes. In Arabidopsis thaliana, PsaG is involved in the binding dynamics of plastocyanin to PSI, in the stability of the PSI complex and in light-harvesting. The crystal structure of the plant PSI complex shows that this protein is closely related to the similar subunit PsaK.
Protein Domain
Name: Mitochondrial carrier UCP-like
Type: Family
Description: This protein family includes a variety of substrate carrier proteins that are involved in energy transfer and are found in the inner mitochondrial membrane. Uncoupling proteins (UCPs) are mitochondrial transporter proteins that create proton leaks across the inner mitochondrial membrane, thus uncoupling oxidative phosphorylation. As a result, energy is dissipated in the form of heat. Mitochondrial brown fat uncoupling protein 1 (UCP1) from chordates is responsible for thermogenic respiration, a specialised capacity of brown adipose tissue that contributes to the regulation of energy balance. It regulates the production of reactive oxygen species (ROS) by mitochondria. The protein has a tripartite structure, with each of the 3 similar domains displaying a transmembrane helix, a loop, an amphipathic helix and another transmembrane helix. Overall, it shows a channel-like structure. The domains exhibit striking conservation of several residues, especially of glycine and proline, which may constitute structurally strategic positions.
Protein Domain
Name: Cullin protein, neddylation domain
Type: Domain
Description: This is the neddylation site of cullin proteins, which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8, and several cullins function in ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae (Baker's yeast), and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue.
Protein Domain
Name: Exosome-associated factor Rrp47/DNA strand repair C1D
Type: Family
Description: The C1D family of proteins includes exosome-associated cofactor Rrp47 and DNA double-strand repair protein C1D. Both of these proteins are implicated in DNA double-strand repair and in nuclear exosome activity. Rrp47 functions in nuclear RNA processing. Rrp47 is associated with nuclear exosomes, and with the nuclear exosome-specific 3'-5' exonuclease Rrp6. Rrp47 appears to be a substrate-specific nuclear cofactor that is required for the 3' processing of pre-rRNAs, snoRNAs and snRNAs (specifically U4 and U5). However, Rrp47 is not required for cytoplasmic exosome mRNA degradation. It also appears to have a role in DNA strand repair. C1D is an inducible nuclear matrix protein that interacts with and activates the DNA-dependent protein kinase essential for DNA double-strand repair and V(D)J recombination. C1D can interact with condensin, which is required for DNA repair and replication checkpoint control. C1D also binds to nuclear exosomes and might regulate their functional activities, as it has RNA-binding activity.
Protein Domain
Name: CRC domain
Type: Domain
Description: The following proteins of the tesmin/TSO1 family contain two cysteine-rich repeats with the consensus C-X-C-X(4)-C-X(3)-Y-C-X-C-X(6)-C-X(3)-C-X-C-X(2)-C, separated by a region of variable length containing the short conserved sequence R-N-P-X-A-F-X-P-K: animal tesmin or MTL5, originally identified by its specific expression in testes but subsequently also detected at specific stages of ovary development; animal tesmin-like (tesl) or LIN54; Drosophila melanogaster tombola (tomb), a meiotic arrest protein which is expressed specifically in testis; Arabidopsis thaliana TSO1 ('tso' means 'ugly' in Chinese and refers to the appearance of tso1 mutant flowers); Arabidopsis thaliana TSO1-like 1 and 2 (SOL1 and SOL2); and legume Cysteine-rich Polycomb-like Protein 1 (CPP1), a DNA-binding protein acting as a negative regulator of the leghemoglobin gene. This domain has been named the CRC domain (C1-RNPXAFXPK-C2). It binds zinc and is able to bind DNA. The CRC domain shows some similarity to the CXC domain found in the E(z)-type of Polycomb group proteins. However, a clear distinction can be made, since the CXC domain lacks the RNPXAFXPK motif.
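The consensus notation in the CRC domain description can be turned into a searchable pattern. The following is a minimal sketch, assuming that X stands for any residue and X(n) for n consecutive residues; the helper names (consensus_to_regex, toy_sequence) and the test sequence are hypothetical and built only for illustration, not taken from any real protein or from the database itself.

```python
import re

# Consensus strings copied from the description above.
CRC_REPEAT = "C-X-C-X(4)-C-X(3)-Y-C-X-C-X(6)-C-X(3)-C-X-C-X(2)-C"
LINKER_MOTIF = "R-N-P-X-A-F-X-P-K"

# One consensus element: a residue letter, optionally followed by "(n)".
ELEMENT = re.compile(r"([A-Z])(?:\((\d+)\))?")


def consensus_to_regex(pattern: str) -> str:
    """Translate a dash-separated consensus into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, count = match.groups()
        token = "." if residue == "X" else residue  # assumption: X = any residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def toy_sequence(pattern: str, filler: str = "A") -> str:
    """Build an artificial sequence that satisfies the consensus."""
    out = []
    for element in pattern.split("-"):
        residue, count = ELEMENT.fullmatch(element).groups()
        out.append((filler if residue == "X" else residue) * int(count or 1))
    return "".join(out)


repeat_re = re.compile(consensus_to_regex(CRC_REPEAT))
linker_re = re.compile(consensus_to_regex(LINKER_MOTIF))

# Hypothetical layout: repeat, spacer containing the linker motif, repeat.
toy = (toy_sequence(CRC_REPEAT) + "GG" + toy_sequence(LINKER_MOTIF)
       + "GG" + toy_sequence(CRC_REPEAT))

repeat_starts = [m.start() for m in repeat_re.finditer(toy)]
linker_start = linker_re.search(toy).start()
print(repeat_starts, linker_start)
```

A real scan would also need to allow the variable-length spacer between the two repeats, which this sketch handles only by searching for each element separately.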
Protein Domain
Name: Iron-binding zinc finger, CDGSH type
Type: Domain
Description: This entry represents the iron-sulphur domain of proteins that have a CDGSH sequence motif (although the Ser residue can also be an Ala or Thr). It is found in proteins from a wide range of organisms, with the exception of fungi. The CDGSH-type domain binds a redox-active, pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. CDGSH-type domains are found in mitoNEET, an iron-containing integral protein of the outer mitochondrial membrane (OMM). MitoNEET forms a dimeric structure with a NEET fold, and contains two domains: a β-cap region and a cluster-binding domain that coordinates two acid-labile 2Fe-2S clusters (one bound to each protomer). The CDGSH iron-sulphur domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates. The whole protein regulates oxidative capacity and may function in electron transfer, for instance in redox reactions with metabolic intermediates, cofactors and/or proteins localized at the OMM.
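The alternatives allowed in the CDGSH consensus map naturally onto character classes. The sketch below assumes that X means any residue, that X2 and X3 mean two and three residues, and that (S/T) and (S/A/T) list the permitted residues at those positions; the three short sequences are invented for illustration only.

```python
import re

# C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H written as a regular expression.
CDGSH_RE = re.compile(r"C.C.{2}[ST].{3}P.CDG[SAT]H")

# Hypothetical sequences, not taken from any real protein.
canonical = "CACAASAAAPACDGSH"    # Ser in the CDGSH position
ala_variant = "CACAATAAAPACDGAH"  # Ala replaces Ser, as the description permits
broken = "CACAASAAAPACDGPH"       # Pro is not an allowed substitution


def has_cdgsh_motif(sequence: str) -> bool:
    """Return True if the consensus occurs anywhere in the sequence."""
    return CDGSH_RE.search(sequence) is not None


for name, seq in [("canonical", canonical),
                  ("ala_variant", ala_variant),
                  ("broken", broken)]:
    print(name, has_cdgsh_motif(seq))
```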
Protein Domain
Name: Methylthioribose-1-phosphate isomerase
Type: Family
Description: This family contains proteins designated as 5-methylthioribose-1-phosphate isomerase (MTNA). It also contains putative translation initiation factor 2B subunit proteins. 5-methylthioribose-1-phosphate isomerase (MTNA) participates in the methionine salvage pathway, catalysing the isomerisation of 5-methylthioribose-1-phosphate to 5-methylthioribulose-1-phosphate. The methionine salvage pathway leads to the synthesis of methionine from methylthioadenosine, the end product of the spermidine and spermine anabolism in many species. The delineation of this family is based in part on a discussion and neighbour-joining phylogenetic study of archaeal and other proteins homologous to the alpha, beta, and delta subunits of eukaryotic initiation factor 2B (eIF-2B), a five-subunit molecule that catalyzes GTP recycling for eIF-2. The archaeal proteins are related to the common ancestor of eIF-2B alpha, beta, and delta rather than specifically to any one of them, so the designation of particular archaeal members as corresponding to eIF-2B alpha or eIF-2B delta is imprecise. It has been suggested that the designation of the archaeal proteins as translation initiation factors remains unproven.
Protein Domain
Name: YbhB/YbcL
Type: Family
Description: This family contains Escherichia coli YbhB and YbcL, which are possibly RKIP homologues, found in the cytoplasm and periplasm. The crystal structures of YbhB and YbcL demonstrate that they belong to the same structural family as the mammalian RKIP/PEBP proteins. In rat and human cells, RKIP (previously known as PEBP) has been characterised as an inhibitor of MEK phosphorylation by Raf-1. The general structural fold and the anion binding site of these proteins are extremely well conserved between mammals and bacteria and suggest functional similarities. However, the bacterial proteins also exhibit some specific structural features, such as a substrate binding pocket formed by the dimerisation interface and the absence of cis peptide bonds. The parallel between the cellular signalling mechanisms in eukaryotes and prokaryotes suggests that the proteins in this family could be involved in the regulation of protein phosphorylation by kinases. The structural variety observed for YbhB and YbcL indicates the possible recognition of multiple cellular partners.
Protein Domain
Name: Anaphase-promoting complex subunit 5 domain
Type: Domain
Description: This entry represents a domain found in Apc5 (anaphase-promoting complex subunit 5). Apc5 is a subunit of the anaphase-promoting complex/cyclosome (APC/C), which is a multisubunit ubiquitin ligase that mediates the proteolysis of cell cycle proteins in mitosis and G1. Tetratricopeptide repeat protein 19 is a mitochondrial protein required for formation of the mitochondrial complex III. Although Apc5 does not harbour a classical RNA-binding domain, it binds the poly(A) binding protein (PABP), which directly binds the internal ribosome entry site (IRES) of growth factor 2 mRNA. PABP was found to enhance IRES-mediated translation, whereas Apc5 over-expression counteracted this effect. In addition to its association with the APC/C complex, Apc5 binds much heavier complexes and co-sediments with the ribosomal fraction. The N terminus of Afi1 serves to stabilise the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC. This region of the Apc5 member proteins carries a TPR-like motif.
Protein Domain
Name: Carboxypeptidase-like, regulatory domain superfamily
Type: Homologous_superfamily
Description: This domain superfamily identifies a number of eukaryotic carboxypeptidases, including carboxypeptidase D, E (H), N, X, X2 and Z. These are metallopeptidases that belong to MEROPS peptidase family M14 (clan MC), subfamily M14B. Carboxypeptidase D (CPD) is a new B-type metallocarboxypeptidase that is membrane bound and has an acidic pH optimum. A hydrophobic region at the N terminus represents the signal peptide, and one near the C terminus probably represents the transmembrane anchor. A regulatory domain within the protein has been identified as a β-sandwich, comprising 7 strands in 2 sheets in a Greek-key topology. Some family members have an additional 1-2 strands added to the common fold. The bacterial and archaeal sequences having this signature are variously annotated; examples are: hypothetical/conserved/membrane/cell surface protein N-acetylglucosamine deacetylase; side tail fibre protein homologue from lambdoid prophage Rac; hypothetical tonB-linked outer membrane receptor; OmpA-related protein; putative outer membrane protein, probably involved in nutrient binding; and TonB-dependent receptor. This entry also includes the teneurin family members, which may function as cellular signal transducers.
Protein Domain
Name: PUA-like superfamily
Type: Homologous_superfamily
Description: This superfamily represents domains with a PUA-like structure, consisting of a pseudo-barrel composed of mixed folded sheets of five strands. This structural motif is found in: PUA-containing proteins; the N-terminal region of ATP sulphurylases, which contains extra structures, some similar to the PK β-barrel domain; and several bacterial hypothetical proteins, such as the N-terminal domain of YggJ. The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.
Protein Domain
Name: Chlorophyll A-B binding protein, plant and chromista
Type: Family
Description: The light-harvesting complex (LHC) consists of chlorophylls A and B and the chlorophyll A-B binding protein. LHC functions as a light receptor that captures and delivers excitation energy to photosystems I and II, with which it is closely associated. Under changing light conditions, the reversible phosphorylation of light harvesting chlorophyll a/b binding proteins (LHCII) represents a system for balancing the excitation energy between the two photosystems. The N terminus of the chlorophyll A-B binding protein extends into the stroma, where it is involved with adhesion of granal membranes and is photo-regulated by reversible phosphorylation of its threonine residues. Both these processes are believed to mediate the distribution of excitation energy between photosystems I and II. This family also includes the photosystem II protein PsbS, which plays a role in energy-dependent quenching that increases thermal dissipation of excess absorbed light energy in the photosystem. This entry is limited to plant and chromista proteins.
Protein Domain
Name: Trigger factor
Type: Family
Description: The trigger factor is found in several prokaryotes, and is involved in protein export. Trigger factor is a ribosome-associated molecular chaperone and is the first chaperone to interact with the nascent polypeptide. It acts as a chaperone by maintaining the newly synthesised protein in an open conformation. It consists of three domains: an N-terminal ribosome-binding domain (RBD), a central peptidyl-prolyl cis/trans isomerase (PPIase) domain and a C-terminal substrate-binding domain (SBD), which is stabilised by a linker between the RBD and PPIase domains. The association of its N-terminal domain with the ribosomal protein L23, located next to the peptide tunnel exit, is essential for the interaction with nascent polypeptides and for its in vivo function. Trigger factor can bind at the same time as the signal recognition particle (SRP), but is excluded by the SRP receptor (FtsY). The trigger factor-like protein included in this entry can be found in plant chloroplasts, where it may be involved in protein export.
Protein Domain
Name: Bax inhibitor 1-related
Type: Family
Description: Bax inhibitor-1 (BI-1) is a suppressor of apoptosis that interacts with BCL2 and BCL-X. Human BI-1 is an evolutionarily conserved integral membrane protein containing multiple membrane-spanning segments, localised to the ER membrane. It has 6-7 membrane-spanning domains. The C termini of the mammalian BI-1 proteins are composed of basic amino acids resembling some nuclear targeting sequences, but otherwise the predicted proteins lack motifs that suggest this function. BI-1 also regulates cell death triggered by ER stress. BI-1 appears to exert its effect through an interaction with calmodulin. The crystal structure of a bacterial member reveals that these proteins mediate a calcium leak across the membrane in a pH-dependent manner. Calcium homoeostasis balances passive calcium leak with active calcium uptake. The structure exists in a pore-closed and a pore-open conformation, at pH 8 and pH 6 respectively. This entry represents BI-1 and related sequences, including lifeguard proteins, which resemble BI-1 and also act as apoptotic regulators.
Protein Domain
Name: EGF-like, conserved site
Type: Conserved_site
Description: A sequence of about thirty to forty amino-acid residues found in epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded β-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length. This entry represents a conserved site in the EGF-like domain.
Protein Domain
Name: Mce/MlaD
Type: Domain
Description: This domain is found in all 24 mce genes associated with the four mammalian cell entry (mce) operons of Mycobacterium tuberculosis and in MlaD proteins from other Actinomycetales. The archetype (mce1A, Rv0169) was isolated as being necessary for colonisation of, and survival within, the macrophage. The domain is also found in chloroplast Ycf22 and related cyanobacterial homologues, the majority of which have an N-terminal transmembrane domain and are putative ABC transporters, and in proteobacterial homologues, which include MlaD, PqiB, YrbD, YebT, VpsC and Ttg2C. MlaD is part of the ABC transporter complex MlaFEDB, which actively prevents phospholipid accumulation at the cell surface. The MlaFEDB complex is composed of two ATP-binding proteins (MlaF), two transmembrane proteins (MlaE), two cytoplasmic solute-binding proteins (MlaB) and a probable periplasmic solute-binding protein (MlaD). Through the Mla pathway, Gram-negative bacteria maintain lipid asymmetry in the outer membrane by retrograde trafficking of phospholipids from the outer membrane to the inner membrane.
Protein Domain
Name: Transport-associated and nodulation domain, bacteria
Type: Domain
Description: The BON domain is typically ~60 residues long and has an α/β fold. There is a conserved glycine residue and several hydrophobic regions, which suggests a binding function, and the domain does contain a phospholipid-binding site. Most proteobacteria seem to possess one or two BON-containing proteins, typically of the OsmY type; outside of this group the distribution is more disparate. The OsmY protein is an Escherichia coli 20 kDa outer membrane or periplasmic protein that is expressed in response to a variety of stress conditions, in particular helping to provide protection against osmotic shock. One hypothesis is that OsmY prevents shrinkage of the cytoplasmic compartment by contacting the phospholipid interfaces surrounding the periplasmic space. The domain architecture of two BON domains alone suggests that these domains contact the surfaces of phospholipids, with each domain contacting a membrane. In the potassium binding protein Kbp, this domain is able to bind K+.
Protein Domain
Name: Adhesin B
Type: Family
Description: The Streptococcus pneumoniae psaA gene encodes a protein with significant similarity to previously reported Streptococcal proteins, SsaB (80% similarity) and FimA (92.3% similarity), from Streptococcus sanguis and Streptococcus parasanguis. These homologues are associated with bacterial adhesion, and PsaA may play a similar role. The SsaB protein has a putative hydrophobic 19-amino-acid signal sequence, yielding a 32,620-Mr secreted protein. SsaB is hydrophilic and appears not to have a hydrophobic membrane anchor in its C-terminal region. A high degree of similarity exists between S. sanguis ssaB and type 1 fimbrial genes. Comparison of the gene products reveals close similarity of the two proteins. It is thought that SsaB adhesion may play a role in oral colonisation by binding either to a receptor on saliva or to a receptor on Actinomyces. This sub-family is defined by conserved regions spanning the full alignment length, focusing on those sections that characterise the adhesin B precursors but distinguish them from the rest of the adhesin family.
Protein Domain
Name: DNA helicase subunit AddB
Type: Family
Description: DNA repair is accomplished by several different systems in prokaryotes. Recombinational repair of double-stranded DNA breaks involves the RecBCD pathway in some lineages, and AddAB (also called RexAB) in others. AddA is conserved between the firmicutes and the alphaproteobacteria, while its partner protein (RexB) is not. This entry describes the ATP-dependent nuclease subunit B (AddB/RexB) protein as found in Bacillus subtilis and related species. Although the RexB protein of Streptococcus and Lactococcus is considered to be orthologous and functionally equivalent, merely named differently, all members of this protein family have a P-loop nucleotide binding motif GxxGxGK[ST] at the N terminus, unlike RexB proteins, and a CxxCxxxxxC motif at the C terminus, both of which may be relevant to function. The AddA/AddB heterodimer acts as both an ATP-dependent DNA helicase and an ATP-dependent, dual-direction single-stranded exonuclease. It recognises the chi site, generating a DNA molecule suitable for the initiation of homologous recombination. The AddB nuclease domain is not required for chi fragment generation; this subunit has 5'->3' nuclease activity.
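The two motifs named for AddB can be located in a sequence with ordinary regular expressions. This is a minimal sketch that assumes lowercase x means any single residue and [ST] means serine or threonine; the function name find_motifs and the toy sequence are hypothetical and do not come from a real AddB protein.

```python
import re

# Motifs from the description: GxxGxGK[ST] (P-loop) and CxxCxxxxxC.
MOTIFS = {
    "p_loop": re.compile(r"G..G.GK[ST]"),
    "cys_motif": re.compile(r"C..C.{5}C"),
}


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return the 0-based start positions of each motif in the sequence."""
    return {name: [m.start() for m in pattern.finditer(sequence)]
            for name, pattern in MOTIFS.items()}


# Hypothetical sequence: a P-loop near the N terminus, a spacer of
# leucines, and the cysteine motif near the C terminus.
toy = "MA" + "GAAGAGKT" + "L" * 10 + "CAACAAAAAC" + "K"

hits = find_motifs(toy)
print(hits)
```

Reporting start positions makes it straightforward to check that the P-loop hit lies near the start of the sequence and the cysteine motif near the end, as the description states.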
Protein Domain
Name: RBM6, OCRE domain
Type: Domain
Description: RNA-binding protein 6 (RBM6) contains two RNA-binding motifs and shares protein sequence similarity with RBM5. RBM6 is targeted to the splicing speckles and nascent transcripts. It may act mainly as a splicing activator. The elevated expression of RBM6 has been linked to certain tumours. An RBM6-CSF1R fusion has been found in acute megakaryoblastic leukemia. RBM6 contains two RNA recognition motifs (RRMs), an OCtamer REpeat (OCRE) domain, two C2H2-type zinc fingers, a nuclear localization signal, and a G-patch/D111 domain. In contrast to RBM5, RBM6 has an additional unique domain, the POZ (poxvirus and zinc finger) domain, which may be involved in protein-protein interactions and inhibit binding of target sequences by zinc fingers. This entry represents the OCRE domain. The OCRE domain contains five repeats of an 8-residue motif, which were shown to form β-strands. Based on the architectures of proteins containing OCRE domains, a role in RNA metabolism and/or signalling has been proposed.
Protein Domain
Name: Capsular polysaccharide synthesis, CpsB/CapC
Type: Family
Description: Capsular polysaccharide biosynthesis proteins are critical for the production of a mature capsule in vitro. Members of Streptococcus pneumoniae have variable capsules, with about 90 known capsular serotypes that all have their own polysaccharide structure. The cps14B to cps14H products are similar to other proteins involved in bacterial polysaccharide biosynthesis in both Gram-negative and Gram-positive bacteria. CpsA to CpsD are common to most serotypes. CpsD is an autophosphorylating protein-tyrosine kinase. CpsC is required for CpsD tyrosine-phosphorylation, and CpsB is required for dephosphorylation of CpsD. CpsB is a novel manganese-dependent phosphotyrosine-protein phosphatase that belongs to the PHP (polymerase and histidinol phosphatase) family of phosphoesterases. Phosphorylation of proteins on hydroxyl amino acids (serine, threonine, tyrosine) occurs at different stages of pathogenesis, including cell-cell interaction and adherence, translocation of bacterial effectors into host cells, and changes in host cellular structure and function induced by infection. CpsB, CpsC, CpsD, and ATP form a stable complex that enhances capsule synthesis and acts to regulate it.
Protein Domain
Name: Major capsid L1 (late) protein, Papillomavirus
Type: Family
Description: This entry represents the major late capsid protein L1 from Papillomaviruses, such as Human papillomavirus (HPV). Papillomaviruses are members of the papovavirus superfamily. More than 70 different types of papillomavirus have been discovered in humans, some of which have been shown to cause genital carcinomas and cutaneous warts. The viruses contain a circular dsDNA genome surrounded by a non-enveloped icosahedral capsid. Two proteins are involved in capsid formation: a major (L1) and a minor (L2) protein, in the approximate proportion 95:5. L1 forms a pentameric assembly unit of the viral shell in a manner that closely resembles VP1 from polyomaviruses. Intermolecular disulphide bonding holds the L1 capsid proteins together. L1 capsid proteins can bind via their nuclear localisation signal (NLS) to karyopherins Kapbeta(2) and Kapbeta(3) and inhibit the Kapbeta(2) and Kapbeta(3) nuclear import pathways during the productive phase of the viral life cycle. Surface loops on L1 pentamers contain sites of sequence variation between HPV types.
Protein Domain
Name: ATR13 superfamily
Type: Homologous_superfamily
Description: The avirulence protein ATR13 is expressed by the plant pathogen oomycete Hyaloperonospora. Phytopathogenic oomycetes such as the one that infects Arabidopsis, Hyaloperonospora arabidopsidis (Hpa), grow intercellularly, forming parasitic structures called haustoria. Haustoria play a role in feeding and in suppression of host defence systems. A whole range of pathogen proteins, called effectors, are secreted across the haustorial membrane, a subset of which are further translocated across the plant plasma membrane by an unknown mechanism that is present in both plants and animals. ATR13 is an RxLR effector from the downy mildew oomycete, and is a very dynamic protein. It contains two surface-exposed patches of polymorphism, one of which is involved in the specific recognition by host R-genes. The R-gene-products detect the presence of the infection by recognising the effector proteins. Once the infection is detected, the host R-genes trigger apoptosis of the host cell. The R-gene-products carry a specific motif, RxLR, that recognises the effector proteins.
Protein Domain
Name: 14-3-3 theta
Type: Family
Description: The 14-3-3 tau/theta (tau in humans, theta in mice) isoform is encoded by the YWHAQ gene in humans and plays an important role in controlling apoptosis through interactions with ASK1, c-jun NH-terminal kinase, and p38 mitogen-activated protein kinase (MAPK). Its interaction with CDC25c regulates entry into the cell cycle, and subsequent interaction with Bad prevents apoptosis. 14-3-3 theta protein expression is induced in patients with amyotrophic lateral sclerosis. 14-3-3 tau is often overexpressed in breast cancer, which is associated with the downregulation of p21, a p53 target gene, and thus leads to tamoxifen resistance in MCF7 breast cancer cells and shorter patient survival. Therefore, 14-3-3 tau may be a potential therapeutic target in breast cancer. Additionally, 14-3-3 theta mediates nucleocytoplasmic shuttling of the nucleocapsid protein of the coronavirus that causes severe acute respiratory syndrome. The 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells.
Protein Domain
Name: Receptor-type tyrosine-protein phosphatase mu, PTPase domain, repeat 1
Type: Domain
Description: Receptor-type tyrosine-protein phosphatase mu (R-PTP-mu) belongs to the type 2B subfamily of receptor protein tyrosine phosphatases (RPTPs), which belong to the larger family of classical tyrosine-specific protein tyrosine phosphatases (PTPs). PTPs catalyze the dephosphorylation of phosphotyrosine peptides. R-PTP-mu is a homophilic cell adhesion molecule expressed in cells which display a cellular network, such as neurons and glia in the central nervous system. It is required for neurite outgrowth. Loss of this protein contributes to tumor cell migration and dispersal of human glioblastomas. It contains an extracellular region with a Meprin-A5 (neuropilin)-mu (MAM) domain, an immunoglobulin (Ig) domain, and four fibronectin type III (FN3) repeats; a transmembrane domain; and an intracellular segment with a juxtamembrane domain similar to the cytoplasmic domain of classical cadherins and two tandem PTP domains. This entry represents the first catalytic PTP domain (repeat 1) of R-PTP-mu, a protein found in chordates.
Protein Domain
Name: Sequestosome-1, PB1 domain
Type: Domain
Description: The PB1 domain is an essential part of the p62 scaffold protein (alias sequestosome 1, SQSTM), which is involved in cell signaling, receptor internalization, and protein turnover. p62 binds ubiquitinated substrates and aids their aggregation and degradation by macroautophagy. The PB1 domain is a modular domain mediating specific protein-protein interactions, which play roles in many critical cell processes. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes, ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I, which contains an OPCA motif (an acidic amino acid cluster); type II, which contains a basic cluster; and type I/II, which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1 interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants.
Protein Domain
Name: Pre-mRNA-splicing factor Cwc2, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of yeast protein Cwc2, also termed Cwf2, one of the components of the Prp19-associated complex (nineteen complex, NTC) that can bind to RNA. NTC is composed of the scaffold protein Prp19 and a number of associated splicing factors, and plays a crucial role in intron removal during pre-mRNA splicing in eukaryotes. Cwc2 functions as an RNA-binding protein that can bind both small nuclear RNAs (snRNAs) and pre-mRNA in vitro. It interacts directly with the U6 snRNA to link the NTC to the spliceosome during pre-mRNA splicing. In the N-terminal half, Cwc2 contains a CCCH-type zinc finger (Znf domain), an RNA recognition motif (RRM), and an intervening loop between the Znf and the RRM, also termed the RNA-binding loop or RB loop, all of which are necessary and sufficient for RNA binding. The Znf is also responsible for mediating protein-protein interaction. The C-terminal flexible region of Cwc2 interacts with the WD40 domain of Prp19.
Protein Domain
Name: UNC5C, death domain
Type: Domain
Description: This entry represents the death domain (DD) found in Uncoordinated-5C (UNC5C), which is a receptor for the secreted netrin-1 and plays a role in axonal guidance, angiogenesis, and apoptosis. It belongs to the UNC-5 family. UNC5C plays a critical role in the development of spinal accessory motor neurons. Methylation of the UNC5C gene is associated with early stages of colorectal carcinogenesis. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats and two thrombospondin type-I modules, and an intracellular region containing a ZU-5 domain, a UPA domain and a DD. DDs (death domains) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily, including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes.
Protein Domain
Name: UNC5A, death domain
Type: Domain
Description: This entry represents the death domain (DD) found in Uncoordinated-5A (UNC5A), which is a receptor for the secreted netrin-1 and plays a critical role in neuronal development and differentiation, as well as axon guidance [ ]. It also plays a role in regulating apoptosis in non-neuronal cells as a downstream target of p53 []. It belongs to the UNC-5 family. UNC5 proteins are transmembrane proteins with an extracellular domain consisting of two immunoglobulin repeats, two thrombospondin type-I modules and an intracellular region containing a ZU-5 domain, UPA domain and a DD [ , , ]. Death domains (DDs) are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily including CARD (Caspase activation and recruitment domain), DED (Death Effector Domain), and PYRIN. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes [ , ].
Protein Domain
Name: Midkine heparin-binding growth factor
Type: Family
Description: Several extracellular heparin-binding proteins involved in regulation of growth and differentiation belong to a new family of growth factors. These growth factors are highly related proteins of about 140 amino acids that contain 10 conserved cysteines probably involved in disulphide bonds, and include pleiotrophin [ ] (also known as heparin-binding growth-associated molecule HB-GAM, heparin-binding growth factor 8 HBGF-8, heparin-binding neurotrophic factor HBNF and osteoblast specific protein OSF-1); midkine (MK) []; retinoic acid-induced heparin-binding protein (RIHB) []; and pleiotrophic factors alpha-1 and -2 and beta-1 and -2 from Xenopus laevis, the homologues of midkine and pleiotrophin respectively. Pleiotrophin is a heparin-binding protein that has neurotrophic activity and mitogenic activity towards fibroblasts. It is highly expressed in brain and uterus tissues, but is also found in gut, muscle and skin. It is thought to possess an important brain-specific function. Midkine is a regulator of differentiation whose expression is regulated by retinoic acid, and, like pleiotrophin, is a heparin-binding growth/differentiation factor that acts on fibroblasts and nerve cells.
Protein Domain
Name: K02A2.6-like peptidase catalytic domain
Type: Domain
Description: This entry includes a retropepsin-like domain of invertebrate retrotransposons with long terminal repeats [ ]. Although none of the proteins have been characterised, the retropepsin-like domain is presumed to be an aspartate endopeptidase. While fungal and mammalian pepsins are bilobal proteins with structurally related N- and C-termini, retropepsins are half as long as their fungal and mammalian counterparts []. The monomers are structurally related to one lobe of the pepsin molecule and retropepsins function as homodimers. The active site aspartate occurs within a motif (Asp-Thr/Ser-Gly), as it does in pepsin. In aspartate endopeptidases, Asp residues are ligands of an activated water molecule in all examples where catalytic residues have been identified. The retrotransposon aspartyl endopeptidase is synthesized as part of a polyprotein that also contains a reverse transcriptase and an integrase. The polyprotein is presumed to undergo specific enzymatic cleavage to yield the mature proteins. This group of aspartate endopeptidases is classified by MEROPS as peptidase family A28 [, ].
Protein Domain
Name: Major capsid L1 (late) superfamily, Papillomavirus
Type: Homologous_superfamily
Description: This entry represents the major late capsid protein L1 from Papillomaviruses, such as Human papillomavirus (HPV) [ ]. Papillomaviruses are members of the papovavirus superfamily. More than 70 different types of papillomavirus have been discovered in humans, some of which have been shown to cause genital carcinomas and cutaneous warts []. The viruses contain a circular dsDNA genome surrounded by a non-enveloped icosahedral capsid. Two proteins are involved in capsid formation: a major (L1) and a minor (L2) protein, in an approximate proportion of 95:5. L1 forms a pentameric assembly unit of the viral shell in a manner that closely resembles VP1 from polyomaviruses. Intermolecular disulphide bonding holds the L1 capsid proteins together []. L1 capsid proteins can bind via their nuclear localisation signal (NLS) to karyopherins Kapbeta(2) and Kapbeta(3) and inhibit the Kapbeta(2) and Kapbeta(3) nuclear import pathways during the productive phase of the viral life cycle []. Surface loops on L1 pentamers contain sites of sequence variation between HPV types.
Protein Domain
Name: Bacteriophage T4, Gp32, single-stranded DNA-binding
Type: Family
Description: Single-stranded DNA-binding protein (also known as Gp32 or SSB) is essential for bacteriophage T4 DNA replication, recombination and repair, acting to stimulate replisome processing and accuracy through its binding to ssDNA as the replication fork advances [ ]. The crystal structure of Gp32 shows an ssDNA binding cleft comprised of regions from three structural subdomains, through which ssDNA can slide freely []. The structure of Gp32 is similar to other phage ssDNA-binding proteins such as Gp2.5 from bacteriophage T7, and gene V protein, both of which have a nucleic acid-binding OB-type fold. However, Gp32 contains a zinc-finger subdomain at residues 63-111 that is not found in the other two phage proteins. This protein stimulates the activities of viral DNA polymerase and DnaB-like SF4 replicative helicase, probably via its interaction with the helicase assembly factor [ ], and together with DnaB-like SF4 replicative helicase and the helicase assembly factor, promotes pairing of two homologous DNA molecules containing complementary single-stranded regions and mediates homologous DNA strand exchange [].
Protein Domain
Name: HflC
Type: Family
Description: This model characterizes proteins similar to prokaryotic HflC (High frequency of lysogenization C). Although many members of the SPFH (or band 7) superfamily are lipid raft associated, prokaryote plasma membranes lack cholesterol and are unlikely to have lipid raft domains. Individual proteins of this SPFH domain superfamily may cluster to form membrane microdomains which may in turn recruit multiprotein complexes [ ]. Escherichia coli HflC is an integral membrane protein which may localize to the plasma membrane. HflC associates with another SPFH superfamily member (HflK) to form an HflKC complex []. HflKC interacts with FtsH in a large complex termed the FtsH holo-enzyme []. FtsH is an AAA ATP-dependent protease which exerts progressive proteolysis against membrane-embedded and soluble substrate proteins []. HflKC can modulate the activity of FtsH []. HflC is considered to be a protease inhibitor and is a member of inhibitor family I87 in the MEROPS database. HflKC plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection [].
Protein Domain
Name: CD20-like family
Type: Family
Description: This entry includes B-lymphocyte antigen CD20 and other members of the membrane-spanning 4-domains subfamily A (MS4A), and closely related TMEM176 proteins [ , ]. It also includes sarcospan and some uncharacterised proteins. The MS4A family includes the B-cell-specific antigen CD20, hematopoietic-cell-specific protein HTm4, high affinity IgE receptor beta chain (FceRIbeta), and related proteins [ ]. Members of this family have four putative transmembrane segments and are predominantly expressed in hematopoietic cells, with important roles in immunity []. Sarcospan is a transmembrane component of the dystrophin-glycoprotein complex (DGC), a complex that spans the muscle plasma membrane and forms a link between the F-actin cytoskeleton and the extracellular matrix. Sarcospan preferentially associates with the sarcoglycan subcomplex of the DGC. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton [ ].
Protein Domain
Name: bMERB domain
Type: Domain
Description: A variety of different effector proteins interact specifically with GTP-bound Rab proteins and mediate their versatile roles in membrane trafficking, including budding of vesicles from a donor membrane, directed transport through the cell and finally tethering and fusion with a target membrane. The "bivalent Mical/EHBP Rab binding" (bMERB) domain is a Rab effector domain that is present in proteins of the Mical and EHBP families, both known to act in endosomal trafficking. The bMERB domain displays a preference for Rab8 family proteins (Rab8, 10, 13 and 15) and at least some of the bMERB domains contain two separate binding sites for Rab proteins, allowing Micals and EHBPs to bind two Rabs simultaneously. The strong similarity between the two binding sites within one bMERB domain strongly suggests an evolutionary development via duplication of a common ancestor supersecondary structure element []. The bMERB domain has a completely α-helical fold consisting of a central helix and N- and C-terminal helices folding back on this central helix [].
Protein Domain
Name: Hypersensitivity response secretion-like, HrpJ
Type: Domain
Description: This entry represents a conserved region approximately 200 residues long within the bacterial hypersensitivity response secretion protein HrpJ and similar proteins. HrpJ forms part of a type III secretion system through which, in phytopathogenic bacterial species, virulence factors are thought to be delivered to plant cells []. This entry also includes the InvE invasion protein from Salmonella. This protein is involved in host-parasite interactions and mutations in the InvE gene render Salmonella typhimurium non-invasive [ ]. InvE S. typhimurium mutants fail to elicit a rapid Ca2+ increase in cultured cells, an important event in the infection procedure and internalisation of S. typhimurium into epithelial cells []. This family includes bacterial SepL and SsaL proteins. SepL plays an essential role in the infection process of enterohemorrhagic Escherichia coli and is thought to be responsible for the secretion of EspA, EspD, and EspB []. SsaL of Salmonella typhimurium is thought to be a component of the type III secretion system [].
Protein Domain
Name: RND efflux pump, membrane fusion protein, barrel-sandwich domain
Type: Domain
Description: CusB can be divided into four different domains. The first three domains of the protein are mostly β-strands. However, the fourth domain is all α-helical and is folded into a three-helix bundle [ ]. This entry represents the combined domains 2 and 3 of the membrane fusion proteins CusB and HlyD, which together form a barrel-sandwich. CusB and HlyD proteins are membrane fusion proteins of the CusCFBA copper efflux system in E. coli and related bacteria. The whole molecule hinges between D2 and D3. Efflux systems of the resistance-nodulation-division (RND) group have evolved to excrete poisonous metal ions, and in E. coli the only one that deals with silver and copper is the CusA transporter. The transporter CusA works in conjunction with a periplasmic component that is a membrane fusion protein, e.g. CusB, and an outer-membrane channel component CusC in a CusABC complex driven by import of protons [ ].
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom