Search results 12701 to 12800 out of 30763 for seed protein

Category restricted to ProteinDomain (x)

Category: ProteinDomain
Type Details Score
Protein Domain
Name: MPP7, SH3 domain
Type: Domain
Description: MPP7 is a scaffolding protein that binds to DLG1 and promotes tight junction formation and epithelial cell polarity. Mutations in the MPP7 gene may be associated with the pathogenesis of diabetes and extreme bone mineral density. It is one of seven vertebrate homologues of the Drosophila Stardust protein, which is required in establishing cell polarity, and it contains two L27 domains followed by the core of three domains characteristic of MAGUK (membrane-associated guanylate kinase) proteins: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. This entry represents the SH3 domain of MPP7.
Protein Domain
Name: SOCS4, SOCS box domain
Type: Domain
Description: This entry represents the SOCS box domain of SOCS4. SOCS family proteins form part of a classical negative feedback system that regulates cytokine signal transduction. SOCS4 (Suppressor of cytokine signalling 4) is the substrate-recognition component of an SCF-like ECS (Elongin BC-CUL2/5-SOCS-box protein) E3 ubiquitin-protein ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins. SOCS4 and SOCS5 regulate epidermal growth factor receptor signalling. The general function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions.
Protein Domain
Name: Lipovitellin-phosvitin complex, superhelical domain
Type: Homologous_superfamily
Description: Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients in the yolk of egg-laying animals. Vitellogenin is a large multidomain protein. In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1 (LV1), phosvitin (PV), lipovitellin-2 (LV2), and a von Willebrand factor type D domain (YGP40). The superhelical domain occurs at the N-terminal chain of lipovitellin-1 (LV1N). This domain is multihelical, forming two curved, alpha/alpha layers with an overall right-handed superhelix fold. This domain also occurs in other proteins, such as microsomal triglyceride transfer proteins, responsible for the assembly of low-density lipoproteins.
Protein Domain
Name: 4Fe4S-binding SPASM domain
Type: Domain
Description: This domain contains regions binding additional 4Fe4S clusters found in various radical SAM proteins C-terminal to the domain described by . Radical SAM enzymes with this domain tend to be involved in protein modification, including anaerobic sulphatase maturation proteins, a quinohemoprotein amine dehydrogenase biogenesis protein, the Pep1357-cyclizing radical SAM enzyme, and various bacteriocin biosynthesis proteins. SkfB from Bacillus subtilis, which catalyses the formation of the thioether bond required for production of the sporulation killing factor SkfA, also contains this domain. The motif CxxCxxxxxCxxxC is nearly invariant for members of this family, although PqqE has a variant form. This domain has been named SPASM for Subtilosin, PQQ, Anaerobic Sulphatase, and Mycofactocin.
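The CxxCxxxxxCxxxC motif mentioned in this entry can be read as a simple sequence pattern in which each x stands for any residue. The following is a minimal sketch of how one might scan a protein sequence for it with a regular expression. The function name `find_spasm_motif` and the example sequence are hypothetical, and a plain pattern match of this kind is only an illustration: it is not how the database assigns the domain, and it will not detect variant forms such as the one reported for PqqE.

```python
import re

# CxxCxxxxxCxxxC written as a regex: "x" is any uppercase residue letter.
# The lookahead (?=...) lets overlapping occurrences be reported as well.
_MOTIF = re.compile(r"(?=C[A-Z]{2}C[A-Z]{5}C[A-Z]{3}C)")


def find_spasm_motif(sequence):
    """Return the 0-based start position of every CxxCxxxxxCxxxC match."""
    return [m.start() for m in _MOTIF.finditer(sequence.upper())]


# Toy sequence (not a real protein): the motif starts at index 2.
toy_sequence = "MKCAACAAAAACAAACGG"
positions = find_spasm_motif(toy_sequence)
```

A hit from this scan only says that the cysteine spacing is present; whether the region actually binds a 4Fe4S cluster needs the kind of evidence the entry describes.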
Protein Domain
Name: Synaptobrevin-like
Type: Family
Description: Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the protein-protein interactions underlying this is still uncertain. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.
Protein Domain
Name: Reovirus, outer capsid sigma-3 domain superfamily
Type: Homologous_superfamily
Description: Reoviruses are double-stranded RNA viruses that lack a membrane envelope. Their capsid is organised in two concentric icosahedral layers: an inner core and an outer capsid layer. The outer capsid is made up of the major proteins mu1 and sigma3, and the minor protein sigma1. The inner core structure is composed of the major core proteins lambda1 and sigma2, core spike protein lambda2, and minor core proteins lambda3 and mu2. The inner core encases the 10 segments of double-stranded RNA (dsRNA) which comprise the genome. The structure of the Reovirus outer capsid protein sigma3 reveals a two-lobed structure organised around a long central helix. The smaller of the two lobes includes a CCHC zinc-binding site.
Protein Domain
Name: Flavivirus glycoprotein central and dimerisation domain
Type: Domain
Description: Flaviviruses and alphaviruses are arthropod-borne, enveloped RNA viruses that cause infections in vertebrate hosts. These viruses have two envelope glycoproteins (also known as 'spike' glycoproteins), one that undergoes proteolytic cleavage to prime the virus (glycoproteins M and P62 in flaviviruses and alphaviruses, respectively), and the other to mediate receptor binding and fusion (glycoproteins E and E1 in flaviviruses and alphaviruses, respectively). Glycoprotein E/E1 comprises three domains: domain I (dimerisation domain) is a β-barrel, domain II (central domain) is an elongated β-stranded and α-helical domain, and domain III (immunoglobulin-like domain) is an IgC-like β-sandwich. This entry represents the intertwined central and dimerisation domains found in flaviviral glycoprotein E.
Protein Domain
Name: Inosine triphosphate pyrophosphatase-like
Type: Homologous_superfamily
Description: This entry represents the inosine triphosphate pyrophosphatase (ITPase)-like proteins. Members in this family include NTPases, such as Ham1-like protein (non-canonical purine NTP pyrophosphatase) and non-canonical purine NTP phosphatase (YjjX-like), Maf-like proteins and PRRC1. Non-canonical purine NTP phosphatases are prokaryotic phosphatases that hydrolyse non-canonical purine nucleotides such as XTP and ITP to their respective diphosphate derivatives. Ham1 is a pyrophosphatase that hydrolyses the non-canonical purine nucleotides inosine triphosphate (ITP), deoxyinosine triphosphate (dITP) as well as 2'-deoxy-N-6-hydroxylaminopurine triphosphate (dHAPTP) and 5-bromodeoxyuridine 5'-triphosphate (BrdUTP) to their respective monophosphate derivatives. Maf proteins exhibit nucleotide pyrophosphatase activity against canonical and modified nucleotides. Proline-rich and coiled-coil-containing protein 1 (PRRC1) is a Golgi-associated protein of unknown function.
Protein Domain
Name: Prokaryotic E2 family B domain
Type: Domain
Description: This entry represents a domain found in the E2/UBC superfamily of proteins found in several bacteria. The active site residues are similar to the eukaryotic E2 proteins but lack the conserved asparagine. Members of this family are usually fused to an E1 domain at the C terminus. The protein is usually in the gene neighbourhood of a gene encoding a member of the pol-beta nucleotidyltransferase superfamily. Many of the operons in this family are in ICE-like mobile elements and plasmids. Proteins containing this domain include CD-NTase-associated protein 2 (Cap2) from Escherichia coli. Cap2 is associated with CD-NTase protein, which synthesizes cyclic nucleotides in response to infection; these serve as specific second messenger signals.
Protein Domain
Name: CD34/Podocalyxin
Type: Family
Description: This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family also contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighbouring foot processes in the glomerular epithelium by charge repulsion.
Protein Domain
Name: RNA methyltransferase TrmA, active site
Type: Active_site
Description: In Escherichia coli, the trmA protein is a tRNA (uracil-5-)-methyltransferase that catalyses the S-adenosylmethionine-dependent methylation of U54 in all tRNAs. Orthologues of trmA are found in many eubacterial species. A number of uncharacterised homologues of trmA have been found:
  • Escherichia coli hypothetical protein ygcA and HI0333, the corresponding Haemophilus influenzae protein.
  • Haemophilus influenzae hypothetical protein HI0958.
  • Chlamydia trachomatis protein HOM1.
  • Fission yeast hypothetical protein SpAC4G8.07c.
It is probable that ygcA/HI0333 and HI0958 are responsible for the methylation of U747 and U1939 in 23S rRNA. In trmA, a cysteine is known to participate in the catalytic mechanism by forming a covalent adduct to C6 of uracil. This entry represents a conserved region that includes the active site cysteine.
Protein Domain
Name: RNA methyltransferase TrmA, conserved site
Type: Conserved_site
Description: In Escherichia coli, the trmA protein is a tRNA (uracil-5-)-methyltransferase that catalyses the S-adenosylmethionine-dependent methylation of U54 in all tRNAs. Orthologues of trmA are found in many eubacterial species. A number of uncharacterised homologues of trmA have been found:
  • Escherichia coli hypothetical protein ygcA and HI0333, the corresponding Haemophilus influenzae protein.
  • Haemophilus influenzae hypothetical protein HI0958.
  • Chlamydia trachomatis protein HOM1.
  • Fission yeast hypothetical protein SpAC4G8.07c.
It is probable that ygcA/HI0333 and HI0958 are responsible for the methylation of U747 and U1939 in 23S rRNA. In trmA, a cysteine is known to participate in the catalytic mechanism by forming a covalent adduct to C6 of uracil. This entry represents a conserved site located at the C-terminal end. It contains a conserved histidine.
Protein Domain
Name: FAM69, N-terminal
Type: Domain
Description: The FAM69 family of cysteine-rich type II transmembrane proteins localise to the endoplasmic reticulum (ER) in cultured cells, probably via N-terminal di-arginine motifs. These proteins carry at least 14 luminal cysteines which are conserved in all FAM69s. There are currently few indications of the involvement of FAM69 members in human diseases. FAM69 proteins are predicted to have a protein kinase structure and function. Analysis of three-dimensional structure models and conservation of the classic catalytic motifs of protein kinases in four of the human FAM69 proteins suggests they might have retained catalytic phosphotransferase activity. An EF-hand Ca2+-binding domain, inserted within the structure of the kinase domain, suggests they function as Ca2+-dependent kinases.
Protein Domain
Name: Peroxysomal long chain fatty acyl transporter
Type: Family
Description: Eukaryotic ABC transporters are found either as complete transporters or as half transporters, which dimerise to form an active transporter. Four half ABC transporter proteins have been identified in the human peroxisome membrane: the 70kDa peroxisomal membrane protein (PMP70), the adrenoleukodystrophy protein (ALDP), the PMP70-related protein (P70R) and the ALDP-related protein (ALDR). This entry includes PMP70 (renamed as ATP-binding cassette sub-family D member 3, ABCD3) and ALDP (ATP-binding cassette sub-family D member 1, ABCD1) from human, and yeast peroxisomal long-chain fatty acid import proteins PAT1 (PXA2) and PAT2 (PXA1). The members of this family are involved in the import of activated long-chain fatty acids from the cytosol to the peroxisomal matrix.
Protein Domain
Name: Globin-like superfamily
Type: Homologous_superfamily
Description: Globins are haem-containing proteins involved in binding and/or transporting oxygen. They belong to a very large and well studied family that is widely distributed in many organisms. Globins have evolved from a common ancestor and can be divided into three groups: single-domain globins, and two types of chimeric globins, flavohaemoglobins and globin-coupled sensors. Bacteria have all three types of globins, while archaea lack flavohaemoglobins, and eukaryotes lack globin-coupled sensors. Several functionally different haemoglobins can coexist in the same species. The major types of globins include:
  • Haemoglobin (Hb): tetramer of two alpha and two beta chains, although embryonic and foetal forms can substitute the alpha or beta chain for ones with higher oxygen affinity, such as gamma, delta, epsilon or zeta chains. Hb transports oxygen from lungs to other tissues in vertebrates. Hb proteins are also present in unicellular organisms where they act as enzymes or sensors.
  • Myoglobin (Mb): monomeric protein responsible for oxygen storage in vertebrate muscle.
  • Neuroglobin: a myoglobin-like haemoprotein expressed in vertebrate brain and retina, where it is involved in neuroprotection from damage due to hypoxia or ischemia. Neuroglobin belongs to a branch of the globin family that diverged early in evolution.
  • Cytoglobin: an oxygen sensor expressed in multiple tissues. Related to neuroglobin.
  • Erythrocruorin: highly cooperative extracellular respiratory proteins found in annelids and arthropods that are assembled from as many as 180 subunits into hexagonal bilayers.
  • Leghaemoglobin (legHb or symbiotic Hb): occurs in the root nodules of leguminous plants, where it facilitates the diffusion of oxygen to symbiotic bacteroids in order to promote nitrogen fixation.
  • Non-symbiotic haemoglobin (NsHb): occurs in non-leguminous plants, and can be over-expressed in stressed plants.
  • Flavohaemoglobins (FHb): chimeric, with an N-terminal globin domain and a C-terminal ferredoxin reductase-like NAD/FAD-binding domain. FHb provides protection against nitric oxide via its C-terminal domain, which transfers electrons to haem in the globin.
  • Globin-coupled sensors: chimeric, with an N-terminal myoglobin-like domain and a C-terminal domain that resembles the cytoplasmic signalling domain of bacterial chemoreceptors. They bind oxygen, and act to initiate an aerotactic response or regulate gene expression.
  • Protoglobin: a single domain globin found in archaea that is related to the N-terminal domain of globin-coupled sensors.
  • Truncated 2/2 globin: lack the first helix, giving them a 2-over-2 instead of the canonical 3-over-3 α-helical sandwich fold. Can be divided into three main groups (I, II and III) based on structural features.
This superfamily represents proteins with a globin-like fold consisting of six helices in a partly opened, folded leaf topology, as well as the truncated globins that lack the initial helix. This includes both the globins themselves, and the phycocyanin-like phycobilisome proteins (phycocyanin, allophycocyanin, phycoerythrin and phycoerythrocyanin). Phycobilisome proteins are oligomers of two different types of globin-like subunits that contain two extra helices at the N terminus, and which are used to bind a bilin chromophore. They occur in red algae and cyanobacteria, where they are used for light-harvesting.
Protein Domain
Name: Aminomethyltransferase, folate-binding domain
Type: Domain
Description: This domain is found at the N terminus of glycine cleavage T-proteins, which are part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein (aminomethyltransferase) is a folate-dependent enzyme that catalyses the release of ammonia and the transfer of the methylene carbon unit (C1 unit) to tetrahydrofolate (H4folate) from the aminomethyl intermediate attached to the lipoate cofactor of H-protein. This domain is also found in YgfZ proteins. YgfZ in E. coli is a folate-binding protein involved in RNA modification and regulation of chromosomal replication initiation. YgfZ is not an aminomethyltransferase but is likely a folate-dependent regulatory protein. This domain could represent a folate-binding domain.
Protein Domain
Name: WD40/YVTN repeat-like-containing domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a WD40/YVTN repeat-like domain. Both the WD40 and the YVTN repeated motifs consist of about 40 residues, and although they consist of distinct sequences, they do share a similar structure. Structurally, both the WD40 and the YVTN repeated motifs form seven-bladed propellers (although some members can contain eight blades), which consist of seven 4-stranded β-sheets. The WD40-type repeat domain is found in the beta-1 subunit of the signal-transducing G protein, in yeast Tup1 protein, in Groucho, in the yeast cell cycle protein Cdc4 and in actin-interacting protein 1. The YVTN-type repeat domain is found in archaeal surface layer proteins (SLPs) that protect cells from extreme environments, in quinohemoprotein amine dehydrogenase (QHNDH), and in methylamine dehydrogenase.
Protein Domain
Name: EF-hand domain pair
Type: Homologous_superfamily
Description: This domain superfamily consists of a duplication of two EF-hand units, where each unit is composed of two helices connected by a twelve-residue calcium-binding loop. The calcium ion in the EF-hand loop is coordinated in a pentagonal bipyramidal configuration. Many calcium-binding proteins contain an EF-hand type calcium-binding domain. These include: calbindin D9K, S100 proteins such as calcyclin, polcalcin phl p 7 (a calcium-binding pollen allergen), osteonectin, parvalbumin, calmodulin family of proteins (troponin C, caltractin, cdc4p, myosin essential chain, calcineurin, recoverin, neurocalcin), plasmodial-specific CaII-binding protein Cbp40, penta-EF-Hand proteins (sorcin, grancalcin, calpain), as well as multidomain proteins such as phosphoinositide-specific phospholipase C, dystrophin, Cb1 and alpha-actinin. The fold consists of four helices and an open array of two hairpins.
Protein Domain
Name: Transcription regulator IclR, C-terminal
Type: Domain
Description: Many bacterial transcription regulation proteins which bind DNA through a 'helix-turn-helix' motif can be classified into subfamilies on the basis of sequence similarities. One of these subfamilies, called 'iclR', groups several proteins including:
  • gylR, a possible activator protein for the gylABX glycerol operon in Streptomyces.
  • iclR, the repressor of the acetate operon (also known as glyoxylate bypass operon) in Escherichia coli and Salmonella typhimurium.
These proteins have a Helix-Turn-Helix motif at the N terminus that is similar to that of other DNA-binding proteins. This entry represents their C-terminal domain, which shares protein structural similarity with the GAF-like domain. This domain binds to regulatory substrates including glyoxylate, allantoin/ate, indole, aromatic hydrocarbons, sugar acids, succinic semialdehyde, benzoate derivatives, ascorbic acid, glycerol-3-phosphate, glyceraldehyde-3-phosphate and pyruvate.
Protein Domain
Name: Succinate dehydrogenase, cytochrome b subunit, conserved site
Type: Conserved_site
Description: Succinate dehydrogenase (SDH) is a membrane-bound complex of two main components: a membrane-extrinsic component composed of an FAD-binding flavoprotein and an iron-sulphur protein, and a hydrophobic component composed of a cytochrome b and a membrane anchor protein. The cytochrome b component is a mono-haem transmembrane protein belonging to a family that includes:
  • Cytochrome b-556 from bacterial SDH (gene sdhC).
  • Cytochrome b560 from the mammalian mitochondrial SDH complex, which is encoded in the mitochondrial genome of some algae and in the plant Marchantia polymorpha.
  • Cytochrome b from yeast mitochondrial SDH complex (gene SDH3 or CYB3).
  • Protein cyt-1 from Caenorhabditis elegans.
These cytochromes are proteins of about 130 residues that comprise three transmembrane regions. There are two conserved histidines which may be involved in binding the haem group.
Protein Domain
Name: EF-G binding protein, N-terminal domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a domain found in the N terminus of the FusB, FusC, and FusD proteins from Staphylococcus aureus. They are elongation factor G (EF-G) binding proteins that are linked to fusidic acid (FA) resistance in S. aureus. The FusB proteins are two-domain metalloproteins, and this N-terminal domain forms a four-helical bundle whose helices help to stabilise the conformation of the treble-clef zinc-finger in the C-terminal domain. FA is an antibiotic that binds to EF-G, preventing its release from the ribosome, thus stalling bacterial protein synthesis. The FusB proteins provide FA resistance by preventing formation or facilitating dissociation of the FA-locked EF-G-ribosome complex during elongation and ribosome recycling.
Protein Domain
Name: FtsK domain
Type: Domain
Description: The FtsK domain is a hydrophilic domain of about 200 residues, which is found in:
  • Bacterial cell division protein ftsK (known as sporulation protein SpoIIIE in Bacillus subtilis).
  • A set of conjugative plasmid- and conjugative transposon-encoded proteins, generally called Tra proteins. These proteins come from an extremely wide range of species, including Gram-positive and Gram-negative bacilli and cocci, Streptomyces species, Agrobacterium spp., and archaebacteria. In cases in which a function is known, the protein is required for intercellular DNA transfer.
The FtsK domain contains a highly conserved putative ATP-binding P-loop motif and is assumed to be cytoplasmic. It can be found in one to three copies and is thought to be involved in DNA translocation by coupling ATP hydrolysis to movement relative to the long axis of DNA.
Protein Domain
Name: DUF34/NIF3, animal
Type: Family
Description: This entry represents DUF34/metal-binding proteins (also referred to as NIF3-like protein 1) from animals. They share protein sequence similarity with budding yeast NIF3, which interacts with the yeast transcriptional coactivator Ngg1p that is part of the ADA complex. This entry includes the DUF34/metal-binding protein/NIF3 proteins, which are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2. More recently, a comprehensive literature review and integrative bioinformatic analyses revealed that annotations for these members are misleading, as they were based on a single set of in vitro results examining the NIF3 homologue of Helicobacter pylori. In fact, they have varied phenotypes with a unifying functional role as metal-binding proteins.
Protein Domain
Name: DAZ/BOULE, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) domain of Deleted in AZoospermia (DAZ) homologues from invertebrates. Invertebrates contain a single DAZ homologue, known as BOULE in Drosophila, while vertebrates, other than catarrhine primates, possess both BOULE and DAZL (DAZ-like) genes. The catarrhine primates possess BOULE, DAZL, and DAZ genes. The family members encode closely related RNA-binding proteins that are required for fertility in numerous organisms. These proteins contain an RNA recognition motif (RRM) and a varying number of copies of a DAZ motif, believed to mediate protein-protein interactions. DAZL and BOULE contain a single copy of the DAZ motif, while DAZ proteins can contain 8-24 copies of this repeat. DAZ proteins are involved in gametogenesis.
Protein Domain
Name: Octanoyltransferase LipL
Type: Family
Description: Lipoic acid is an organosulphur compound that is an essential coenzyme in several multienzyme complexes, including pyruvate dehydrogenase, 2-oxoglutarate dehydrogenase and the glycine cleavage system. Many Gram-positive bacteria, such as Bacillus subtilis, require three proteins for lipoic acid cofactor biosynthesis: LipJ, LipL and LipM. LipM is a lipoate:protein ligase that transfers an octanoyl moiety from acyl-carrier protein to the GcvH protein of the glycine cleavage system. LipL, an octanoyltransferase, then transfers this moiety from GcvH to other enzyme complexes. LipA inserts the sulphur group to form the active lipoate cofactor. This entry represents the LipL component of this system. It includes Lipoyl-[GcvH]:protein N-lipoyltransferase from Listeria monocytogenes serovar 1/2a and Octanoyl-[GcvH]:protein N-octanoyltransferase from Bacillus subtilis.
Protein Domain
Name: Lamprin
Type: Family
Description: This family consists of several lamprin proteins from Petromyzon marinus (sea lamprey). Lamprin, an insoluble non-collagen, non-elastin protein, is the major connective tissue component of the fibrillar extracellular matrix of lamprey annular cartilage. Although not generally homologous to any other protein, soluble lamprins contain a tandemly repeated peptide sequence (GGLGY), which is present in both silkmoth chorion proteins and spider dragline silk. Strong homologies to this repeat sequence are also present in several mammalian and avian elastins. It is thought that these proteins share a structural motif which promotes self-aggregation and fibril formation in proteins through interdigitation of hydrophobic side chains in β-sheet/β-turn structures, a motif that has been preserved in recognisable form over several hundred million years of evolution.
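The tandemly repeated GGLGY peptide described in this entry lends itself to a small illustration. The sketch below locates runs of back-to-back copies of a repeat unit in a sequence string. The function name `tandem_repeats`, its `min_copies` parameter, and the example sequence are assumptions made for illustration; it matches exact copies only, so it would miss the diverged repeats that the entry says occur in elastins.

```python
import re


def tandem_repeats(sequence, unit="GGLGY", min_copies=2):
    """Find runs of exact, back-to-back copies of `unit`.

    Returns a list of (start, copies) tuples, where `start` is the 0-based
    position of the run and `copies` is the number of consecutive units.
    """
    # (?:UNIT){n,} greedily matches n or more consecutive copies of the unit.
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [
        (m.start(), len(m.group()) // len(unit))
        for m in pattern.finditer(sequence.upper())
    ]


# Toy sequence (not a real lamprin): three copies of GGLGY starting at index 2.
toy_sequence = "AAGGLGYGGLGYGGLGYTT"
runs = tandem_repeats(toy_sequence)
```

Requiring at least two copies by default keeps isolated chance occurrences of the pentapeptide out of the result; setting `min_copies=1` reports those as well.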
Protein Domain
Name: Elongation factor G-binding protein, N-terminal
Type: Domain
Description: This domain can be found in the N terminus of the FusB, FusC, and FusD proteins from Staphylococcus aureus. They are elongation factor G (EF-G) binding proteins that are linked to fusidic acid (FA) resistance in S. aureus. The FusB proteins are two-domain metalloproteins, and this N-terminal domain forms a four-helical bundle whose helices help to stabilise the conformation of the treble-clef zinc-finger in the C-terminal domain. FA is an antibiotic that binds to EF-G, preventing its release from the ribosome, thus stalling bacterial protein synthesis. The FusB proteins provide FA resistance by preventing formation or facilitating dissociation of the FA-locked EF-G-ribosome complex during elongation and ribosome recycling.
Protein Domain
Name: DSC E3 ubiquitin ligase complex subunit 3, ubiquitin-like domain
Type: Domain
Description: This is the N-terminal ubiquitin-like domain of DSC E3 ubiquitin ligase complex subunit 3 (Dsc3). Dsc3 is a component of the DSC E3 ubiquitin ligase complex (a Golgi-specific protein ubiquitination system) that functions in protein homeostasis under non-stress conditions, playing a role in protein quality control through the endosome and Golgi-associated degradation (EGAD) pathway, which targets membrane proteins at the Golgi and endosomes for degradation by cytosolic proteasomes. Dsc3 is also involved in endocytic protein trafficking. The yeast DSC E3 ubiquitin ligase complex is the homologue of the Hrd1 E3 ligase complex from mammals, in which Dsc1, Dsc2 and Dsc3 correspond to Hrd1, Der1, and Usa1, respectively. Dsc3 is a Herp-like protein that acts as a bridge between Dsc1 and Dsc2 for their interaction.
Protein Domain
Name: RNA-directed RNA polymerase, thumb domain, Flavivirus
Type: Domain
Description: Flaviviruses produce a large polyprotein from the ssRNA genome, encoding structural proteins required for virus assembly and non-structural (NS1-5) proteins involved in replication of the viral genome. This polyprotein is cleaved by viral and cellular proteases to produce mature viral proteins. NS5 is the largest mature viral protein and contains an N-terminal methyltransferase (MTase) domain separated by a short linker from the C-terminal RNA-directed RNA polymerase domain (RdRp) that adopts a characteristic right-handed fingers-palm-thumb fold and possesses a number of short regions and motifs homologous to other RdRps. This entry represents the thumb domain of NS5 RdRp. NS5 binds to the stem loop A (SLA) at the 5' extremity of the Flavivirus genome and regulates translation of the viral genome.
Protein Domain
Name: Enkurin domain
Type: Domain
Description: The transient receptor potential-canonical (TRPC) family of cation channels has been implicated in receptor- or phospholipase C (PLC)-mediated Ca(2+) entry into animal cells. Enkurin (derived from the Greek verb enkuros: to trip or to stumble upon) interacts with several TRPC proteins and colocalises with these channels in sperm. Three protein-protein interaction domains were identified in enkurin: a C-terminal region is essential for channel interaction; an IQ motif binds the Ca(2+) sensor, calmodulin, in a Ca(2+)-dependent manner; and a proline-rich N-terminal region contains predicted ligand sequences for SH3 domain proteins, including the SH3 domain of the p95 regulatory subunit of 1-phosphatidylinositol-3-kinase. Enkurin thus has the anticipated characteristics of a scaffold protein that tethers cargo, such as SH3 domain proteins, to a subset of TRPC channels.
Protein Domain
Name: Lipid transport protein, beta-sheet shell
Type: Homologous_superfamily
Description: This superfamily represents β-sheet shell domains found in lipid transport proteins such as vitellinogen. Vitellinogen precursors provide the major egg yolk proteins that are a source of nutrients during early development of oviparous vertebrates and invertebrates. Vitellinogen precursors are multi-domain apolipoproteins that are cleaved into distinct yolk proteins. Different vitellinogen precursors exist, which are composed of variable combinations of yolk protein components; however, the cleavage sites are conserved. In vertebrates, a complete vitellinogen is composed of an N-terminal signal peptide for export, followed by four regions that can be cleaved into yolk proteins: lipovitellin-1, phosvitin, lipovitellin-2, and a von Willebrand factor type D domain (YGP40). Domains with a β-sheet shell topology can be found within both lipovitellin-1 and lipovitellin-2 peptide products.
Protein Domain
Name: Shank1, SH3 domain
Type: Domain
Description: Shank1, also called SSTRIP (Somatostatin receptor-interacting protein), is a brain-specific protein that plays a role in the construction of postsynaptic density (PSD) and the maturation of dendritic spines. Mice deficient in Shank1 show altered PSD composition, thinner PSDs, smaller dendritic spines, and weaker basal synaptic transmission, although synaptic plasticity is normal. They show increased anxiety and impaired fear memory, but also show better spatial learning. A Shank protein carries scaffolding functions through multiple sites of protein-protein interaction in its domain architecture, including ankyrin (ANK) repeats, a long proline-rich region, as well as SH3, PDZ, and SAM domains. This entry represents the SH3 domain of Shank1, which binds GRIP, a scaffold protein that binds AMPA receptors and Eph receptors/ligands.
Protein Domain
Name: Transthyretin-like
Type: Family
Description: Transthyretin-related proteins form a nematode-specific expanded protein family that comprises 59 members, 54 of which are predicted to be secreted. The proteins show weak similarity to transthyretin (formerly called prealbumin), which transports thyroid hormones [ ]. Some transthyretin-related genes are induced in response to environmental challenges such as oxidative stress or pathogen exposure. These proteins possess 4 cysteine residues that are conserved within the C. elegans transthyretin-related protein family, which could serve as reactive oxygen sensors or scavengers []. Transthyretin-related protein 52 (TTR-52), a member of this family, is required for efficient cell corpse engulfment through recognition of surface-exposed phosphatidylserine (PS). TTR-52 mediates recognition of apoptotic cells by cross-linking the PS signal with the phagocyte receptor CED-1 [ ].
Protein Domain
Name: AraR-like, winged helix DNA-binding domain
Type: Domain
Description: This entry represents the C-terminal DNA-binding winged-helix-turn-helix (wHTH) domain of bacterial Nudix hydrolase domain-containing proteins such as AraR from Bacteroides thetaiotaomicron [ ]. This protein is involved in regulating arabinose utilisation []. Winged helix DNA-binding proteins share a related winged helix-turn-helix DNA-binding motif, where the "wings", or loops, are small β-sheets. The winged helix motif consists of two wings (W1, W2), three α-helices (H1, H2, H3) and three β-sheets (S1, S2, S3) arranged in the order H1-S1-H2-H3-S2-W1-S3-W2 [ ]. The DNA-recognition helix makes sequence-specific DNA contacts with the major groove of DNA, while the wings make different DNA contacts, often with the minor groove or the backbone of DNA. Several winged-helix proteins display an exposed patch of hydrophobic residues thought to mediate protein-protein interactions.
Protein Domain
Name: Peptide N glycanase, PAW domain
Type: Domain
Description: The PAW domain (present in PNGases and other worm proteins) is found as a single copy at the C terminus of metazoan peptide:N-glycanase (PNGase) and in multiple copies in hypothetical Caenorhabditis elegans proteins [ ]. The C-terminal PAW domain of PNGase binds to the mannose moieties of N-linked oligosaccharide chains []. The PAW domain is a slightly elongated molecule and displays a β-sandwich architecture, which is composed of two layers, containing nine and eight antiparallel β-strands, respectively, and three additional short helices [ ]. Some proteins known to contain a PAW domain are: animal peptide:N-glycanase (PNGase), which catalyses the deglycosylation of several misfolded N-linked glycoproteins by cleaving the bulky glycan chain before the proteins are degraded by the proteasome; and Caenorhabditis elegans putative uncharacterised protein C17B7.5.
Protein Domain
Name: Major capsid protein, C-terminal
Type: Domain
Description: The entry includes major capsid proteins (vp54 and vp72) found in Iridoviruses, Phycodnaviruses, Asfarviruses and Ascoviruses, which are all type II dsDNA viruses with no RNA stage. This is the most abundant structural protein and can account for up to 45% of virion protein []. The structure of vp54 has been determined from Paramecium bursaria Chlorella virus 1 (PBCV-1), a very large icosahedral virus containing an internal membrane enclosed within a glycoprotein coat. The vp54 protein is a duplication consisting of two domains with a similar fold packed together like the nucleoplasmin subunits. The vp54 protein forms a trimer, where the domains are arranged around a pseudo 6-fold axis. The domains have a β-sandwich structure consisting of 8 strands in two sheets with a jelly-roll topology [].
Protein Domain
Name: Leucine-rich repeat domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a leucine-rich repeat (LRR), right-handed beta-alpha superhelix domain, such as that found in bacterial invasion protein internalin [ ] or in the L domain from members of the epidermal growth-factor receptor (EGFR) family []. Leucine-rich repeats (LRR) consist of 2-45 motifs of 20-30 amino acids in length that generally fold into an arc or horseshoe shape [ ]. LRRs occur in proteins ranging from viruses to eukaryotes, and appear to provide a structural framework for the formation of protein-protein interactions [, ]. Proteins containing LRRs include tyrosine kinase receptors, cell-adhesion molecules, virulence factors, and extracellular matrix-binding glycoproteins, and are involved in a variety of biological processes, including signal transduction, cell adhesion, DNA repair, recombination, transcription, RNA processing, disease resistance, apoptosis, and the immune response [].
Protein Domain
Name: Agouti
Type: Family
Description: The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation [ , ]. This family also includes the Agouti-related protein (Agrp), involved in energy balance, body weight regulation and metabolism. It interacts with melanocortin receptors MC3R, MC4R and MC5R []. This family also includes Toxin Tbo-IT2 from Oblong running crab spider, which contains an inhibitor cystine knot (ICK) fold with a spatial structure and very similar Cys distribution to agouti-signaling proteins (ASIP) and agouti-related proteins (AGRP) [ ].
Protein Domain
Name: Ubiquitin thioesterase Otubain
Type: Family
Description: Otubain family members include OTUB1, OTUB2 from mammals and otubain-like proteins from insects, worms and plants. They are a group of deubiquitylating enzymes that can remove conjugated ubiquitin from proteins and play an important regulatory role at the level of protein turnover by preventing degradation [ ]. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. 
The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases [ ]. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. 
Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: ABC transporter, F420-0 import, ATP-binding protein, predicted
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [ ]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [, , ]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [, , , , , ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [ ]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]. This entry represents a small clade of ABC-type transporter ATP-binding protein components encoded as part of a three gene cassette along with a periplasmic substrate-binding protein ( ) and a permease ( ). 
The organisms containing this cassette are all Actinobacteria and contain numerous proteins requiring the coenzyme F420. The model in this entry was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: ). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase ( ). Based on these observations this ATP-binding protein is predicted to be a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.
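The conserved ABC-module motifs named in the description above can be located in a protein sequence with a simple pattern scan. The following is a minimal Python sketch, not part of the database entry: the LSGGQ signature is taken from the description, while the Walker A consensus G-x(4)-G-K-[ST] is the commonly cited P-loop pattern and is an assumption here, since the entry does not spell it out. The function name `scan_abc_motifs` and the example sequence are hypothetical.

```python
import re

# Motif patterns. "signature" comes from the description (LSGGQ).
# "walker_a" uses the commonly cited P-loop consensus G-x(4)-G-K-[ST];
# this consensus is an assumption, it is not stated in the entry itself.
ABC_MOTIFS = {
    "walker_a": re.compile(r"G[A-Z]{4}GK[ST]"),
    "signature": re.compile(r"LSGGQ"),
}


def scan_abc_motifs(sequence):
    """Return {motif_name: [(start, matched_text), ...]} with 0-based starts."""
    seq = sequence.upper()
    return {
        name: [(m.start(), m.group(0)) for m in pattern.finditer(seq)]
        for name, pattern in ABC_MOTIFS.items()
    }


if __name__ == "__main__":
    # Hypothetical toy sequence, constructed only to contain both motifs.
    toy = "MKGPSGSGKSTLLAALSGGQKQRV"
    for name, hits in scan_abc_motifs(toy).items():
        print(name, hits)
```

Real family assignment relies on profile models rather than short exact patterns, so a scan like this is only a quick sanity check on a candidate sequence.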
Protein Domain
Name: Insulin-like growth factor II E-peptide, C-terminal
Type: Domain
Description: The insulin family of proteins groups together several evolutionarily related active peptides [ ]: these include insulin [, ], relaxin [, ], insect prothoracicotropic hormone (bombyxin) [], insulin-like growth factors (IGF1 and IGF2) [, ], mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP) (gene INSL4), locust insulin-related peptide (LIRP), molluscan insulin-related peptides (MIP), and Caenorhabditis elegans insulin-like peptides. The 3D structures of a number of family members have been determined [, , ]. The fold comprises two polypeptide chains (A and B) linked by two disulphide bonds: all share a conserved arrangement of 4 cysteines in their A chain, the first of which is linked by a disulphide bond to the third, while the second and fourth are linked by interchain disulphide bonds to cysteines in the B chain. Insulin is found in many animals, and is involved in the regulation of normal glucose homeostasis. It also has other specific physiological effects, such as increasing the permeability of cells to monosaccharides, amino acids and fatty acids, and accelerating glycolysis and glycogen synthesis in the liver [ ]. Insulin exerts its effects by interaction with a cell-surface receptor, which may also result in the promotion of cell growth []. Insulin is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. The sequence of proinsulin contains 2 well-conserved regions (designated A and B), separated by an intervening connecting region (C), which is variable between species [ ]. The connecting region is cleaved, liberating the active protein, which contains the A and B chains, held together by 2 disulphide bonds []. Insulin-like Growth Factor Binding Proteins (IGFBP) are a group of vertebrate secreted proteins, which bind to IGF-I and IGF-II with high affinity and modulate the biological actions of IGFs. 
The IGFBP family has six distinct subgroups, IGFBP-1 through 6, based on conservation of gene (intron-exon) organisation, structural similarity, and binding affinity for IGFs. Across species, IGFBP-5 exhibits the most sequence conservation, while IGFBP-6 exhibits the least sequence conservation. The IGFBPs contain inhibitor domain homologues, which are related to MEROPS protease inhibitor family I31 (equistatin, clan IX). All IGFBPs share a common domain architecture ( : ). While the N-terminal ( , IGF binding protein domain), and the C-terminal ( , thyroglobulin type-1 repeat) domains are conserved across vertebrate species, the mid-region is highly variable with respect to protease cleavage sites and phosphorylation and glycosylation sites. IGFBPs contain 16-18 conserved cysteines located in the N-terminal and the C-terminal regions, which form 8-9 disulphide bonds [ ]. As demonstrated for human IGFBP-5, the N terminus is the primary binding site for IGF. This region, comprised of Val49, Tyr50, Pro62 and Lys68-Leu75, forms a hydrophobic patch on the surface of the protein [ ]. The C terminus is also required for high affinity IGF binding, as well as for binding to the extracellular matrix [] and for nuclear translocation [, ] of IGFBP-3 and -5. IGFBPs are unusually pleiotropic molecules. Like other binding proteins, IGFBP can prolong the half-life of IGFs via high affinity binding of the ligands. In addition to functioning as simple carrier proteins, serum IGFBPs also serve to regulate the endocrine and paracrine/autocrine actions of IGF by modulating the IGF available to bind to signalling IGF-I receptors [ , ]. Furthermore, IGFBPs can function as growth modulators independent of IGFs. For example, IGFBP-5 stimulates markers of bone formation in osteoblasts lacking functional IGFs []. 
The binding of IGFBP to its putative receptor on the cell membrane may stimulate the signalling pathway independent of an IGF receptor, to mediate the effects of IGFBPs in certain target cell types. IGFBP-1 and -2, but not other IGFBPs, contain a C-terminal Arg-Gly-Asp integrin-binding motif. Thus, IGFBP-1 can also stimulate cell migration of CHO and human trophoblast cells through an action mediated by alpha 5 beta 1 integrin []. Finally, IGFBPs transported into the nucleus (via the nuclear localisation signal) may also exert IGF-independent effects by transcriptional activation of genes. This domain is the C-terminal domain of insulin-like growth factor II proteins (IGF-2, also see ) in vertebrates and seems to represent the E-peptide [ , ].
Protein Domain
Name: Aconitase, putative
Type: Family
Description: Aconitase (aconitate hydratase; ) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop [ , ]. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3) []. Eukaryotic cAcn enzyme balances the amount of citrate and isocitrate in the cytoplasm, which in turn creates a balance between the amount of NADPH generated from isocitrate by isocitrate dehydrogenase and the amount of acetyl-CoA generated from citrate by citrate lyase. Fatty acid synthesis requires both NADPH and acetyl-CoA, as do other metabolic processes, including the need for NADPH to combat oxidative stress. 
The enzymatic form of cAcn predominates when iron levels are normal, but if they drop sufficiently to cause the disassembly of the [4Fe-4S]-cluster, then cAcn undergoes a conformational change from a compact enzyme to a more open L-shaped protein known as iron regulatory protein 1 (IRP1; or IRE-binding protein 1, IREBP1) [, ]. As IRP1, the catalytic site and the [4Fe-4S]-cluster are lost, and two new RNA-binding sites appear. IRP1 functions in the post-transcriptional regulation of genes involved in iron metabolism - it binds to mRNA iron-responsive elements (IRE), 30-nucleotide stem-loop structures at the 3' or 5' end of specific transcripts. Transcripts containing an IRE include ferritin L and H subunits (iron storage), transferrin (iron plasma chaperone), transferrin receptor (iron uptake into cells), ferroportin (iron exporter), mAcn, succinate dehydrogenase, erythroid aminolevulinic acid synthetase (tetrapyrrole biosynthesis), among others. If the IRE is in the 5'-UTR of the transcript (e.g. in ferritin mRNA), then IRP1-binding prevents its translation by blocking the transcript from binding to the ribosome. If the IRE is in the 3'-UTR of the transcript (e.g. transferrin receptor), then IRP1-binding protects it from endonuclease degradation, thereby prolonging the half-life of the transcript and enabling it to be translated [ ]. IRP2 is another IRE-binding protein that binds to the same transcripts as IRP1. However, since IRP1 is predominantly in the enzymatic cAcn form, it is IRP2 that acts as the major metabolic regulator that maintains iron homeostasis [ ]. Although IRP2 is homologous to IRP1, IRP2 lacks aconitase activity, and is known only to have a single function in the post-transcriptional regulation of iron metabolism genes [ ]. In iron-replete cells, IRP2 activity is regulated primarily by iron-dependent degradation through the ubiquitin-proteasomal system. Bacterial AcnB is also known to be multi-functional. 
In addition to its role in the TCA cycle, AcnB was shown to be a post-transcriptional regulator of gene expression in Escherichia coli and Salmonella enterica [ , ]. In S. enterica, AcnB initiates a regulatory cascade controlling flagella biosynthesis through an interaction with the ftsH transcript, an alternative RNA polymerase sigma factor. This binding lowers the intracellular concentration of FtsH protease, which in turn enhances the amount of RNA polymerase sigma32 factor (normally degraded by FtsH protease), and sigma32 then increases the synthesis of chaperone DnaK, which in turn promotes the synthesis of the flagellar protein FliC. AcnB regulates the synthesis of other proteins as well, such as superoxide dismutase (SodA) and other enzymes involved in oxidative stress. This entry represents a small family of proteins homologous and likely functionally equivalent to aconitase 1. Members are found, so far, in the anaerobe Clostridium acetobutylicum, in the microaerophilic, early-branching bacterium Aquifex aeolicus, and in the halophilic archaeon Halobacterium sp. NRC-1. No member is experimentally characterised.
Protein Domain
Name: Cysteine protease S273R
Type: Family
Description: This entry represents a group of proteins predominantly found in African swine fever virus (ASFV), including Cysteine protease S273R (VPRT). This enzyme catalyses the maturation of the pp220 and pp62 polyprotein precursors into core-shell proteins [ , ]. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. 
They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases []. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. 
This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad.
Protein Domain
Name: Lateral organ boundaries, LOB
Type: Domain
Description: The lateral organ boundaries (LOB) gene is expressed at the adaxial base of initiating lateral organs and encodes a plant-specific protein of unknown function. The N-terminal half of the LOB protein contains a conserved approximately 100-amino acid domain (the LOB domain) that is present in 42 other Arabidopsis thaliana proteins and in proteins from a variety of other plant species. Genes encoding LOB domain (LBD) proteins are expressed in a variety of temporal- and tissue-specific patterns, suggesting that they may function in diverse processes [ ]. The LOB domain contains conserved blocks of amino acids that identify the LBD gene family. In particular, a conserved C-x(2)-C-x(6)-C-x(3)-C motif, which is a defining feature of the LOB domain, is present in all LBD proteins. It is possible that this motif forms a new zinc finger [].
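The C-x(2)-C-x(6)-C-x(3)-C motif quoted above is written in PROSITE-style notation, where x(n) stands for n residues of any type. As a minimal sketch (not part of the database entry), the pattern can be translated into a Python regular expression and used to scan a sequence. The function name `find_lob_motif` and the example sequence are hypothetical and chosen only for illustration; a lookahead is used so that overlapping occurrences would also be reported.

```python
import re

# C-x(2)-C-x(6)-C-x(3)-C translated to a regex: x(n) becomes [A-Z]{n}.
# Wrapping the pattern in a lookahead lets finditer report overlapping hits.
LOB_MOTIF = re.compile(r"(?=(C[A-Z]{2}C[A-Z]{6}C[A-Z]{3}C))")


def find_lob_motif(sequence):
    """Return [(start, matched_text), ...] with 0-based start positions."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in LOB_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, constructed only to contain the motif once.
    toy = "MSSCAACKVLRRKCTQDCVFA"
    print(find_lob_motif(toy))
```

A hit only shows that the cysteine spacing is present; it does not by itself establish that a protein belongs to the LBD family.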
Protein Domain
Name: Mu homology domain
Type: Domain
Description: The mu homology domain (MHD) is an ~280 residue protein-protein interaction module, which is found in endocytotic proteins involved in clathrin-mediated endocytosis [ , , , ]: mu subunits of adaptor protein (AP) complexes AP-1, AP-2, AP-3 and AP-4; proteins of the stonin family; and proteins of the muniscin family (Syp1, FCHO1/2 and SGIP1). The MHD domain has an elongated, banana-shaped, all β-sheet structure. It can be considered as two β-sandwich subdomains (A and B), with subdomain B inserted between strands 6 and 15 of subdomain A, and joined edge to edge such that the convex surface is a continuous nine-stranded mixed β-sheet that runs the whole length of the molecule. The tyrosine-based signal binds to a site on the surface of two parallel β-sheet strands (beta1 and beta16) in subdomain A [, ].
Protein Domain
Name: Stn1, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates and both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex [ ]. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast. Two separate protein complexes are required for chromosome end protection in fission yeast while a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution []. This entry represents the C-terminal domain of Stn1 and consists of tandem winged helix-turn-helix motifs [ , ].
Protein Domain
Name: Major capsid protein, C-terminal domain superfamily
Type: Homologous_superfamily
Description: The entry includes major capsid proteins (vp54 and vp72) found in Iridoviruses, Phycodnaviruses, Asfarviruses and Ascoviruses, which are all type II dsDNA viruses with no RNA stage. This is the most abundant structural protein and can account for up to 45% of virion protein [ ]. The structure of vp54 has been determined from Paramecium bursaria Chlorella virus 1 (PBCV-1), a very large icosahedral virus containing an internal membrane enclosed within a glycoprotein coat. The vp54 protein is a duplication consisting of two domains with a similar fold packed together like the nucleoplasmin subunits. The vp54 protein forms a trimer, where the domains are arranged around a pseudo 6-fold axis. The domains have a β-sandwich structure consisting of 8 strands in two sheets with a jelly-roll topology [].
Protein Domain
Name: TRAP transporter solute receptor DctP superfamily
Type: Homologous_superfamily
Description: Tripartite ATP-independent periplasmic (TRAP) transporters are substrate-binding protein (SBP)-dependent secondary transporters found in prokaryotes. They consist of an SBP of the DctP or TAXI families and two integral membrane proteins that form the DctQ and DctM protein families. DctP is the part of the DctP-TRAP transporter involved in binding extracellular solutes for transport across the bacterial cytoplasmic membrane. Proteins in this family include DctP from R. capsulatus, SiaP from Haemophilus influenzae, DctB from Bacillus subtilis, and TeaA from Halomonas elongata. The structure of the SiaP receptor has revealed an overall topology similar to ATP binding cassette ESR (extracytoplasmic solute receptors) proteins. Upon binding of sialic acid, SiaP undergoes domain closure about a hinge region and kinking of an α-helix hinge component.
Protein Domain
Name: PHR domain superfamily
Type: Homologous_superfamily
Description: This domain is called PHR as it was originally found in the E3 ubiquitin-protein ligase proteins PAM, highwire and RPM-1. PHR proteins are conserved, large multi-domain E3 ubiquitin ligases with modular architecture. PHR proteins presynaptically control synaptic growth and axon guidance and postsynaptically regulate endocytosis of glutamate receptors. Dysfunction of neuronal ubiquitin-mediated proteasomal degradation is implicated in various neurodegenerative diseases. PHR proteins are characterised by the presence of two PHR domains near the N terminus, which are essential for proper localisation and function. The domain has a β-sandwich fold composed of 11 anti-parallel β-strands. The C-terminal region of the protein BTBD1 includes the PHR domain and is known to interact with Topoisomerase I, an enzyme which relaxes DNA supercoils.
Protein Domain
Name: Interleukin-10/19/20/22/24/26 family
Type: Family
Description: Interleukin-10 (IL-10) is a protein that inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells. Structurally, IL-10 is a protein of about 160 amino acids that contains four conserved cysteines involved in disulphide bonds. IL-10 is highly similar to the Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) BCRF1 protein, which inhibits the synthesis of gamma-interferon, and to Equid herpesvirus 2 (Equine herpesvirus 2) protein E7. It is also similar, but to a lesser degree, to human protein mda-7, a protein which has antiproliferative properties in human melanoma cells. Mda-7 contains only two of the four cysteines of IL-10. This entry represents the interleukin-10, interleukin-19, interleukin-20, interleukin-22, interleukin-24 and interleukin-26 family.
Protein Domain
Name: DUF34/NIF3 superfamily
Type: Homologous_superfamily
Description: This superfamily includes DUF34/metal-binding proteins (also known as GTP cyclohydrolase 1 type 2 proteins) from bacteria, NIF3 from budding yeasts and NIF3-like proteins from animals. These proteins are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2, but a comprehensive literature review and integrative bioinformatic analyses revealed that these annotations are misleading, as they were based on a single set of in vitro results examining the NIF3 homolog of Helicobacter pylori. In fact, members have varied phenotypes, with a unifying functional role as metal-binding proteins. NIF3 interacts with the yeast transcriptional coactivator Ngg1p, which is part of the ADA complex; the exact function of this interaction is unknown.
Protein Domain
Name: HscB, C-terminal domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents the C-terminal oligomerisation domain found in HscB (heat shock cognate protein B), which is also known as HSC20 (20K heat shock cognate protein) and J-protein Jac1 in yeast mitochondria. HscB acts as a co-chaperone to regulate the ATPase activity and peptide-binding specificity of the molecular chaperone HscA, also known as HSC66 (HSP70 class). HscB proteins contain two domains: an N-terminal J-domain, which is involved in interactions with HscA, connected by a short loop to the C-terminal oligomerisation domain; the two domains make contact through a hydrophobic interface. The core of the oligomerisation domain is thought to bind and target proteins to HscA and consists of an open, three-helical bundle. HscB, along with HscA, has been shown to play a role in the biogenesis of iron-sulphur proteins.
Protein Domain
Name: Stn1, C-terminal, fungi
Type: Domain
Description: Stn1 and Ten1 are DNA-binding proteins with specificity for telomeric DNA substrates; both protect chromosome termini from unregulated resection and regulate telomere length. Stn1 complexes with Ten1 and Cdc13 to function as a telomere-specific replication protein A (RPA)-like complex. These three interacting proteins associate with the telomeric overhang in budding yeast, whereas a single protein known as Pot1 (protection of telomeres-1) performs this function in fission yeast. Two separate protein complexes are required for chromosome end protection in fission yeast while a two-subunit complex consisting of POT1 and TPP1 associates with telomeric ssDNA in humans. Protection of telomeres by multiple proteins with OB-fold domains is conserved in eukaryotic evolution. This entry represents the C-terminal domain of Stn1, which consists of tandem winged helix-turn-helix motifs.
Protein Domain
Name: MoaD, archaeal-type
Type: Family
Description: Members of this family appear to be mainly archaeal and bacterial versions of MoaD, subunit 1 of molybdopterin converting factor. Small archaeal modifier protein 1 (SAMP1) is a member of this family and is involved in protein tagging in a ubiquitin-like system from archaea known as SAMPylation. It is not known whether it is implicated in the targeting of proteins to the proteasome for degradation. SAMP1 is also essential for MoCo-dependent dimethyl sulphoxide reductase activity, suggesting that it functions in the biosynthesis of the sulphur-containing molybdenum cofactor (MoCo). SAMP proteins, like Ub/Ubl proteins, have a β-grasp fold common to a growing superfamily of proteins involved in diverse functions. Among these functions, sulphur activation for the biosynthesis of thiamine, tungsten and molybdenum cofactors bears a striking resemblance to the activation of Ub/Ubl.
Protein Domain
Name: PHR
Type: Domain
Description: This domain is called PHR as it was originally found in the E3 ubiquitin-protein ligase proteins PAM, highwire and RPM-1. PHR proteins are conserved, large multi-domain E3 ubiquitin ligases with modular architecture. PHR proteins presynaptically control synaptic growth and axon guidance and postsynaptically regulate endocytosis of glutamate receptors. Dysfunction of neuronal ubiquitin-mediated proteasomal degradation is implicated in various neurodegenerative diseases. PHR proteins are characterised by the presence of two PHR domains near the N terminus, which are essential for proper localisation and function. The domain has a β-sandwich fold composed of 11 anti-parallel β-strands. The C-terminal region of the protein BTBD1 includes the PHR domain and is known to interact with Topoisomerase I, an enzyme which relaxes DNA supercoils.
Protein Domain
Name: PDCD5-like superfamily
Type: Homologous_superfamily
Description: Proteins in this entry are found in archaea and eukaryota; they contain a predicted DNA-binding domain and may function as DNA-binding proteins. Methanobacterium thermoautotrophicum MTH1615 was predicted to bind DNA based on structural proteomics data, and this was confirmed by the demonstration that it can interact non-specifically with a randomly chosen 20-mer of double stranded DNA. This suggests that the human protein may be involved in nucleic acid binding or metabolism. The human programmed cell death protein 5 gene (PDCD5, also known as TFAR19) encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. PDCD5 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. PDCD5 may play a general role in the apoptotic process.
Protein Domain
Name: GTPase GIMA/IAN/Toc
Type: Family
Description: This entry includes two subfamilies of the TRAFAC (translation factor related) class AIG1/Toc34/Toc159-like paraseptin GTPase family. The first comprises the GTPases of immunity-associated protein (GIMAP)/immune-associated nucleotide-binding protein (IAN) subfamily, which is conserved among vertebrates and angiosperm plants and has been postulated to regulate apoptosis, particularly in the context of diseases such as cancer, diabetes, and infections. The function of GIMAP/IAN GTPases has been linked to self defense in plants and to the development of T cells in vertebrates. The second comprises the plant-specific Toc (translocon at the outer envelope membrane of chloroplasts) proteins, which function as integral components of the chloroplast protein import machinery. The Toc translocon contains the two membrane-bound GTPases Toc33/34 and Toc159, which expose their G domains to the cytosol, where they recognise precursor proteins and then deliver them through the translocation pore Toc75.
Protein Domain
Name: Ste50, sterile alpha motif
Type: Domain
Description: Ste50-like proteins have a SAM domain at the N terminus and a Ras-associated UBQ superfamily domain at the C terminus. They participate in regulation of the mating pheromone response, invasive growth and the high osmolarity growth response, and contribute to cell wall integrity in vegetative cells. Ste50 of S. cerevisiae acts as an adaptor protein between G protein and MAP triple kinase Ste11. Ste50 proteins are able to form homo-oligomers, binding each other via their SAM domains, as well as heterodimers and heterogeneous complexes with SAM domain or SAM homodimers of MAPKKK Ste11 protein kinase. The fungal Ste50 SAM domain consists of five helices, which form a compact, globular fold. It is required for mediation of homodimerisation and heterodimerisation (and in some cases oligomerisation) of the protein.
Protein Domain
Name: AIP/AIPL1/TTC9
Type: Family
Description: Proteins in this entry have an N-terminal FKBP-type peptidyl-prolyl cis-trans isomerase domain followed by a C-terminal tetratricopeptide repeat-containing domain. Included in this entry are: aryl-hydrocarbon-interacting protein-like 1 (AIPL1), which is associated with inherited blindness and interacts with cell cycle regulator protein NUB1; and AH receptor-interacting protein (AIP), which interacts with the tyrosine kinase receptor RET and in which mutations are associated with familial isolated pituitary adenomas. This entry also matches Tetratricopeptide repeat protein 9 (TTC9). TTC9 contains tetratricopeptide repeat (TPR) domains made up of the 34 amino acid consensus motif present in a variable number of tandem repeats. Tetratricopeptide repeat protein 9A (TTC9A) is involved in anxiety-like behaviors through estrogen action in female mice, and is thought to play a role in cancer cell invasion and metastasis.
Protein Domain
Name: Glycoprotein C/ glycoprotein A
Type: Family
Description: Equid herpesvirus 1 (Equine herpesvirus 1, EHV-1) glycoprotein 13 (EHV-1 gp13) has the characteristic features of a membrane-spanning protein: an N-terminal signal sequence; a hydrophobic membrane anchor region; a charged C-terminal cytoplasmic tail; and an exterior domain with nine potential N-glycosylation sites. EHV-1 gp13 is the structural homologue of the gC-like glycoproteins of Human herpesvirus 1 (HHV-1) and Human herpesvirus 2 (HHV-2) (gC-1 and gC-2 respectively), Pseudorabies virus (strain Indiana-Funkhauser/Becker) (PRV) (gIII) and Human herpesvirus 3 (HHV-3) (gp66). Secretory glycoprotein GP57-65 precursor (glycoprotein A - GA) is similar to Herpesvirus glycoprotein C, and belongs to the immunoglobulin gene superfamily. GA is thought to play an immunoevasive role in the pathogenesis of Marek's disease. It is a candidate for causing the early-stage immunosuppression that occurs after MDHV infection.
Protein Domain
Name: LGFP
Type: Repeat
Description: This 54 amino acid repeat is found in many hypothetical proteins. Several hypothetical proteins from Corynebacterium glutamicum (Brevibacterium flavum) and Corynebacterium efficiens, along with the PS1 protein, contain this repeat region. The N-terminal region of PS1 contains an esterase domain which transfers corynomycolic acid. The C-terminal region consists of 4 tandem LGFP repeats. It is hypothesised that the PS1 proteins in Corynebacterium, when associated with the cell wall, may be anchored via the LGFP tandem repeats, which may be important for maintaining cell wall integrity. Deletion of the protein results in a 10-fold increase in the cell volume of the organism, implying that the protein is involved in cell shape formation. The secondary structure of each repeat is predicted to comprise two β-strands and one α-helix.
Protein Domain
Name: Disks Large homologue 3, SH3 domain
Type: Domain
Description: DLG3 (discs large homologue 3), also called synapse-associated protein 102 (SAP102), is a scaffolding protein that clusters at synapses and plays an important role in synaptic development and plasticity. It associates with GluN2 subunits of NMDA receptors, particularly GluN2B, which is the dominant subunit in cortical neurons during early development. Mutations in DLG3 are associated with nonsyndromic X-linked mental retardation in humans. DLG3 is a member of the MAGUK (membrane-associated guanylate kinase) protein family, which is characterized by the presence of a core of three domains: PDZ, SH3, and guanylate kinase (GuK). The GuK domain in MAGUK proteins is enzymatically inactive; instead, the domain mediates protein-protein interactions and associates intramolecularly with the SH3 domain. DLG3 contains three PDZ domains. This entry represents the SH3 domain of DLG3.
Protein Domain
Name: PACSIN2, F-BAR domain
Type: Domain
Description: PACSIN2 (protein kinase C and casein kinase substrate in neurons protein 2, also known as Syndapin-2) belongs to the PACSIN family, whose members contain an N-terminal F-BAR (FCH-BAR) domain and a C-terminal SH3 domain. They are cytoplasmic phosphoproteins that play a role in vesicle formation and transport. PACSIN2 interacts with several proteins such as Rac1, dynamin, Neuronal Wiskott-Aldrich Syndrome Protein (N-WASP), and synaptojanin via its C-terminal Src homology 3 (SH3) domain. PACSIN2 negatively regulates EGF (epidermal growth factor) receptor activation and signaling. It plays an important role in caveolae membrane sculpting. This entry represents the F-BAR domain of PACSIN2. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization.
Protein Domain
Name: BICC1, SAM domain
Type: Domain
Description: Bicaudal-C (BICC, BICC1 in vertebrates) is an RNA-binding protein with translational repression function. It is involved in the regulation of embryonic differentiation and plays a role in the regulation of Dvl (Dishevelled) signaling, particularly in correct cilia orientation and nodal flow generation. In Drosophila, disruption of BICC can disturb the normal migration direction of the anterior follicle cell of oocytes. In mammals, mutations in this gene are associated with polycystic kidney disease, and it has been suggested that the BICC1 protein can interact indirectly with the ANKS6 protein (which is also associated with polycystic kidney disease) through protein and RNA intermediates. BICC1 contains N-terminal K homology RNA-binding vigilin-like repeats and a C-terminal SAM domain. This entry represents the SAM (sterile alpha motif) domain, which is a protein-protein interaction domain.
Protein Domain
Name: Co-chaperone HscB, C-terminal oligomerisation domain
Type: Domain
Description: This entry represents the C-terminal oligomerisation domain found in HscB (heat shock cognate protein B), which is also known as HSC20 (20K heat shock cognate protein) and J-protein Jac1 in yeast mitochondria. HscB acts as a co-chaperone to regulate the ATPase activity and peptide-binding specificity of the molecular chaperone HscA, also known as HSC66 (HSP70 class). HscB proteins contain two domains: an N-terminal J-domain, which is involved in interactions with HscA, connected by a short loop to the C-terminal oligomerisation domain; the two domains make contact through a hydrophobic interface. The core of the oligomerisation domain is thought to bind and target proteins to HscA and consists of an open, three-helical bundle. HscB, along with HscA, has been shown to play a role in the biogenesis of iron-sulphur proteins.
Protein Domain
Name: Apolipoprotein C-II
Type: Family
Description: Apolipoprotein C-II (apoC-II) is a surface constituent of plasma lipoproteins and the activator of lipoprotein lipase (LPL). It is therefore central to lipid transport in blood. Lipoprotein lipase is a key enzyme in the regulation of triglyceride levels in human serum. It is the C-terminal helix of apoC-II that is responsible for the activation of LPL. The active peptide of apoC-II occurs at residues 44-79 and has been shown to reverse the symptoms of genetic apoC-II deficiency in a human subject. Micellar SDS, a commonly used mimetic of the lipoprotein surface, inhibits the aggregation of apoC-II and induces a stable structure containing approximately 60% α-helix. The first 12 residues of apoC-II are structurally heterogeneous but the rest of the protein forms a predominantly helical structure.
Protein Domain
Name: Radical SAM GDL-associated
Type: Family
Description: This narrowly distributed protein family contains an N-terminal radical SAM domain. It occurs in Pseudomonas fluorescens Pf0-1, Ralstonia solanacearum, and numerous species and strains of Burkholderia. Members always occur next to a trio of three mutually homologous genes, all of which contain the domain as the whole of the protein (about 60 amino acids) or as the C-terminal domain. The function is unknown, but the fact that all phylogenetically correlated proteins are mutually homologous with prominent invariant motifs (an invariant tyrosine and a GDL motif) and as small as 60 amino acids suggests that post-translational modification of domain-containing proteins may be its function. This view is supported by closer homology to the PqqE radical SAM protein involved in PQQ biosynthesis from the PqqA precursor peptide than to other characterised radical SAM proteins.
Protein Domain
Name: DSC E3 ubiquitin ligase complex subunit 3
Type: Family
Description: This entry represents DSC E3 ubiquitin ligase complex subunit 3 (Dsc3), a component of the DSC E3 ubiquitin ligase complex (a Golgi-specific protein ubiquitination system) that functions in protein homeostasis under non-stress conditions, playing a role in protein quality control through the endosome and Golgi-associated degradation (EGAD) pathway, which targets membrane proteins at the Golgi and endosomes for degradation by cytosolic proteasomes. Dsc3 is also involved in endocytic protein trafficking. Dsc3 has a ubiquitin-like domain and two C-terminal transmembrane regions. The yeast DSC E3 ubiquitin ligase complex is the homologue of the Hrd1 E3 ligase complex from mammals, in which Dsc1, Dsc2 and Dsc3 correspond to Hrd1, Der1, and Usa1, respectively. Dsc3 is a Herp-like protein that acts as a bridge between Dsc1 and Dsc2 for their interaction.
Protein Domain
Name: Protease A inhibitor IA3 domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a domain found at the N terminus of the IA3 protein (also known as Pai3). The IA3 polypeptide of Saccharomyces cerevisiae is an 8kDa inhibitor of the vacuolar aspartic proteinase (proteinase A or saccharopepsin, MEROPS peptidase family A1). It belongs to MEROPS inhibitor family I34, clan JA. No other aspartic proteinase has been found to be inhibited by IA3, and at least 15 aspartic proteinases related to YprA cleave IA3 as a substrate. Ligand-free IA3 has little intrinsic secondary structure; however, upon contact with proteinase A, residues 2-32 of the inhibitor become ordered and adopt a near perfect α-helical conformation occupying the active site cleft of the enzyme. This potent, specific interaction is directed primarily by hydrophobic interactions made by three key features in the inhibitory sequence.
Protein Domain
Name: SNX7, PX domain
Type: Domain
Description: Sorting nexin-7 (SNX7) belongs to the sorting nexin family, whose members contain a conserved PX (phox homology) domain that is responsible for binding to specific phosphoinositides. It may be involved in several stages of intracellular trafficking. The Phox Homology (PX) domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes. SNXs differ from each other in PI-binding specificity and affinity, and in the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway.
Protein Domain
Name: Peptidase S8A, subtilisin-related protease, kinetoplastidia
Type: Family
Description: Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins and bacterial toxins, amongst others, are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus. This entry contains putative subtilisin-related proteases from Kinetoplastida (Trypanosomatid protozoa).
Protein Domain
Name: Dinoflagellate luciferase, N-terminal
Type: Domain
Description: Proteins in this entry belong to a family of dinoflagellate luciferase and luciferin binding proteins. Luciferase catalyses the light-emitting reaction in bioluminescence, and luciferin binding protein (LBP) binds luciferin (the substrate for luciferase) to stop it reacting with the enzyme, thereby switching off the bioluminescence function. The expression of these two proteins is controlled by a circadian clock at the translational level, with synthesis and degradation occurring on a daily basis. This entry consists of a presumed N-terminal domain that is conserved between dinoflagellate luciferase and luciferin binding proteins. This domain is not, however, the catalytic part of the protein. It has been suggested that this region may mediate an interaction between LBP and luciferase or their association with the vacuolar membrane.
Protein Domain
Name: Potyvirus NIa protease (NIa-pro) domain
Type: Domain
Description: Tobacco etch virus (TEV), tomato vein mottling virus (TVMV), and plum pox virus (PPV) are members of the Potyviridae family. The potyvirus genome is a (+) stranded RNA and is translated into a single polyprotein upon infection, which is processed by the virally encoded proteases P1, HC-Pro, and NIa. Most of the cleavage events are performed by NIa (nuclear inclusion protein a) protease (NIa-pro). NIa-pro processes seven sites present in the potyvirus polyprotein, named A, B, C, D, V, E, and F. NIa-pros obtained from potyviruses have similar structures and functions. The potyvirus NIa-pro has a His-Asp-Cys catalytic triad, which is homologous to that of the trypsin-like proteases except for Cys replacing Ser. NIa-pros obtained from potyviruses share certain sequence identities; however, they recognise distinct amino acid sequences at each recognition site. Consequently, they cannot recognise each other's cleavage sites efficiently. NIa-pro belongs to peptidase family C4. In addition to its catalytic activity, NIa-pro also possesses sequence non-specific RNA-binding activity and RNA polymerase (NIb) binding activity. The potyvirus NIa protein contains two domains: the VPg domain at the N terminus and the NIa-pro domain at the C terminus. The ~250-amino acid NIa-pro domain adopts the characteristic two-domain antiparallel β-barrel fold that is the hallmark of trypsin-like serine proteases, with the catalytic triad residues His, Asp, and Cys located at the interface between domains. This entry represents the NIa-pro domain. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring.
In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys.
The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices, with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: GroES chaperonin family
Type: Family
Description: The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60. Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10kDa subunits of GroES form a dome-like heptameric oligomer in solution.
ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
Protein Domain
Name: Chaperonin GroES, conserved site
Type: Conserved_site
Description: The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10kDa chaperonin (cpn10, or GroES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60kDa chaperonin (cpn60, or GroEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60. Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10kDa subunits of GroES form a dome-like heptameric oligomer in solution.
ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
Protein Domain
Name: Zinc finger, Sec23/Sec24-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation.
Sec23p and Sec24p are structurally related, folding into five distinct domains: a β-barrel, a zinc finger, an α/β trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes an approximately 55-residue Sec23/24 zinc-binding domain, which lies against the β-barrel at the periphery of the complex.
Protein Domain
Name: Glutaredoxin-like
Type: Family
Description: Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system. Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin (TRX), which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced form or an oxidized form, in which the two cysteine residues are linked in an intramolecular disulphide bond. It contains a redox-active CXXC motif in a TRX fold and uses a dithiol mechanism similar to that employed by TRXs for disulphide bond reduction of protein substrates. Unlike TRX, GRX has a preference for mixed GSH disulphide substrates, for which it uses a monothiol mechanism where only the N-terminal cysteine is required. The flow of reducing equivalents in the GRX system goes from NADPH -> GSH reductase -> GSH -> GRX -> protein substrates. By altering the redox state of target proteins, GRX is involved in many cellular functions including DNA synthesis, signal transduction and the defense against oxidative stress. Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern, T4 thioredoxin has Val instead of Pro. This family contains several viral glutaredoxins, and many related bacterial and eukaryotic proteins of unknown function.
The best characterised member of this family is G4L from Vaccinia virus (strain Western Reserve/WR) (VACV), which is necessary for virion morphogenesis and virus replication. This is a cytoplasmic protein which functions as a shuttle in a redox pathway between membrane-associated E10R and L1R or F9L.
Protein Domain
Name: Zinc finger, FYVE/PHD-type
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. The FYVE zinc finger domain is conserved from yeast to man, and is named after four proteins in which it has been found: Fab1, YOTB/ZK632.12, Vac1, and EEA1.
It functions in the membrane recruitment of cytosolic proteins by binding to phosphatidylinositol 3-phosphate (PI3P), which is found mainly on endosomes. The plant homeodomain (PHD) zinc finger domain has a C4HC3-type motif, and is widely distributed in eukaryotes, being found in many chromatin regulatory factors. Both the FYVE and the PHD zinc finger motifs display strikingly similar dimetal(zinc)-bound alpha+beta folds.
Protein Domain
Name: Zinc finger, lateral root primordium type 1
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. These sequences contain a putative zinc finger domain found predominantly in plants. Arabidopsis thaliana (Mouse-ear cress) has at least 10 distinct members. Proteins containing this domain, including LRP1 (lateral root primordium 1), generally share the same size, about 300 amino acids, and architecture. This 43-residue domain, and a more C-terminal companion domain of similar size, appear as tightly conserved islands of sequence similarity.
The remainder consists largely of low-complexity sequence. Several animal proteins have regions with matching patterns of Cys, Gly, and His residues, but are excluded from this family because of their low similarity.
Protein Domain
Name: Zinc finger, Sec23/Sec24-type superfamily
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. COPII (coat protein complex II)-coated vesicles carry proteins from the endoplasmic reticulum (ER) to the Golgi complex. COPII-coated vesicles form on the ER by the stepwise recruitment of three cytosolic components: Sar1-GTP to initiate coat formation, Sec23/24 heterodimer to select SNARE and cargo molecules, and Sec13/31 to induce coat polymerisation and membrane deformation.
Sec23p and Sec24p are structurally related, folding into five distinct domains: a β-barrel, a zinc finger, an α/β trunk domain, an all-helical region, and a C-terminal gelsolin-like domain. This entry describes an approximately 55-residue Sec23/24 zinc-binding domain, which lies against the β-barrel at the periphery of the complex.
Protein Domain
Name: DPH-type metal-binding domain superfamily
Type: Homologous_superfamily
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents the DPH-type metal-binding domain, which consists of a three-stranded β-sandwich with one sheet comprising two parallel strands, β1 and β6, and one antiparallel strand, β5. The second sheet in the β-sandwich comprises strands β2, β3, and β4 running anti-parallel to each other. The two β-sheets are separated by a short stretch of α-helix. It can be found in proteins such as DPH3 and DPH4.
This domain is also found associated with the N-terminal domain of heat shock protein DnaJ.
Protein Domain
Name: Neurogenic locus Notch 4
Type: Family
Description: Notch cell surface receptors are large, single-pass type-1 transmembrane proteins found in a diverse range of metazoan species, from human to Caenorhabditis species. The fruit fly, Drosophila melanogaster, possesses only one Notch protein, whereas in C. elegans, two receptors have been found; by contrast, four Notch paralogues (designated N1-4) have been identified in mammals, playing both unique and redundant roles. The hetero-oligomer Notch comprises a large extracellular domain (ECD), containing 10-36 tandem Epidermal Growth Factor (EGF)-like repeats, which are involved in ligand interactions; a negative regulatory region, including three cysteine-rich Lin12-Notch Repeats (LNR); a single transmembrane domain (TM); a small intracellular domain (ICD), which includes a RAM (RBP-Jκ-association module) domain; six ankyrin repeats (ANK), which are involved in protein-protein interactions; and a PEST domain. Drosophila Notch also contains an OPA domain. Notch signalling is an evolutionarily conserved pathway involved in a wide variety of developmental processes, including adult homeostasis and stem cell maintenance, cell proliferation and apoptosis. Notch is activated by a range of ligands, the so-called DSL ligands (Delta/Serrate/LAG-2). Activation is also mediated by a sequence of proteolytic events: ligand binding leads to cleavage of Notch by ADAM proteases at site 2 (S2) and by presenilin-1/γ-secretase at sites 3 (S3) and 4 (S4). The last cleavage releases the Notch intracellular domain (NICD) from the membrane and, upon release, the NICD translocates to the nucleus where it associates with the CBF1/RBP-Jκ/Su(H)/Lag1 (CSL) family of DNA-binding proteins. The subsequent recruitment of the co-activator mastermind-like 1 (MAML1) protein promotes transcriptional activation of Notch target genes: well-established Notch targets are the Hes and Hey gene families.
Aberrant Notch function and signalling have been associated with a number of human disorders, including Alagille syndrome, spondylocostal dysostosis, aortic valve disease, CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy), and T-cell Acute Lymphoblastic Leukemia (T-ALL); it has also been implicated in various human carcinomas. Notch 4 is selectively expressed in vascular endothelium, and regulates vascular remodelling.
Protein Domain
Name: Notch
Type: Family
Description: Notch cell surface receptors are large, single-pass type-1 transmembrane proteins found in a diverse range of metazoan species, from human to Caenorhabditis species. The fruit fly, Drosophila melanogaster, possesses only one Notch protein, whereas in C. elegans, two receptors have been found; by contrast, four Notch paralogues (designated N1-4) have been identified in mammals, playing both unique and redundant roles. The hetero-oligomer Notch comprises a large extracellular domain (ECD), containing 10-36 tandem Epidermal Growth Factor (EGF)-like repeats, which are involved in ligand interactions; a negative regulatory region, including three cysteine-rich Lin12-Notch Repeats (LNR); a single transmembrane domain (TM); a small intracellular domain (ICD), which includes a RAM (RBP-Jκ-association module) domain; six ankyrin repeats (ANK), which are involved in protein-protein interactions; and a PEST domain. Drosophila Notch also contains an OPA domain. Notch signalling is an evolutionarily conserved pathway involved in a wide variety of developmental processes, including adult homeostasis and stem cell maintenance, cell proliferation and apoptosis. Notch is activated by a range of ligands, the so-called DSL ligands (Delta/Serrate/LAG-2). Activation is also mediated by a sequence of proteolytic events: ligand binding leads to cleavage of Notch by ADAM proteases at site 2 (S2) and by presenilin-1/γ-secretase at sites 3 (S3) and 4 (S4). The last cleavage releases the Notch intracellular domain (NICD) from the membrane and, upon release, the NICD translocates to the nucleus where it associates with the CBF1/RBP-Jκ/Su(H)/Lag1 (CSL) family of DNA-binding proteins. The subsequent recruitment of the co-activator mastermind-like 1 (MAML1) protein promotes transcriptional activation of Notch target genes: well-established Notch targets are the Hes and Hey gene families.
Aberrant Notch function and signalling have been associated with a number of human disorders, including Alagille syndrome, spondylocostal dysostosis, aortic valve disease, CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy), and T-cell Acute Lymphoblastic Leukemia (T-ALL); it has also been implicated in various human carcinomas.
Protein Domain
Name: Zinc finger, MYM-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. MYM-type zinc fingers were identified in MYM family proteins. One human family member is involved in a chromosomal translocation and may be responsible for X-linked retardation in Xq13.1. Another is also involved in disease: in myeloproliferative disorders it is fused to FGF receptor 1, and in atypical myeloproliferative disorders it is rearranged. Members of the family are generally involved in development.
This Zn-finger domain functions as a transcriptional trans-activator of late vaccinia viral genes, and orthologues are also found in all nucleocytoplasmic large DNA viruses (NCLDV). This domain is also found fused to the C termini of recombinases from certain prokaryotic transposons.
Protein Domain
Name: GroES chaperonin superfamily
Type: Homologous_superfamily
Description: The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10kDa chaperonin (cpn10, or GroES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60kDa chaperonin (cpn60, or GroEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding, whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60. Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10kDa subunits of GroES form a dome-like heptameric oligomer in solution.
ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
Protein Domain
Name: Concentrative nucleoside transporter, metazoan/bacterial
Type: Family
Description: Nucleosides are hydrophilic molecules and require specialised transport proteins for permeation of cell membranes. There are two types of nucleoside transport processes: equilibrative bidirectional processes driven by chemical gradients, and inwardly directed concentrative processes driven by an electrochemical gradient. The two types of nucleoside transporters are classified into two families: the solute carrier (SLC) 29 and SLC28 families, corresponding to equilibrative and concentrative nucleoside transporters, respectively. The microbial proteins include broad-specificity transporters, such as the Escherichia coli NupC protein, which transports all nucleosides (both ribo- and deoxyribonucleosides) except hypoxanthine and guanine nucleosides. Bacillus subtilis NupC transporter has been shown to be involved in transport of the pyrimidine nucleoside uridine. A recently characterised fungal protein, the first transporter of this type to be described in eukaryotes, exhibited transport activity for adenosine, uridine, inosine and guanosine but not cytidine, thymidine or the nucleobase hypoxanthine. The characterised mammalian proteins can be divided into three subgroups: CNT1, CNT2 and CNT3. CNT1 preferentially transports pyrimidines and weakly transports adenosine. Several antiviral and anticancer nucleoside analogues, including AZT and dFdC, are also substrates for CNT1. CNT2 selectively transports purines, and the human form has also been shown to facilitate the uptake of some antiviral compounds including ddI and ribavirin. CNT3 has a broader specificity, transporting both purines and pyrimidines. Several anticancer nucleoside analogues such as CdA, dFdC and FdU are also transported by CNT3. Substrate specificity appears to depend on a region containing transmembrane regions 7, 8 and 9. Mutation of just four residues in this region was sufficient to convert the activity of human CNT1 to that of CNT2.
At least three other concentrative nucleoside transport activities have been described in mammalian cells, but the proteins responsible for these activities have not yet been identified. This entry represents a family of Concentrative Nucleoside Transporter (CNT) proteins found in bacteria and animals.
Protein Domain
Name: E2/EBNA1, C-terminal
Type: Homologous_superfamily
Description: This superfamily represents a ferredoxin-like fold found in the C terminus of the Papillomavirus E2 protein and the Epstein-Barr virus (strain GD1) nuclear antigen 1 (EBNA1). This region is the DNA-binding and dimerisation domain found in EBNA1. E2 is an early regulatory protein found in the dsDNA papillomaviruses. The viral genome is a 7.9-kb circular DNA that codes for at least eight early and two late (capsid) proteins. The products of the early genes E6 and E7 are oncoproteins that destabilise the cellular tumour suppressors p53 and pRB. The product of the E1 gene is a helicase necessary for viral DNA replication. The products of the E2 gene play key roles in the regulation of viral gene transcription and DNA replication. During early stages of viral infection, the E2 protein represses the transcription of the oncogenes E6 and E7; reintroduction of E2 into cervical cancer cell lines leads to repression of E6/E7 transcription, stabilisation of the tumour suppressor p53, and cell-cycle arrest at the G1 phase of the cell cycle. E2 can also induce apoptosis by a p53-independent mechanism. E2 proteins from all papillomavirus strains bind a consensus palindromic sequence, ACCgNNNNcGGT, present in multiple copies in the regulatory region. E2 can either activate or repress transcription, depending on the position of this E2 response element (E2RE) with regard to proximal promoter elements. Repression occurs by sterically hindering the assembly of the transcription initiation complex. The E2 protein is composed of a C-terminal DNA-binding domain and an N-terminal trans-activation domain. E2 exists in solution and binds to DNA as a dimer. The E2 DNA-binding domain forms a dimeric β-barrel, with each subunit contributing an anti-parallel 4-stranded β-sheet "half-barrel". The topology of each subunit is β1-α1-β2-β3-α2-β4. Helix 1 is the recognition helix housing all of the amino acid residues involved in direct DNA sequence specification.
Upon dimerisation, strands β2 and β4 at the edges of each subunit participate in a continuous hydrogen-bonding network, which results in an 8-stranded β-barrel. The dimer interface is extensive, made up of hydrogen bonds between subunits and a substantial hydrophobic β-barrel core.
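As an illustration of the consensus E2 binding site given in this description, the following minimal Python sketch scans a DNA string for the palindromic motif. The function name and the example sequence are hypothetical and are not part of this entry. The sketch assumes that the lowercase letters in ACCgNNNNcGGT mark preferred rather than strictly required bases, so it searches for the relaxed form ACCNNNNNNGGT and flags hits that also satisfy the strict form ACCGNNNNCGGT.

```python
import re

# Relaxed consensus: ACC, any six bases, GGT. The lookahead allows
# overlapping candidate sites to be reported as well.
# Assumption: lowercase g/c in ACCgNNNNcGGT are preferred, not required.
RELAXED_SITE = re.compile(r"(?=(ACC[ACGT]{6}GGT))")
STRICT_SITE = re.compile(r"ACCG[ACGT]{4}CGGT")


def find_e2_sites(dna):
    """Return (start, site, is_strict) for each candidate E2 binding site.

    Hypothetical helper for illustration only; start positions are 0-based.
    """
    seq = dna.upper()
    hits = []
    for match in RELAXED_SITE.finditer(seq):
        site = match.group(1)
        is_strict = STRICT_SITE.fullmatch(site) is not None
        hits.append((match.start(), site, is_strict))
    return hits


# Made-up sequence containing one strict and one relaxed-only site.
example = "TTACCGAAAACGGTAAACCTTTTTTGGTCC"
for start, site, strict in find_e2_sites(example):
    print(start, site, "strict" if strict else "relaxed only")
```

Matching the relaxed form first and classifying afterwards keeps both interpretations of the consensus visible, which may be useful because the entry does not state how strictly the lowercase positions are conserved.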
Protein Domain
Name: Cell wall/choline-binding repeat
Type: Repeat
Description: The cell wall-binding repeat (CW) is a module of about 20 amino acid residues, found essentially in two families of Gram-positive bacterial proteins: the choline-binding proteins and the glucosyltransferases. In choline-binding proteins, cell wall-binding repeats bind to choline moieties of both teichoic and lipoteichoic acids, two components peculiar to the cell surface of Gram-positive bacteria. In glucosyltransferases, the region spanning the CW repeats is a glucan-binding domain. Several crystal structures of CW have been solved. In the choline-binding protein LytA, the repeats adopt a solenoid fold consisting exclusively of β-hairpins that stack to form a left-handed superhelix with a boomerang-like shape. The choline groups bind between β-hairpin 'steps' of the superhelix. In Cpl-1, CW repeats assemble in two sub-domains: an N-terminal superhelical moiety similar to that of LytA and a C-terminal β-sheet involved in interactions with the lysozyme domain. Choline is bound between repeats 1 and 2, and between repeats 2 and 3, of the superhelical sub-domain. Some proteins known to contain cell wall-binding repeats include:
  • Pneumococcal N-acetylmuramoyl-L-alanine amidase (autolysin, lytA). It is a surface-exposed enzyme that governs the self-destruction of pneumococcal cells through degradation of their peptidoglycan backbone. It mediates the release of toxic substances that damage the host tissues.
  • Pneumococcal endo-beta-N-acetylglucosaminidase (lytB). It plays an important role in cell wall degradation and cell separation.
  • Pneumococcal teichoic acid phosphorylcholine esterase (pce or cbpE), a cell wall hydrolase important for cellular adhesion and colonisation.
  • Lactobacillales glucosyltransferase. It catalyses the transfer of glucosyl units from the cleavage of sucrose to a growing chain of glucan.
  • Clostridium difficile toxin A (tcdA) and toxin B (tcdB). They are the causative agents of antibiotic-associated pseudomembranous colitis. They are intracellular-acting toxins that reach their targets after receptor-mediated endocytosis.
  • Clostridium acetobutylicum cspA protein.
  • Siphoviridae bacteriophage N-acetylmuramoyl-L-alanine amidase. It lyses the bacterial host cell wall.
  • Podoviridae lysozyme protein (cpl-1). It is capable of digesting the pneumococcal cell wall.
The cell wall-binding repeats are also known as the choline-binding repeats (ChBr) or the choline-binding domain (ChBD).
Protein Domain
Name: FARP1/FARP2/FRMD7, FERM domain C-lobe
Type: Domain
Description: Proteins containing this domain include FARP1, FARP2 and FRMD7. FARP1 and FARP2 are members of the Dbl family of guanine nucleotide exchange factors (GEFs), which are upstream positive regulators of Rho GTPases. FARP1 has increased expression in differentiated chondrocytes. FARP2 is thought to regulate neurite remodeling by mediating the signaling pathways from membrane proteins to Rac. It is found in brain, lung, and testis, as well as in embryonic hippocampal and cortical neurons. FARP1 and FARP2 are composed of an N-terminal FERM domain, a proline-rich (PR) domain, a Dbl-homology (DH) domain, and two C-terminal PH domains.
FRMD7 (FERM domain-containing protein 7) and Caenorhabditis elegans CFRM3 have a FERM domain that is closely related to that in FARP1 and FARP2. Both have unknown functions. They contain an N-terminal FERM domain and a PH domain, followed by a FA (FERM adjacent) domain. FRMD7 has been linked to idiopathic congenital nystagmus, an infant-onset disease with the typical features of bilateral ocular oscillations, visual impairment, and abnormal head movement.
The FERM domain has a tripartite cloverleaf structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. Like most other ERM members, these proteins have a phosphoinositide-binding site in their FERM domain. The FERM C domain is the third structural domain within the FERM domain. The FERM domain is found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and the cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs) and the tyrosine kinases FAK and JAK, in addition to other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding both peptides and phospholipids at different sites.
Protein Domain
Name: Complement B/C2
Type: Family
Description: These proteins belong to MEROPS peptidase family S1 (chymotrypsin family, clan PA(S)), subfamily S1A.
This family contains two mammalian proteins, complement C2 and complement factor B, which, respectively, have analogous roles in the classical and alternative pathways of complement activation. These proteins are composed of three regions: an N-terminal three-module complement control protein domain, a von Willebrand factor A domain, and a C-terminal serine protease domain. Briefly, they are activated by cleavage and function as the serine protease components of the C3/C5 convertases, which play similar roles in these pathways although composed of different proteins. Homologs in non-mammalian species are often more or less equally related to mammalian C2 and B and may be designated as complement B/C2. Strongylocentrotus purpuratus (Purple sea urchin) has an atypical factor B with a five-module complement control protein domain.
The structures of the von Willebrand factor A and serine protease domains from human complement factor B have been analysed. The A domain forms the classical vWF A domain fold, which consists of a central β-sheet flanked on both sides by amphipathic alpha helices. It contains an integrin-like MIDAS (metal ion-dependent adhesion site) motif that adopts the open conformation typical of integrin-ligand complexes, with an acidic residue from another A domain (provided by a fortuitous crystal contact) completing the coordination of the metal ion. Although a closed conformation was not observed, modelling studies suggest that the A domain could adopt this conformation, implying that, as with integrins, ligand binding may induce conformational changes which transduce a signal to other domains in the protein. The serine protease domain forms a chymotrypsin fold with several novel features. Like other serine proteases, it forms two β-sheets, composed of six β-strands each, surrounded by surface helices and loops. However, several novel deletions and insertions occur within these surface helices and loops, and differences in active site conformation also exist.
Protein Domain
Name: JAK1-3/TYK2, pleckstrin homology-like domain
Type: Domain
Description: This entry represents the pleckstrin homology-like (PHL) subdomain found in JAK1-3/TYK2 proteins. The PHL subdomain (residues 283-419), together with the N-terminal ubiquitin-like subdomain (residues 36-111) and an acyl-coenzyme A binding protein-like subdomain (residues 148-282), associates into a canonical tri-lobed FERM domain.
Janus kinases (JAKs) are tyrosine kinases that function in membrane-proximal signalling events initiated by a variety of extracellular factors binding to cell surface receptors. Many type I and II cytokine receptors lack a protein tyrosine kinase domain and rely on JAKs to initiate the cytoplasmic signal transduction cascade. Ligand binding induces oligomerisation of the receptors, which then activates the cytoplasmic receptor-associated JAKs. These subsequently phosphorylate tyrosine residues along the receptor chains with which they are associated. The phosphotyrosine residues are a target for a variety of SH2 domain-containing transducer proteins. Amongst these are the signal transducers and activators of transcription (STAT) proteins, which, after binding to the receptor chains, are phosphorylated by the JAK proteins. Phosphorylation enables the STAT proteins to dimerise and translocate into the nucleus, where they alter the expression of cytokine-regulated genes. This system is known as the JAK-STAT pathway.
Four mammalian JAK family members have been identified: JAK1, JAK2, JAK3, and TYK2. They are relatively large kinases of approximately 1150 amino acids, with molecular weights of ~120-130 kDa. Their amino acid sequences are characterised by the presence of 7 highly conserved domains, termed JAK homology (JH) domains. The C-terminal domain (JH1) is responsible for the tyrosine kinase function. The next domain in the sequence (JH2) is known as the tyrosine kinase-like domain, as its sequence shows high similarity to functional kinases but does not possess any catalytic activity. Although the function of this domain is not well established, there is some evidence for a regulatory role on the JH1 domain, thus modulating catalytic activity. The N-terminal portion of the JAKs (spanning JH7 to JH3) is important for receptor association and non-catalytic activity, and consists of JH3-JH4, which is homologous to the SH2 domain, and lastly JH5-JH7, which is a FERM domain.
Protein Domain
Name: Peptidase S8A, subtilisin-related, proteobacteria-2
Type: Family
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins and bacterial toxins, amongst others, are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus.
This entry contains unassigned serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB) from the proteobacteria.
Protein Domain
Name: Peptidase S8A, subtilisin-related, bacteroidetes-2
Type: Family
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins and bacterial toxins, amongst others, are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus.
This entry contains unassigned serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB) from the Bacteroidetes.
Protein Domain
Name: Peptidase S8A, subtilisin-related, Moth-2364-type
Type: Family
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins and bacterial toxins, amongst others, are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus.
This entry contains serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB), which include unassigned peptidases from Moorella thermoacetica (Clostridium thermoaceticum), Syntrophomonas wolfei and Pelotomaculum thermopropionicum.
Protein Domain
Name: CIDE-N domain
Type: Domain
Description: The CIDE-N or CAD domain is a ~78 amino acid protein-protein interaction domain in the N-terminal part of Cell death-Inducing DFF45-like Effector (CIDE) proteins, which are involved in apoptosis. At the final stage of programmed cell death, chromosomal DNA is degraded into fragments by Caspase-activated DNase (CAD), also named DNA fragmentation factor 40kDa (DFF40). In normal cells, CAD/DFF40 is completely inhibited by its binding to DFF45, also named Inhibitor of CAD (ICAD). Apoptotic stimuli provoke cleavage of ICAD/DFF45 by caspases, resulting in self-assembly of CAD/DFF40 into the active dimer.
Both CAD/DFF40 and ICAD/DFF45 possess an N-terminal CIDE-N domain that is involved in their interaction. The name of the CIDE-N domain refers to the CIDE proteins and CAD, where the domain forms the N-terminal part. The CIDE-N domains from different proteins can interact, e.g. CIDE-N of CIDE-B and ICAD/DFF45 with CIDE-N of CAD/DFF40, and such interactions can also be needed for proper folding.
Tertiary structures show that the CIDE-N domain forms an alpha/beta roll fold of five β-strands forming a single, mixed parallel/anti-parallel β-sheet with one or two α-helices packed against the sheet. Binding surfaces of the CIDE-N domain form a central hydrophobic cluster, while specific binding interfaces can be formed by charged patches.
Some proteins known to contain a CIDE-N domain include:
  • Mammalian DNA fragmentation factor 40kDa (DFF40) or Caspase-activated deoxyribonuclease (CAD), an endonuclease that induces DNA fragmentation and chromatin condensation during apoptosis. The degradation of chromosomal DNA by CAD/DFF40 will kill the cells.
  • Mammalian DNA fragmentation factor 45kDa (DFF45) or Inhibitor of CAD (ICAD), which controls the activity and proper folding of CAD/DFF40.
  • Mammalian CIDE-A and CIDE-B, activators of cell death and DNA fragmentation that can be inhibited by ICAD/DFF45. In contrast with CAD and ICAD, the CIDE proteins are expressed in a highly restricted way and show pronounced tissue specificity.
  • Fruit fly DNA fragmentation factor DREP1, a DFF45 homologue that can inhibit CIDE-A-induced apoptosis.
Protein Domain
Name: Peptidase S8A, subtilisin-related, proteobacteria
Type: Family
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans, and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.
Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins and bacterial toxins, amongst others, are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C terminus.
This entry contains unassigned serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB).
Protein Domain
Name: DNA endonuclease activator Ctp1, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of the fission yeast Ctip (Ctp1) protein. Proteins containing this domain include DNA endonuclease RBBP8 (also known as CtBP-interacting protein, CtIP) from animals, protein gamma response 1 (GR1) from Arabidopsis and SAE2 from S. cerevisiae. SAE2 is a protein involved in repairing meiotic and mitotic double-strand breaks in DNA.
Although proteins containing this domain were described as endonucleases, it is now known that they actually function as endonuclease activators that cooperate with the MRE11-RAD50-NBN (MRN) complex in processing meiotic and mitotic double-strand breaks (DSBs) by ensuring both resection and intrachromosomal association of the broken ends. This domain contains highly conserved residues at its 15-residue extreme that are indispensable for MRN (Mre11-Rad50-Nbs1) complex activation, through the stimulation of Mre11 endonuclease activity.
Protein Domain
Name: RWD domain
Type: Domain
Description: The RWD domain is a conserved region of about 110 amino acid residues, which has been identified in the mouse GCN2 eIF2alpha kinase and histidyl-tRNA synthetase and in presumed orthologues in other eukaryotic species from yeast to vertebrates. It is also found in WD repeat-containing proteins, yeast DEAD (DEXD)-like helicases, many RING finger-containing proteins, the UPF0029 uncharacterised protein family and a range of hypothetical proteins. The RWD domain has been named after the better characterised RING finger and WD repeat-containing proteins and DEAD-like helicases. It has been proposed that the RWD domain might have a function in protein interaction. The RWD domain is predicted to have an alpha/beta secondary structure and is thought to be related to the ubiquitin-conjugating enzyme (UBCc) domain, although the catalytic cysteine critical for ubiquitin-conjugating activity is not conserved in most members of the novel subfamily.
Protein Domain
Name: Sugar phosphate transporter
Type: Family
Description: Proteins in this group are involved in the transport system that mediates the uptake of a number of sugar phosphates as well as the regulatory components that are responsible for induction of this transport system by external glucose 6-phosphate. In Escherichia coli its role in transmembrane signalling may involve sugar-phosphate-binding sites and transmembrane orientations similar to those of the transport protein.
The following proteins in this entry, involved in the uptake of phosphorylated metabolites, are evolutionarily related:
  • E. coli, Bacillus subtilis and Haemophilus influenzae glycerol-3-phosphate transporter (gene glpT).
  • Salmonella typhimurium phosphoglycerate transporter (gene pgtP).
  • E. coli and S. typhimurium hexose-6-phosphate transporter (gene uhpT).
  • E. coli and S. typhimurium protein uhpC. UhpC is necessary for the expression of uhpT and seems to act jointly with the uhpB sensor/kinase protein.
  • Human glucose 6-phosphate translocase.
These proteins of about 50 kDa apparently contain 12 transmembrane regions.
Protein Domain
Name: Peptidase S26A, signal peptidase I
Type: Family
Description: This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A.
At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane. Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric.
Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space. The catalytic residues of a number of these peptidases have been identified as a serine/lysine dyad.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom