Search results 901 to 1000 out of 38750 for *

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Homologous recombination OB-fold protein
Type: Family
Description: During homologous recombination, HROB acts by recruiting the MCM8-MCM9 helicase complex to sites of DNA damage to promote DNA repair synthesis [ , ].
Protein Domain
Name: Glycosyl transferase, ALG6/ALG8
Type: Family
Description: N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue, and Alg8 transfers the second one [ ]. In the human alg6 gene, a C-T transition, which causes Ala333 to be replaced with Val, has been identified as the cause of a congenital disorder of glycosylation, designated as type Ic (OMIM:603147) [ ].
Protein Domain
Name: Transporter protein SLAC1/Mae1/Ssu1/TehA
Type: Family
Description: Each of these transporters has ten alpha helical transmembrane segments [ ]. The structure of a bacterial homologue of SLAC1 shows it to have a trimeric arrangement. The pore is composed of five helices with a conserved Phe residue involved in gating. One homologue, Mae1 from the yeast Schizosaccharomyces pombe, functions as a malate uptake transporter; another, Ssu1 from Saccharomyces cerevisiae and other fungi including Aspergillus fumigatus, is characterised as a sulfite efflux pump; and TehA from Escherichia coli is identified as a tellurite resistance protein by virtue of its association in the tehA/tehB operon. In plants, homologues are found in the stomatal guard cells functioning as an anion-transporting pore []. Many homologues are incorrectly annotated as tellurite resistance or dicarboxylate transporter (TDT) proteins.
Protein Domain
Name: Copine, C-terminal
Type: Domain
Description: This represents a conserved region approximately 180 residues long within eukaryotic copines. Copines are Ca2+-dependent phospholipid-binding proteins that are thought to be involved in membrane-trafficking, and may also be involved in cell division and growth [ ]. They were originally identified in Paramecium. They are found in human, and orthologues have been found in C. elegans and Arabidopsis thaliana. None have been found in D. melanogaster or S. cerevisiae. Phylogenetic distribution suggests that copines have been lost in some eukaryotes []. No functional properties have been assigned to the VWA domains present in copines. The members of this subgroup contain a functional MIDAS motif based on their preferential binding to magnesium and manganese. However, the MIDAS motif is not totally conserved: in most members it consists of the sequence DxTxS instead of the more common DxSxS motif. The C2 domains present in copines mediate phospholipid binding [, ].
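The two MIDAS variants named above (DxSxS and DxTxS, where x is any residue) can be written as a single regular expression. The following is a minimal sketch only: the function name `find_midas_like` and the test sequences are hypothetical, and a short pattern like this will match many unrelated sequences, so a hit is not evidence of a functional metal-binding site.

```python
import re

# D, any residue, S or T, any residue, S  (covers both DxSxS and DxTxS)
MIDAS_VARIANTS = re.compile(r"D.([ST]).S")

def find_midas_like(sequence):
    """Return (start, matched text, variant) for each non-overlapping match."""
    hits = []
    for m in MIDAS_VARIANTS.finditer(sequence):
        # The captured third position tells the two variants apart.
        variant = "DxSxS" if m.group(1) == "S" else "DxTxS"
        hits.append((m.start(), m.group(0), variant))
    return hits
```

Positions are zero-based, and overlapping matches are not reported because `finditer` scans left to right without backtracking over consumed text.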
Protein Domain
Name: von Willebrand factor, type A
Type: Domain
Description: The von Willebrand factor is a large multimeric glycoprotein found in blood plasma. Mutant forms are involved in the aetiology of bleeding disorders []. In von Willebrand factor, the type A domain (vWF) is the prototype for a protein superfamily. The vWF domain is found in various plasma proteins: complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen types VI, VII, XII and XIV; and other extracellular proteins [, , ]. Although the majority of VWA-containing proteins are extracellular, the most ancient ones present in all eukaryotes are all intracellular proteins involved in functions such as transcription, DNA repair, ribosomal and membrane transport and the proteasome. A common feature appears to be involvement in multiprotein complexes. Proteins that incorporate vWF domains participate in numerous biological events (e.g. cell adhesion, migration, homing, pattern formation, and signal transduction), involving interaction with a large array of ligands []. A number of human diseases arise from mutations in VWA domains. Secondary structure prediction from 75 aligned vWF sequences has revealed a largely alternating sequence of α-helices and β-strands [ ]. The vWF domain fold is predicted to be a doubly-wound, open, twisted β-sheet flanked by α-helices []. 3D structures have been determined for the I-domains of integrins alpha-M (CD11b; with bound magnesium) [ ] and alpha-L (CD11a; with bound manganese) []. The domain adopts a classic α/β Rossmann fold and contains an unusual metal ion coordination site at its surface. It has been suggested that this site represents a general metal ion-dependent adhesion site (MIDAS) for binding protein ligands []. The residues constituting the MIDAS motif in the CD11b and CD11a I-domains are completely conserved, but the manner in which the metal ion is coordinated differs slightly [].
Protein Domain
Name: Clp protease, ATP-binding subunit ClpX
Type: Family
Description: ClpX is a member of the HSP (heat-shock protein) 100 family. Gel filtration and electron microscopy showed that ClpX subunits associate to form a six-membered ring that is stabilised by binding of ATP or nonhydrolysable analogs of ATP [ ]. It functions as an ATP-dependent [] molecular chaperone and is the regulatory subunit of the ClpXP protease []. ClpXP is involved in DNA damage repair, stationary-phase gene expression, and ssrA-mediated protein quality control. To date, more than 50 proteins, including transcription factors, metabolic enzymes, and proteins involved in the starvation and oxidative stress responses, have been identified as substrates []. The N-terminal domain of ClpX is a C4-type zinc binding domain (ZBD) involved in substrate recognition. ZBD forms a very stable dimer that is essential for promoting the degradation of some typical ClpXP substrates such as λO and MuA [ ].
Protein Domain
Name: CS domain
Type: Domain
Description: The bipartite CS domain, which was named after CHORD-containing proteins and SGT1 [ ], is a ~100-residue protein-protein interaction module. The CS domain can be found in stand-alone form, as well as fused with other domains, such as CHORD (), SGS ( ), TPR ( ), cytochrome b5 ( ) or b5 reductase, in multidomain proteins [ ]. The CS domain has a compact antiparallel β-sandwich fold consisting of seven β-strands [, ]. Some proteins known to contain a CS domain are listed below []: Eukaryotic proteins of the SGT1 family. Eukaryotic Rar1, related to pathogenic resistance in plants, and to development in animals. Eukaryotic nuclear movement protein nudC. Eukaryotic proteins of the p23/wos2 family, which act as co-chaperones. Animal b5+b5R flavo-hemo cytochrome NAD(P)H oxidoreductase type B. Mammalian integrin beta-1-binding protein 2 (melusin).
Protein Domain
Name: SGS domain
Type: Domain
Description: The SGT1-specific (SGS) domain is a module of ~90 amino acids, which was initially identified in eukaryotic Sgt1 proteins []. It was later also found in calcyclin-binding proteins []. The SGS domain has been shown to bind to proteins of the S100 family, which are thought to function as sensors of calcium ion concentration in the cell []. In budding yeasts, Sgt1 is required for SCF (Skp1p/Cdc53p-Cullin-F-box)-mediated ubiquitination, cyclic AMP pathway activity and kinetochore function [ ]. Its Schizosaccharomyces pombe homologue, Git7, is required for glucose and cyclic AMP signaling, cell wall integrity, and septation []. Its two homologues in Arabidopsis, SGT1a and SGT1b, can complement two yeast temperature-sensitive sgt1 mutant alleles, suggesting that the fundamental cellular function(s) of yeast SGT1 in SCF-mediated protein ubiquitylation are conserved. Moreover, SGT1a and SGT1b can act as cochaperones with HSP90 and HSC70 and function in regulating multiple resistance (R) genes and environmental responses [, , , ]. The SGS domain of SGT1 is a key determinant of the HSC70-SGT1 association []. Calcyclin (S100A6) is a member of the S100A family of calcium binding proteins and appears to play a role in cell proliferation [ ].
Protein Domain
Name: Aminotransferase class-III
Type: Family
Description: Aminotransferases share certain mechanistic features with other pyridoxalphosphate-dependent enzymes, such as the covalent binding of the pyridoxalphosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped [ ] into subfamilies. One of these, called class-III, includes acetylornithine aminotransferase (), which catalyses the transfer of an amino group from acetylornithine to alpha-ketoglutarate, yielding N-acetyl-glutamic-5-semi-aldehyde and glutamic acid [ ]; ornithine aminotransferase (), which catalyses the transfer of an amino group from ornithine to alpha-ketoglutarate, yielding glutamic-5-semi-aldehyde and glutamic acid [ ]; omega-amino acid--pyruvate aminotransferase (), which catalyses transamination between a variety of omega-amino acids, mono- and diamines, and pyruvate [ ]; 4-aminobutyrate aminotransferase () (GABA transaminase), which catalyses the transfer of an amino group from GABA to alpha-ketoglutarate, yielding succinate semialdehyde and glutamic acid [ ]; DAPA aminotransferase (), a bacterial enzyme (bioA), which catalyses an intermediate step in the biosynthesis of biotin, the transamination of 7-keto-8-aminopelargonic acid to form 7,8-diaminopelargonic acid [ ]; 2,2-dialkylglycine decarboxylase (), a Burkholderia cepacia (Pseudomonas cepacia) enzyme (dgdA) that catalyses the decarboxylating amino transfer of 2,2-dialkylglycine and pyruvate to dialkyl ketone, alanine and carbon dioxide [ ]; glutamate-1-semialdehyde aminotransferase () (GSA) [ ]; Bacillus subtilis aminotransferases yhxA and yodT; Haemophilus influenzae diaminobutyrate--2-oxoglutarate aminotransferase (HI0949) []; and Caenorhabditis elegans alanine--glyoxylate aminotransferase 2-like (T01B11.2).
Protein Domain
Name: Ornithine aminotransferase
Type: Family
Description: Ornithine aminotransferase catalyses the conversion of L-ornithine and a 2-oxo acid to L-glutamate 5-semialdehyde and an L-amino acid. This enzyme is found in low-GC bacteria, where it is responsible for the fourth step in arginine biosynthesis, and in the mitochondrial matrix of eukaryotes, where it controls L-ornithine levels in tissues. In human hereditary ornithine aminotransferase deficiency, elevated intraocular concentrations of ornithine are responsible for gyrate atrophy, which affects the CNS and peripheral nervous system [ ].
Protein Domain
Name: Reticulon
Type: Domain
Description: Eukaryotic proteins of the reticulon (RTN) family all share an association with the endoplasmic reticulum (ER). Whereas amino-terminal regions are not related to one another, all reticulon proteins share a 200 amino acid residue region of sequence similarity at the C terminus. This region contains two large hydrophobic regions separated by a 66 residue hydrophilic segment. The conserved hydrophobic C-terminal portion has been shown to play an essential role in the association of reticulons with the ER membrane. The hydrophobic portions are thought to be membrane-embedded and the 66 residue hydrophilic segment to be localised to the lumenal/extracellular face of the membrane. Most reticulons have a di-lysine ER retention motif at the C terminus. Because of their likely association with the rough as well as the smooth ER, the reticulons might play some role in transport processes or in regulation of intracellular calcium levels. It has been suggested that the reticulons may be serving as ER-associated channel-like complexes [ , , , ].
Protein Domain
Name: Transcription factor TCP subgroup
Type: Domain
Description: The TCP domain has been named after its first characterised members (TB1, CYC and PCFs). So far, members of the TCP family have only been found in plants and function in processes related to cell proliferation. The TCP domain is probably involved in DNA-binding and protein-protein interactions [ ]. The TCP domain is predicted to form a non-canonical basic-Helix-Loop-Helix (bHLH). The main conserved features of the TCP domain are: two short stretches of residues in the basic region, hydrophobic residues along the apolar face of both α-helices, a tryptophan in helix II, and a helix-breaking glycine in the loop between the helices. However, the residues in the loop and the hydrophilic residues of the helices are not as well conserved. TCP domains form two subfamilies: one closely related to CYC and TB1, and another more related to the PCFs. The basic region of the CYC/TB1 subfamily contains a putative bipartite nuclear localisation signal (NLS) while the basic region of the PCF subfamily contains only a portion of a bipartite NLS [ ].
Protein Domain
Name: Transcription factor, TCP
Type: Family
Description: The TCP transcription factor family was named after: teosinte branched 1 (tb1, Zea mays (Maize)) [ ], cycloidea (cyc) (Antirrhinum majus) (Garden snapdragon) [] and PCF in rice (Oryza sativa) [, ]. The TCP genes code for structurally related proteins implicated in the evolution of key morphological traits []. However, the biochemical function of CYC and TB1 proteins remains to be demonstrated. One of the conserved regions is predicted to form a non-canonical basic-Helix-Loop-Helix (bHLH) structure. This domain is also found in two rice DNA-binding proteins, PCF1 and PCF2, where it has been shown to be involved in DNA-binding and dimerization. This family of transcription factors is exclusive to higher plants. They can be divided into two groups, TCP-C and TCP-P, that appear to have separated following an early gene duplication event [ ]. This duplication event may have led to functional divergence and it has been proposed that the TCP-P subfamily are transcriptional repressors, while the TCP-C subfamily are transcription activators [].
Protein Domain
Name: CYC/TB1, R domain
Type: Domain
Description: Members of the TCP family of transcription factors have so far only been found in plants, where they are implicated in processes related to cell proliferation. It appears that TCP domain (see ) proteins have been recruited during evolution to control cell division and growth in various developmental processes. The TCP proteins fall into two subfamilies, one including CYC and TB1 and the other including the PCFs. Most members of the CYC/TB1 subfamily have an R domain, predicted to form a coiled coil that may mediate protein-protein interactions [ , ]. The R domain is rich in polar residues (arginine, lysine and glutamic acid) and is predicted to form a hydrophilic α-helix [ ]. Some proteins known to contain an R domain are listed below: Antirrhinum majus (Garden snapdragon) cycloidea (CYC), which is involved in the control of floral symmetry, a character that has changed many times during plant evolution; Zea mays (Maize) teosinte branched 1 (TB1), which controls the development of apical dominance that contributed to the evolution of modern day maize from its wild ancestor teosinte; and Arabidopsis thaliana (Mouse-ear cress) TCP2 and TCP3, which correlate with actively dividing regions of the floral meristem.
Protein Domain
Name: ABC-2 type transporter
Type: Domain
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [ ]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [, , ]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [ , , , , , ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [ ]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]. A number of bacterial transport systems have been found to contain integral membrane components that have similar sequences []: these systems fit the characteristics of ATP-binding cassette transporters [ ]. 
The proteins form homo- or hetero-oligomeric channels, allowing ATP-mediated transport. Hydropathy analysis of the proteins has revealed the presence of 6 possible transmembrane regions. These proteins belong to family 2 of ABC transporters.
Protein Domain
Name: FAD-binding domain, ferredoxin reductase-type
Type: Domain
Description: Flavoenzymes have the ability to catalyse a wide range of biochemical reactions. They are involved in the dehydrogenation of a variety of metabolites, in electron transfer from and to redox centres, in light emission, and in the activation of oxygen for oxidation and hydroxylation reactions [ ]. About 1% of all eukaryotic and prokaryotic proteins are predicted to encode a flavin adenine dinucleotide (FAD)-binding domain []. According to structural similarities and conserved sequence motifs, FAD-binding domains have been grouped in three main families: (i) the ferredoxin reductase (FR)-type FAD-binding domain, (ii) the FAD-binding domains that adopt a Rossmann fold and (iii) the PCMH-type FAD-binding domain [ ]. The FAD cofactor consists of adenosine monophosphate (AMP) linked to flavin mononucleotide (FMN) by a pyrophosphate bond. The AMP moiety is composed of the adenine ring bonded to a ribose that is linked to a phosphate group. The FMN moiety is composed of the isoalloxazine-flavin ring linked to a ribitol, which is connected to a phosphate group. The flavin functions mainly in a redox capacity, being able to take up two electrons from one substrate and release them two at a time to a substrate or coenzyme, or one at a time to an electron acceptor. The catalytic function of the FAD is concentrated in the isoalloxazine ring, whereas the ribityl phosphate and the AMP moiety mainly stabilise cofactor binding to protein residues [ ]. The structural core of all FR family members is well conserved. The FAD-binding fold characteristic of the FR family is a cylindrical β-domain with a flattened six-stranded antiparallel β-barrel organised into two orthogonal sheets (B1-B2-B5 and B4-B3-B6) separated by one α-helix [ ]. The cylinder is open between strands B4 and B5 which makes space for the isoalloxazine and ribityl moieties of the FAD. 
One end of the cylinder is covered by the only helix of the domain, which is essential for the binding of the pyrophosphate groups of the FAD. The FR family contains two conserved motifs. One (R-x-Y-[ST]) is located in B4, where the invariant positively charged Arg residue forms hydrogen bonds to the negative pyrophosphate oxygen atom. The other conserved sequence motif is G-x(2)-[ST]-x(2)-L-x(5)-G-x(7)-P-x-G, which is part of H1-B6 and is known as the phosphate-binding motif [, ].
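The two FR-family motifs above are written in PROSITE-style notation, where x is any residue, [ST] is a choice of residues, and x(5) is a fixed repeat. A minimal sketch of how such a pattern could be turned into a regular expression for scanning a sequence is given below. The helper name `prosite_to_regex` is hypothetical, it handles only the three constructs used here (no ranges such as x(2,4), no exclusions or anchors), and the sequences in the usage note are synthetic strings, not real proteins.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        # Separate an optional fixed repeat count such as "(2)" from the element.
        match = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element)
        residue, count = match.group(1), match.group(2)
        # "x" means any residue; bracketed sets and single letters carry over as-is.
        token = "." if residue == "x" else residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)

# The two motifs quoted in the description above.
FR_MOTIF_B4 = prosite_to_regex("R-x-Y-[ST]")
PHOSPHATE_BINDING = prosite_to_regex("G-x(2)-[ST]-x(2)-L-x(5)-G-x(7)-P-x-G")
```

For example, `re.search(FR_MOTIF_B4, "AARAYSAA")` finds a match starting at index 2. Short motifs such as R-x-Y-[ST] occur by chance in many sequences, so a match on its own does not indicate an FR-type FAD-binding domain.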
Protein Domain
Name: Ferric reductase, NAD binding domain
Type: Domain
Description: This entry contains ferric reductase NAD binding proteins.
Protein Domain
Name: Cytochrome b245, heavy chain
Type: Family
Description: Phagocytes form the first line of defence against invasion by micro-organisms. Engulfing of bacteria by neutrophils during phagocytosis is accompanied by a respiratory burst. Defects in phagocytosis involving the lack of a respiratory burst give rise to chronic granulomatous disease (CGD) [ ]. Regulation of the respiratory burst takes place at the phagocytic vacuole. The process is mediated by NADPH oxidase, which transports electrons across the plasma membrane to form superoxide in the vacuole interior. The electrons are carried across the membrane by a short electron transport chain in the form of an unusual flavocytochrome b. The flavoprotein comprises two subunits, p21phox and gp91phox. Gp91phox has 2 major domains: an N-terminal, hydrophobic domain with a number of putative transmembrane helices that could associate to form a barrel-like pore in the membrane; and a more hydrophilic C-terminal domain, which probably lies on the cytosolic side of the membrane, capping the transmembrane structure [ ]. The C-terminal domain is similar to a number of electron-transport proteins, one of which, ferredoxin-NADP reductase, has provided a structural framework upon which to model this domain []. Around two thirds of individuals with CGD inherit the disease in an X-linked, recessive manner. The gene responsible for X-linked CGD is that coding for the large subunit of flavocytochrome b, gp91phox.
Protein Domain
Name: FAD-binding 8
Type: Domain
Description: This FAD binding domain is associated with ferric reductase NAD binding proteins and the heavy chain of Cytochrome b-245.
Protein Domain
Name: Multi antimicrobial extrusion protein
Type: Family
Description: In general, proteins from the MATE family are involved in exporting metabolites across the cell membrane and are often responsible for multidrug resistance (MDR) [ , ]. These proteins mediate resistance to a wide range of cationic dyes, fluoroquinolones, aminoglycosides and other structurally diverse antibiotics and drugs. MATE proteins are found in bacteria, archaea and eukaryotes. These proteins are predicted to have 12 α-helical transmembrane regions; some of the animal proteins may have an additional C-terminal helix [].
Protein Domain
Name: Ribosomal protein L38e
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. Ribosomal protein L38e forms part of the 60S ribosomal subunit [ ]. This family is found in eukaryotes.
Protein Domain
Name: Auxin response factor domain
Type: Domain
Description: This pattern represents a conserved region of auxin-responsive transcription factors. The plant hormone auxin (indole-3-acetic acid) can regulate the gene expression of several families, including Aux/IAA, GH3 and SAUR families. Two related families of proteins, Aux/IAA proteins () and the auxin response factors (ARF), are key regulators of auxin-modulated gene expression [ ]. There are multiple ARF proteins, some of which activate, while others repress transcription. ARF proteins bind to auxin-responsive cis-acting promoter elements (AuxREs) using an N-terminal DNA-binding domain. It is thought that Aux/IAA proteins activate transcription by modifying ARF activity through the C-terminal protein-protein interaction domains found in both Aux/IAA and ARF proteins.
Protein Domain
Name: AUX/IAA protein
Type: Family
Description: The Aux/IAA proteins are key regulators of auxin-modulated gene expression [ ]. The plant hormone auxin (indole-3-acetic acid, IAA) regulates diverse cellular and developmental responses in plants, including cell division, expansion, differentiation and patterning of embryo responses []. Auxin can regulate the gene expression of several families, including GH3 and SAUR, as well as Aux/IAA itself. The Aux/IAA proteins act as repressors of auxin-induced gene expression, possibly through modulating the activity of DNA-binding auxin response factors (ARFs). Aux/IAA and ARF are thought to interact through C-terminal protein-protein interaction domains found in both Aux/IAA and ARF. Recent evidence suggests that Aux/IAA proteins can also mediate light responses [ ]. Some members of the Aux/IAA family are longer and contain an N-terminal DNA binding domain [] and may have an early function in the establishment of vascular and body patterns in embryonic and post-embryonic development in some plants.
Protein Domain
Name: B3 DNA binding domain
Type: Domain
Description: Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana, contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain (see ) present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors [ ]. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations []. This entry represents the B3 DNA binding domain. Its DNA binding activity has been demonstrated [ ]. The B3 domain can be found in one or more copies.
Protein Domain
Name: Glycosyl transferase, family 31
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 31 ( ) comprises enzymes with a number of known activities: N-acetyllactosaminide beta-1,3-N-acetylglucosaminyltransferase ( ); beta-1,3-galactosyltransferase ( ); fucose-specific beta-1,3-N-acetylglucosaminyltransferase ( ); globotriosylceramide beta-1,3-GalNAc transferase ( ) [ , ].
Protein Domain
Name: Galectin, carbohydrate recognition domain
Type: Domain
Description: Galectins (also known as galaptins or S-lectin) are a family of proteins defined by having at least one characteristic carbohydrate recognition domain (CRD) with an affinity for beta-galactosides and sharing certain sequence elements. Members of the galectin family are found in mammals, birds, amphibians, fish, nematodes, sponges, and some fungi. Galectins are known to carry out intra- and extracellular functions through glycoconjugate-mediated recognition. From the cytosol they may be secreted by non-classical pathways, but they may also be targeted to the nucleus or specific sub-cytosolic sites. Within the same peptide chain some galectins have a CRD with only a few additional amino acids, whereas others have two CRDs joined by a link peptide, and one (galectin-3) has one CRD joined to a different type of domain [ , ]. The galectin carbohydrate recognition domain (CRD) is a β-sandwich of about 135 amino acids. The two sheets are slightly bent with 6 strands forming the concave side and 5 strands forming the convex side. The concave side forms a groove in which carbohydrate is bound, and which is long enough to hold approximately a linear tetrasaccharide [ , ].
Protein Domain
Name: Universal stress protein A family
Type: Family
Description: Transcriptional induction of the uspA gene of Escherichia coli occurs when conditions cause growth arrest; cells deficient in UspA survive poorly in stationary phase [ ]. The product of uspA has been shown to be a cytoplasmic serine and threonine phosphoprotein. Members of the Usp family are predicted to be related to the MADS-box proteins and bind to DNA [ ]. Some members of the family contain 2 copies of the domain. The structure of a UspA homologue from Methanocaldococcus jannaschii (Methanococcus jannaschii) has been determined to 1.8 angstroms resolution by using its selenomethionyl derivative and multiwavelength anomalous diffraction. The protein homodimerises in the crystal; each monomer adopts an open-twisted 5-stranded parallel β-sheet with 2 helices on each side of the sheet []. Although the structure co-crystallised with ATP, the function of the protein is unknown. This entry includes five E. coli Usp paralogues: uspA, uspC, uspD, uspF and uspG [ ].
Protein Domain
Name: UspA
Type: Domain
Description: This entry represents a domain found in the universal stress protein UspA [ ], which is a small cytoplasmic bacterial protein whose expression is enhanced when the cell is exposed to stress agents. UspA enhances the rate of cell survival during prolonged exposure to such conditions, and may provide a general "stress endurance" activity. The crystal structure of Haemophilus influenzae UspA [ ] reveals an alpha/beta fold similar to that of the Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0577 protein, which binds ATP [ ], though UspA lacks ATP-binding activity. Proteins containing this domain include the TeaD protein from Halomonas elongata. TeaD regulates ectoine uptake by the transporter TeaABC and shows ATP-dependent oligomerisation [].
Protein Domain
Name: PWWP domain
Type: Domain
Description: The PWWP domain is a domain of around 70 amino acids that was named after its central core 'Pro-Trp-Trp-Pro'. The PWWP domain is found in one or, less frequently, in two copies in nuclear, often DNA-binding proteins that function as transcription factors regulating developmental processes. Due to its position, the composition of amino acids close to the PWWP motif and the pattern of other domains present, it has been proposed that the PWWP domain is involved in protein-protein interactions [ , ]. The structure of the PWWP domain comprises a five-stranded β-barrel followed by a five-helix bundle []. Conservation of the PWWP domain is concentrated on one major and two minor blocks, with length differences occurring in between [ ].
Protein Domain
Name: Phosphoglycerate kinase, N-terminal
Type: Homologous_superfamily
Description: Phosphoglycerate kinase ( ) (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants. PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase [ ]. At the core of each domain is a 6-stranded parallel β-sheet surrounded by alpha helices. Domain 1 has a parallel β-sheet of six strands with an order of 342156, while domain 2 has a parallel β-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded []. Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man [ ]. This superfamily represents the N-terminal domain of PGK.
Protein Domain
Name: tRNA (guanine-N-7) methyltransferase, Trmb type
Type: Family
Description: This entry represents tRNA (guanine-N-7) methyltransferase ( ), which catalyses the formation of N(7)-methylguanine at position 46 (m7G46) in tRNA. Capping of the pre-mRNA 5' end by addition of a monomethylated guanosine cap (m(7)G) is an essential modification, and the earliest one, in the biogenesis of mRNA [ ]. The reaction is catalysed by three enzymes: triphosphatase, guanylyltransferase, and tRNA (guanine-N-7) methyltransferase [, ]. This entry includes Bacillus subtilis TrmB, which contains a unique variant of the Rossmann-fold methyltransferase (RFM) structure, with the N-terminal helix folded on the opposite side of the catalytic domain [ ]. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group most frequently from S-adenosyl L-methionine (SAM or AdoMet) to a nucleophilic acceptor such as oxygen, leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule [, , ]. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM []. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids [, , ]. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [, , ]. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor.
Protein Domain
Name: Phosphoglycerate kinase
Type: Family
Description: Phosphoglycerate kinase ( ) (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants. PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase [ ]. At the core of each domain is a 6-stranded parallel β-sheet surrounded by alpha helices. Domain 1 has a parallel β-sheet of six strands with an order of 342156, while domain 2 has a parallel β-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded []. Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man [ ].
Protein Domain
Name: ATP-dependent RNA helicase DEAD-box, conserved site
Type: Conserved_site
Description: A number of eukaryotic and prokaryotic proteins involved in ATP-dependent, nucleic-acid unwinding have been characterised [ , , ] on the basis of their structural similarity. All these proteins share a number of conserved sequence motifs. Some of them are specific to this family while others are shared by other ATP-binding proteins or by proteins belonging to the helicase 'superfamily'. One of these motifs, called the 'D-E-A-D-box', represents a special version of the B motif of ATP-binding proteins. Proteins currently known to belong to this family include eukaryotic initiation factor eIF-4A; yeast PRP5, PRP28 and MSS116 splicing proteins, and proteins DHH1, DRS1, MAK5 and ROK1; mouse Pl10; Caenorhabditis elegans helicase glh-1; Drosophila Rm62 (p62), Me31B and Vasa; and Escherichia coli putative RNA helicases dbpA, deaD, rhlB and rhlE.
Protein Domain
Name: WW domain
Type: Domain
Description: Synonym(s): Rsp5 or WWP domain. The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded β-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins [ , , , ]. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs [, ]. It is frequently associated with other domains typical for proteins in signal transduction processes. A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organisation; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; and Nicotiana tabacum (Common tobacco) DB10 protein, amongst others.
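The proline-rich ligand motif quoted in the WW domain description, [AP]-P-P-[AP]-Y, translates directly into a regular expression. The following is only a minimal sketch for scanning a one-letter amino acid sequence for that motif; the function name and the example sequences are hypothetical and do not come from the database entry.

```python
import re

# [AP]-P-P-[AP]-Y from the description, written as a regular expression.
# The lookahead lets overlapping occurrences be reported as well.
LIGAND_MOTIF = re.compile(r"(?=([AP]PP[AP]Y))")

def find_ww_ligand_motifs(sequence):
    """Return (0-based start, matched text) for each motif occurrence.

    `sequence` is assumed to be a protein sequence in one-letter code.
    """
    return [(m.start(), m.group(1)) for m in LIGAND_MOTIF.finditer(sequence.upper())]

if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    print(find_ww_ligand_motifs("GSPPPPYTLD"))
```

A hit only indicates that the sequence contains the motif; it says nothing about whether a WW domain actually binds there.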
Protein Domain
Name: Enhancer of polycomb protein
Type: Family
Description: The enhancer of polycomb gene of Drosophila encodes a chromatin protein conserved in yeast and mammals [ ]. The homologous yeast protein, known as enhancer of polycomb-like protein 1 (Epl1), is a subunit of the NuA4 histone acetyltransferase (HAT) complex, which is involved in transcriptional activation of selected genes principally by acetylation of nucleosomal histone H4 and H2A []. Epl1 is also found in a novel highly active smaller complex named Piccolo NuA4 (picNuA4), which strongly prefers chromatin over free histones as substrate []. The NuA4 HAT complex is highly conserved in eukaryotes, and it plays primary roles in transcription, cellular response to DNA damage, and cell cycle control []. This entry represents all eukaryotic enhancer of polycomb proteins, including enhancer of polycomb-like proteins.
Protein Domain
Name: Enhancer of polycomb-like, N-terminal
Type: Domain
Description: This domain is found at the N terminus of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes [ ]. The domain is also present at the N terminus of Jade family proteins.
Protein Domain
Name: Malectin-like domain
Type: Domain
Description: Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognises and binds Glc2-N-glycan [ ]. The domain is found on a number of plant receptor kinases and is distantly related to malectin domains.
Protein Domain
Name: Response regulator B-type, plant
Type: Family
Description: Members of this group are plant response regulators of the B type. B-type plant response regulators most closely resemble the classical microbial response regulators. Classical two-component signal transduction systems--consisting of a histidine protein kinase (HK) to sense signal input and a response regulator (RR) to mediate output--are widespread in prokaryotes. Their counterparts are also found in eukaryotes, indicating that they represent an ancient and evolutionarily conserved signalling mechanism. In plants, two-component systems are involved in phytohormone, stress, and light signalling [ , ]. Plant response regulators (called ARRs in Arabidopsis thaliana (Mouse-ear cress)) fall into three distinct families based on domain architecture: A-type RRs are stand-alone receiver domains; B-type RRs contain an N-terminal receiver domain fused to a Myb-like DNA-binding domain and a variable C-terminal domain; pseudo-response regulators contain an atypical receiver domain. The classical microbial RRs consist of an N-terminal CheY-like receiver (phosphoacceptor) domain and a C-terminal output (usually DNA-binding) domain. In a typical microbial signal transduction system, in response to an environmental stimulus, a phosphoryl group is transferred from the His residue of sensor histidine kinase to an Asp residue in the CheY-like receiver domain of the cognate response regulator [ , , ]. Phosphorylation of the receiver domain induces conformational changes that activate an associated output domain, which in turn triggers the response. Phosphorylation-induced conformational changes in response regulator molecules have been demonstrated in direct structural studies []. The output domain of B-type plant RRs is a central Myb-like DNA-binding domain (with the B, or GARP, motif) [ , , ] which is not found in two-component prokaryotic systems. 
This domain is believed to be responsible for the promoter-binding and transcription factor activity of the B-type plant RRs [, ]. The B motif contains a helix-turn-helix structure and a potential nuclear localization signal, and is considered to be a multifunctional domain responsible for both nuclear localization and DNA binding []. A variable C-terminal domain may also play a role as part of the output module and provides the basis for defining several small subgroups. The functions of these unique C-terminal domains and the biological significance of the subgroups are unclear.
Protein Domain
Name: MT-A70-like
Type: Family
Description: N6-methyladenosine (m6A) is present at internal sites in some mRNAs. m6A affects different aspects of mRNA metabolism, such as half-life, splicing, and translation [ , , , , ]. MT-A70 (also known as METTL3) is the S-adenosylmethionine-binding subunit of human mRNA N6-adenosine-methyltransferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs. Proteins with sequence similarity to MT-A70 have been identified in eukaryotes and prokaryotes. The resulting family is defined by sequence similarity in the carboxyl-proximal regions of the respective proteins. The amino-proximal regions of the eukaryotic proteins are highly diverse, often Pro-rich, and are conserved only within individual subfamilies [ ]. Corresponding regions are not present in prokaryotic members of the family. MT-A70-like proteins contain examples of some of the consensus methyltransferase motifs that have been derived from mutational and structural studies of bacterial DNA methyltransferases, including the universally conserved motif IV catalytic residues and a proposed motif I (AdoMet binding) element []. The MT-A70-like family comprises four subfamilies with varying degrees of interrelatedness. One subfamily is a small group of bacterial DNA:m6A MTases. The other three are paralogous eukaryotic lineages, two of which have not been associated with MTase activity but include proteins that regulate mRNA levels via unknown mechanisms apparently not involving methylation []. Some proteins known to belong to the MT-A70-like family are listed below: Human N6-adenosine-methyltransferase 70kDa subunit (MT-A70 or METTL3) ( ), the catalytic component of the METTL3-METTL14 heterodimer that forms the N6-methyltransferase complex that methylates adenosine residues at the N6 position of some RNAs [ ]. 
Human N6-adenosine-methyltransferase non-catalytic subunit (METTL14), the non-catalytic component of the METTL3-METTL14 heterodimer. Yeast N6-adenosine-methyltransferase IME4 ( ), which is important for induction of sporulation. Yeast karyogamy protein KAR4, a phosphoprotein that is required for expression of karyogamy-specific genes during mating and that also acts during mitosis and meiosis [ ]. It has been suggested that KAR4 is inactive for methyltransfer and may not even bind AdoMet.
Protein Domain
Name: DNA methylase, N-6 adenine-specific, conserved site
Type: Conserved_site
Description: In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. There are 2 major classes of DNA methyltransferase that differ in the nature of the modifications they effect. The members of one class (C-MTases) methylate a ring carbon and form C5-methylcytosine (see PRINTS signature C5METTRFRASE). Members of the second class (N-MTases) methylate exocyclic nitrogens and form either N4-methylcytosine (N4-MTases) or N6-methyladenine (N6-MTases). Both classes of MTase utilise the cofactor S-adenosyl-L-methionine (SAM) as the methyl donor and are active as monomeric enzymes []. N-6 adenine-specific DNA methylases ( ) (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in type I systems the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognise a specific sequence in DNA and methylate an adenine in that sequence. It has been shown [ , , , ] that A-Mtases contain a conserved motif Asp/Asn-Pro-Pro-Tyr/Phe in their N-terminal section; this conserved region could be involved in substrate binding or in the catalytic activity. The structure of N6-MTase TaqI (M.TaqI) has been resolved to 2.4 Å []. The molecule folds into 2 domains: an N-terminal catalytic domain, which contains the catalytic and cofactor binding sites and comprises a central 9-stranded β-sheet surrounded by 5 helices; and a C-terminal DNA recognition domain, which is formed by 4 small β-sheets and 8 α-helices. The N- and C-terminal domains form a cleft that accommodates the DNA substrate. A classification of N-MTases has been proposed, based on conserved motif (CM) arrangements []. Three such classes are the D12, D21 and N12 classes.
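The conserved motif named in the description, Asp/Asn-Pro-Pro-Tyr/Phe, can be written in one-letter code as [DN]PP[YF]. A minimal sketch of scanning a protein sequence for it follows; the function name and the test peptides are hypothetical, and a four-residue pattern will also occur by chance in unrelated proteins, so a match is at most a hint.

```python
import re

# Asp/Asn-Pro-Pro-Tyr/Phe in one-letter code.
NPPY_MOTIF = re.compile(r"[DN]PP[YF]")

def find_nppy_motifs(sequence):
    """Return (0-based start, matched text) for each [DN]PP[YF] occurrence.

    `sequence` is assumed to be a protein sequence in one-letter code.
    """
    return [(m.start(), m.group(0)) for m in NPPY_MOTIF.finditer(sequence.upper())]

if __name__ == "__main__":
    # Hypothetical peptides, used only to exercise the function.
    for peptide in ("MKTDPPYLA", "AANPPFGG", "MKTAPPYLA"):
        print(peptide, find_nppy_motifs(peptide))
```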
Protein Domain
Name: Sec-independent protein translocase protein TatA/B/E
Type: Family
Description: Translocation of proteins across the two membranes of Gram-negative bacteria can be carried out via a number of routes. Most proteins marked for export carry a secretion signal at their N terminus, and are secreted by the general secretory pathway. The signal peptide is cleaved as they pass through the outer membrane. Other secretion systems include the type III system found in a select group of Gram-negative plant and animal pathogens, and the CagA system of Helicobacter pylori [ ]. In some bacterial species, however, there exists a system that operates independently of the Sec pathway []. It selectively translocates periplasmic-bound molecules that are synthesised with, or are in close association with, "partner" proteins bearing an (S/T)RRXFLK twin arginine motif at the N terminus. The pathway is therefore termed the Twin-Arginine Translocation or TAT system. Surprisingly, the four components that make up the TAT system are structurally and mechanistically related to a pH-dependent import system in plant chloroplast thylakoid membranes []. The gene products responsible for the Sec-independent pathway are called TatA, TatB, TatC and TatE. This entry represents the related TatA, TatB and TatE proteins.
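The (S/T)RRXFLK twin arginine motif in the description, where X stands for any residue, maps onto the regular expression [ST]RR.FLK. Because the description places the motif at the N terminus, the sketch below only searches the start of the sequence. The function name, the `window` default of 60 residues, and the example peptide are illustrative assumptions, not values from the entry.

```python
import re

# (S/T)RRXFLK in one-letter code; "." stands for the variable residue X.
TWIN_ARGININE = re.compile(r"[ST]RR.FLK")

def find_twin_arginine(sequence, window=60):
    """Return the 0-based start of the first motif hit, or None.

    Only the first `window` residues are searched; the default of 60 is an
    arbitrary illustrative choice, not a documented signal-peptide length.
    """
    match = TWIN_ARGININE.search(sequence.upper()[:window])
    return match.start() if match else None

if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    print(find_twin_arginine("MNNSRRDFLKQA"))
```

Dedicated signal-peptide predictors consider far more than this exact consensus, so the sketch should be read as a literal rendering of the quoted motif and nothing more.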
Protein Domain
Name: Zinc finger, C3HC-like
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) and other proteins. NIPA is thought to perform an antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signalling events [ ]. The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. 
The Schizosaccharomyces pombe protein containing this domain () is involved in mRNA export from the nucleus [ ].
Protein Domain
Name: FAD-dependent oxidoreductase 2, FAD binding domain
Type: Domain
Description: This domain is found in proteins that bind FAD, mainly in FAD-dependent oxidoreductase family 2 proteins, such as the flavoprotein subunits of succinate dehydrogenase and fumarate reductase, and aspartate oxidase [].
Protein Domain
Name: Fumarate reductase/succinate dehydrogenase flavoprotein-like, C-terminal
Type: Domain
Description: This entry represents a domain with a spectrin-repeat-like fold consisting of three helices in a closed bundle with a left-handed twist. This domain is found in the succinate dehydrogenase/fumarate reductase oxidoreductase family of proteins, such as:
  • L-aspartate oxidase ( ), a flavoenzyme component of the bacterial quinolinate synthase system that catalyses the conversion of L-aspartate to oxaloacetate, the first step in the de novo biosynthesis of NAD+ [ , ].
  • Fumarate reductase, which is part of the quinol-fumarate reductase (QFR) respiratory complex that catalyses the terminal step of anaerobic respiration when fumarate acts as the terminal electron acceptor [ ].
  • Succinate dehydrogenase (SQR; ), an iron-sulphur flavoenzyme from bacteria that is analogous to the mitochondrial respiratory complex II, forming part of the electron transport pathway from the electron donor (succinate) to the terminal acceptor (ubiquinone) [ ].
  • Adenylylsulphate reductase A subunit ( ), an iron-sulphur flavoenzyme that catalyses the reversible reduction of adenosine-5'-phosphosulphate (APS) to sulphite and AMP [ ].
Protein Domain
Name: L-aspartate oxidase
Type: Family
Description: L-aspartate oxidase is the B protein, NadB, of the quinolinate synthetase complex. Quinolinate synthetase makes a precursor of the pyridine nucleotide portion of NAD.
Protein Domain
Name: Succinate dehydrogenase/fumarate reductase flavoprotein, catalytic domain superfamily
Type: Homologous_superfamily
Description: Succinate:quinone oxidoreductase ( ) refers collectively to succinate:quinone reductase (SQR, or Complex II) and quinol:fumarate reductase (QFR) [ ]. SQR is found in aerobic organisms, and catalyses the oxidation of succinate to fumarate in the citric acid cycle and donates the electrons to quinone in the membrane. QFR can be found in anaerobic cells respiring with fumarate as terminal electron acceptor. SQR and QFR are very similar in composition and structure, despite catalysing opposite reactions in vivo. They are thought to have evolved from a common ancestor, and in Escherichia coli they are capable of functionally replacing each other [ ]. Succinate:quinone oxidoreductases consist of a peripheral domain, exposed to the cytoplasm in bacteria and to the matrix in mitochondria, and a membrane-integral anchor domain that spans the membrane. The peripheral part, which contains the dicarboxylate binding site, is composed of a flavoprotein subunit, with one covalently bound FAD, and an iron-sulphur protein subunit containing three iron-sulphur clusters. The membrane-integral domain functions to anchor the peripheral domain to the membrane and is required for quinone reduction and oxidation. The anchor domain shows the largest variability in composition and primary sequence, being composed either of one large subunit, or two smaller subunits, which may, or may not, contain protoheme groups. The flavoprotein subunit found in both the SQR and QFR enzymes contains an N-terminal domain which binds the FAD cofactor, a central catalytic domain with an unusual fold, and a C-terminal domain whose role is unclear [ , , ]. The dicarboxylate binding site is located between the FAD and catalytic domains. This superfamily represents the catalytic domain of the flavoprotein subunit.
Protein Domain
Name: PDZ domain
Type: Domain
Description: PDZ domains (also known as Discs-large homologous regions (DHR) or GLGF) are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates [ , ]. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences []. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated. PDZ domains consist of 80 to 90 amino acids comprising six β-strands (beta-A to beta-F) and two α-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove, where the ligand acts as an anti-parallel β-strand that interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands [ ].
Protein Domain
Name: Peptidase S1, PA clan
Type: Homologous_superfamily
Description: This superfamily represents a domain found in proteases belonging to the MEROPS peptidase family S1 (clan PA). This domain has a two β-barrel structure. The PA clan contains both cysteine and serine proteases that can be found in plants, animals, fungi, eubacteria, archaea and viruses [ ]. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families [ ]. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. 
The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases []. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues [ ]. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel [ ]. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. 
Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Peptidase S1C
Type: Family
Description: This group of serine peptidases and non-peptidase homologues belongs to the MEROPS peptidase family S1, subfamily S1C (protease Do subfamily, clan PA(S)). A type example is the protease Do from Escherichia coli. Other members of this group include the E. coli htrA gene product (HtrA or DegP protein), which is essential for bacterial survival at temperatures above 42 °C [, ] and for digesting misfolded protein in the periplasm. Mature DegP from E. coli has 448 residues, of which His105, Asp135, and Ser210 form the catalytic triad []. The protein has an N-terminal sequence typical of a leader peptide. Structural analysis indicates that bacterial HtrA is a serine protease belonging to the family of cage-forming proteases and that only unfolded polypeptides can be threaded in extended conformation into the cage to access the proteolytic sites. Disulphide bonds of partially unfolded substrates impede protein breakdown and represent a conformational constraint for entering the inner cavity. This preference for unfolded polypeptides might also be a reason for the ATP-independent mode of action and for the increased proteolytic activity at higher temperatures []. The HtrA family shares a modular architecture composed of an N-terminal segment believed to have regulatory functions, a conserved trypsin-like protease domain, and one or two PDZ domains which mediate specific protein-protein interactions and bind preferentially to the C-terminal three to four residues of the target protein. HtrA belongs to the trypsin clan SA. SA proteases have a two-domain structure with each domain forming a six-stranded barrel. The active site cleft is located at the interface of the two perpendicularly arranged barrel domains. The active site is constructed by several loops located at the C-terminal side of both barrel domains. The functional unit of HtrA appears to be a trimer, which is stabilised exclusively by residues of the protease domains. 
The basic trimer has a funnel-like shape with the protease domains located at its top and the PDZ domains protruding to the outside. Once substrates have been bound, they have to be delivered into the interior of the funnel and the proteolytic sites. In contrast to other protease-chaperone systems, ATP does not drive binding and release of substrates []. The degQ and degS genes of E. coli encode proteins of 455 and 355 residues that are homologues of the DegP protease []. Purified DegQ protein has the properties of a serine endopeptidase, and is processed by the removal of a 27-residue N-terminal signal sequence. Deletion studies suggest that DegQ, like DegP, functions as a periplasmic protease in vivo [ ]. An example of a non-peptidase homologue in this entry is the anti-sigma-I factor RsgI9 from Clostridium thermocellum, which has the catalytic serine replaced with threonine. This entry also includes the membrane transporter protein MamO and the magnetosome formation protease MamE. MamO promotes magnetite nucleation/formation and activates the MamE protease [ , ]. MamE is required for correct localization of proteins to the magnetosome while the protease activity is required for maturation of small magnetite crystals into larger, functional ones [].
Protein Domain
Name: Carbonic anhydrase, prokaryotic-like, conserved site
Type: Conserved_site
Description: Carbonic anhydrases (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate. In photosynthetic bacteria and plant chloroplasts, CA is essential to inorganic carbon fixation. Prokaryotic and plant chloroplast CAs are structurally and evolutionarily related and form a family distinct from the one which groups the many different forms of eukaryotic CAs. Hypothetical proteins yadF from Escherichia coli and HI1301 from Haemophilus influenzae also belong to this family.
Protein Domain
Name: Carbonic anhydrase
Type: Family
Description: Carbonic anhydrases (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate. In photosynthetic bacteria and plant chloroplasts, CA is essential to inorganic carbon fixation. Prokaryotic and plant chloroplast CAs are structurally and evolutionarily related and form a family distinct from the one which groups the many different forms of eukaryotic CAs. This family also includes the carbonyl sulfide hydrolase from Thiobacillus thioparus, which is responsible for the degradation of carbonyl sulfide to hydrogen sulfide and CO2, the second step of SCN(-) assimilation, and a carbon disulfide hydrolase from the acidothermophilic archaeon Acidianus, which has a typical carbonic anhydrase fold and active site but does not use CO2 as a substrate.
Protein Domain
Name: Protein of unknown function DUF1645, plant
Type: Family
Description: These sequences are derived from a number of hypothetical plant proteins. The region in question is approximately 270 amino acids long. Some members of this family are annotated as yeast pheromone receptor proteins AR781 but no literature was found to support this.
Protein Domain
Name: 3-beta hydroxysteroid dehydrogenase/isomerase
Type: Domain
Description: The enzyme 3 beta-hydroxysteroid dehydrogenase/5-ene-4-ene isomerase (3 beta-HSD) catalyses the oxidation and isomerisation of 5-ene-3 beta-hydroxypregnene and 5-ene-hydroxyandrostene steroid precursors into the corresponding 4-ene-ketosteroids necessary for the formation of all classes of steroid hormones.
Protein Domain
Name: DnaJ domain
Type: Domain
Description: The hsp70 chaperone machine performs many diverse roles in the cell, including folding of nascent proteins, translocation of polypeptides across organelle membranes, coordinating responses to stress, and targeting selected proteins for degradation. DnaJ is a member of the hsp40 family of molecular chaperones, which is also called the J-protein family, the members of which regulate the activity of hsp70s. DnaJ (hsp40) binds to dnaK (hsp70) and stimulates its ATPase activity, generating the ADP-bound state of dnaK, which interacts stably with the polypeptide substrate. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues. Such a structure is shown in the following schematic representation:
+------------+-+-------+-----+-----------+--------------------------------+
| J-domain   | | Gly-R |     | CXXCXGXG  | C-terminal                     |
+------------+-+-------+-----+-----------+--------------------------------+
The structure of the J-domain has been solved. The J domain consists of four helices, the second of which has a charged surface that includes basic residues that are essential for interaction with the ATPase domain of hsp70. J-domains are found in many prokaryotic and eukaryotic proteins. In yeast, three J-like proteins have been identified containing regions closely resembling a J-domain but lacking the conserved HPD motif; these proteins do not appear to act as molecular chaperones.
Protein Domain
Name: Metallo-beta-lactamase
Type: Domain
Description: Metallo-beta-lactamases exhibit low sequence identity between enzymes but are structurally similar. They have a characteristic α-β/β-α sandwich fold in which the active site is at the interface between domains. Apart from the beta-lactamases and metallo-beta-lactamases, a number of other proteins contain this domain and share the same fold type. These proteins include thiolesterases (members of the glyoxalase II family) that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and D-lactic acid, and a competence protein that is essential for natural transformation in Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the competence protein, these proteins bind two zinc ions per molecule as cofactor.
Protein Domain
Name: PPPDE peptidase domain
Type: Domain
Description: The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes) consists of thiol peptidases with a circularly permuted papain-like fold. They contain a PPPDE domain, a cysteine isopeptidase that exhibits deSUMOylase activity in PPPDE2 (DeSI-1) and deubiquitinating activity in PPPDE1 (DeSI-2). The domain is described as a mixed α/β-fold composed of six β-strands and six α-helices. The catalytic dyad is formed by a conserved N-terminal histidine residue on the β2-strand and a conserved C-terminal cysteine residue on the following α3-helix (the H-C configuration). This catalytic dyad is invariably conserved in the PPPDE family of proteins.
Protein Domain
Name: Terpene synthase, N-terminal domain
Type: Domain
Description: Sequences containing this domain belong to the terpene synthase family. It has been suggested that this gene family be designated tps (for terpene synthase). Sequence comparisons reveal similarities between the monoterpene (C10) synthases, sesquiterpene (C15) synthases and the diterpene (C20) synthases. The family has been split into six subgroups on the basis of phylogeny, called Tpsa-Tpsf. Tpsa includes vetispiradiene synthase, 5-epi-aristolochene synthase, and (+)-delta-cadinene synthase. Tpsb includes (-)-limonene synthase. Tpsc includes copalyl diphosphate synthase (kaurene synthase A). Tpsd includes taxadiene synthase, pinene synthase, and myrcene synthase. Tpse includes ent-kaurene synthase B. Tpsf includes linalool synthase. In the fungus Phaeosphaeria sp. (strain L487) the synthesis of ent-kaurene from geranylgeranyl diphosphate is promoted by a single bifunctional protein.
Protein Domain
Name: Lateral organ boundaries, LOB
Type: Domain
Description: The lateral organ boundaries (LOB) gene is expressed at the adaxial base of initiating lateral organs and encodes a plant-specific protein of unknown function. The N-terminal half of the LOB protein contains a conserved domain of approximately 100 amino acids (the LOB domain) that is present in 42 other Arabidopsis thaliana proteins and in proteins from a variety of other plant species. Genes encoding LOB domain (LBD) proteins are expressed in a variety of temporal- and tissue-specific patterns, suggesting that they may function in diverse processes. The LOB domain contains conserved blocks of amino acids that identify the LBD gene family. In particular, a conserved C-x(2)-C-x(6)-C-x(3)-C motif, which is a defining feature of the LOB domain, is present in all LBD proteins. It is possible that this motif forms a new zinc finger.
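The C-x(2)-C-x(6)-C-x(3)-C motif above is written in PROSITE-style notation, where x(n) stands for n residues of any type. As a minimal sketch (not part of the database entry), the pattern can be translated into a regular expression and used to scan a protein sequence. The helper names `prosite_to_regex` and `find_motif` and the toy sequence are assumptions made for illustration, and the converter handles only plain residues and x(n) gaps, not the full PROSITE syntax.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern (single residues and
    x / x(n) gaps only) into a Python regular expression string."""
    parts = []
    for element in pattern.split("-"):
        gap = re.fullmatch(r"x(?:\((\d+)\))?", element)
        if gap:
            # x means one arbitrary residue, x(n) means n arbitrary residues
            count = gap.group(1) or "1"
            parts.append(".{%s}" % count)
        else:
            parts.append(re.escape(element))
    return "".join(parts)

# Motif taken from the description above.
LOB_MOTIF = "C-x(2)-C-x(6)-C-x(3)-C"
LOB_REGEX = re.compile(prosite_to_regex(LOB_MOTIF))

def find_motif(sequence: str):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in LOB_REGEX.finditer(sequence.upper())]

# Hypothetical toy sequence, not a real LBD protein.
toy_sequence = "MASCAACKFLRRKCTKECVF"

if __name__ == "__main__":
    print(prosite_to_regex(LOB_MOTIF))
    print(find_motif(toy_sequence))
```

A hit only shows that the cysteine spacing is present; it does not by itself establish that a protein carries a LOB domain.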
Protein Domain
Name: Equilibrative nucleoside transporter
Type: Family
Description: Nucleosides are hydrophilic molecules and require specialised transport proteins for permeation of cell membranes. There are two types of nucleoside transport processes: equilibrative bidirectional processes driven by chemical gradients and inwardly directed concentrative processes driven by an electrochemical gradient. The two types of nucleoside transporters are classified into two families: the solute carrier (SLC) 29 and SLC28 families, corresponding to equilibrative and concentrative nucleoside transporters, respectively. Equilibrative nucleoside transporters (ENTs) are integral membrane proteins which enable the movement of hydrophilic nucleosides and nucleoside analogues down their concentration gradients across cell membranes. ENT family members have been identified in humans, mice, fish, tunicates, slime molds, and bacteria.
Protein Domain
Name: Cation-transporting P-type ATPase, C-terminal
Type: Domain
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes and function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. P-ATPases (also known as E1-E2 ATPases) (EC 3.6.3.-) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy.
There are many different classes of P-ATPases, which transport specific types of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. This entry represents the conserved C-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain.
Protein Domain
Name: P-type ATPase, subfamily IIB
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes and function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. P-ATPases (also known as E1-E2 ATPases) (EC 3.6.3.-) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy.
There are many different classes of P-ATPases, which transport specific types of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. This family describes the P-type ATPase responsible for translocating calcium ions across the plasma membrane of eukaryotes, out of the cell. In some organisms, this type of pump may also be found in vacuolar membranes. In humans and mice, at least, there are multiple isoforms of the PMCA pump with overlapping but not redundant functions. Accordingly, there are no human diseases linked to PMCA defects, although alterations of PMCA function do elicit physiological effects. The calcium P-type ATPases have been characterised as Type IIB based on a phylogenetic analysis which distinguishes this group from the Type IIA SERCA calcium pump.
Protein Domain
Name: Cation-transporting P-type ATPase, N-terminal
Type: Domain
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes and function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. P-ATPases (also known as E1-E2 ATPases) (EC 3.6.3.-) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy.
There are many different classes of P-ATPases, which transport specific types of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases.
Protein Domain
Name: P-type ATPase, transmembrane domain superfamily
Type: Homologous_superfamily
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts). V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes and function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane. They are also found in bacteria. A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases. P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes. E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. P-ATPases (also known as E1-E2 ATPases) (EC 3.6.3.-) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy.
There are many different classes of P-ATPases, which transport specific types of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. This superfamily represents the ten-transmembrane-helix domain found in P-type ATPases.
Protein Domain
Name: Focadhesin/RST, DUF3730
Type: Domain
Description: This domain of unknown function is found in Focadhesin from animals and RST1 (RESURRECTION 1) from plants. Focadhesin (FOCAD) is a focal adhesion protein with potential tumour suppressor function in gliomas. RST1 was originally identified in a genetic screen for factors involved in the biosynthesis of epicuticular waxes. Later, RST1 and RST1 INTERACTING PROTEIN (RIPR) were shown to act as cofactors of the cytoplasmic exosome and the Ski complex in plants. RST1 is involved in the suppression of siRNA-mediated silencing of transgenes and certain endogenous transcripts.
Protein Domain
Name: UBA-like domain DUF1421
Type: Domain
Description: This domain represents a conserved region that has a UBA-like fold. It is found in a number of plant proteins of unknown function.
Protein Domain
Name: Zinc-finger domain of monoamine-oxidase A repressor R1
Type: Domain
Description: R1 is a transcription factor repressor that inhibits monoamine oxidase A gene expression. This domain is a four-CXXC zinc finger putative DNA-binding domain found at the C-terminal end of R1. The domain carries 12 cysteines, of which four pairs are of the CXXC type.
Protein Domain
Name: DDT domain
Type: Domain
Description: The DDT domain is named after the better characterised DNA-binding homeobox-containing proteins and the Different Transcription and chromatin remodelling factors in which it is found. It is a domain of about 60 amino acids which is exclusively associated with nuclear domains like AT-Hook, PHD finger, methyl-CpG-binding domain, bromodomain and DNA-binding homeodomain. The DDT domain is characterised by a number of conserved aromatic and charged residues and is predicted to consist of three alpha helices. A DNA-binding function for the DDT domain has been proposed.
Protein Domain
Name: Glycosyl transferase, family 14
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence-based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. This is the glycosyltransferase family 14, a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. I-branching enzyme, an integral membrane protein, converts linear into branched poly-N-acetyllactosaminoglycans in the glycosylation pathway, and is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme, also an integral membrane protein, forms crucial side-chain branches in O-glycans in the glycosylation pathway.
Protein Domain
Name: Glycosyl hydrolases 36
Type: Family
Description: This family consists of several galactinol-sucrose galactosyltransferase proteins, also known as raffinose synthases. Raffinose is a widespread oligosaccharide in plant seeds and other tissues, and raffinose synthase is the key enzyme that channels sucrose into the raffinose oligosaccharide pathway. Raffinose family oligosaccharides (RFOs) are ubiquitous in plant seeds and are thought to play critical roles in the acquisition of tolerance to desiccation and seed longevity. Raffinose synthases are alkaline alpha-galactosidases and are solely responsible for RFO breakdown in germinating maize seeds, whereas acidic galactosidases appear to have other functions. Glycoside hydrolase family 36 can be split into 11 families, GH36A to GH36K. This family includes enzymes from GH36C.
Protein Domain
Name: O-acyltransferase WSD1, C-terminal
Type: Domain
Description: This entry represents the C terminus (approximately 170 residues) of a number of hypothetical plant proteins. O-acyltransferase WSD1 is a bifunctional wax ester synthase/diacylglycerol acyltransferase, which is involved in cuticular wax biosynthesis.
Protein Domain
Name: O-acyltransferase, WSD1-like, N-terminal
Type: Domain
Description: This entry represents the N-terminal catalytic domain of a number of wax ester synthase/diacylglycerol acyltransferases (WS/DGATs), predominantly from bacteria and plants. They catalyse the condensation of a fatty alcohol and a fatty acyl-Coenzyme A (acyl-CoA), and they can also catalyse the transesterification of acyl-CoAs with diacylglycerols. These bifunctional enzymes share low overall sequence similarity, but all WS/DGATs share a conserved HHXXXDG motif, which is also found in other acyltransferases. This domain contains the conserved HHXXXDG motif and consists of a mixed β-sheet flanked by four α-helices and a small antiparallel β-sheet.
Protein Domain
Name: UPF3 domain
Type: Domain
Description: Nonsense-mediated mRNA decay (NMD) is a surveillance mechanism by which eukaryotic cells detect and degrade transcripts containing premature termination codons. Three 'up-frameshift' proteins, UPF1, UPF2 and UPF3, are essential for this process in organisms ranging from yeast and humans to plants. Exon junction complexes (EJCs) are deposited ~24 nucleotides upstream of exon-exon junctions after splicing. Translation causes displacement of the EJCs; however, premature translation termination upstream of one or more EJCs triggers the recruitment of UPF1, UPF2 and UPF3 and activates the NMD pathway. This entry contains UPF3. The crystal structure of the complex between human UPF2 and UPF3b, which are, respectively, a MIF4G (middle portion of eIF4G) domain and an RNP domain (ribonucleoprotein-type RNA-binding domain), has been determined to 1.95 Å. The protein-protein interface is mediated by highly conserved charged residues in UPF2 and UPF3b and involves the β-sheet surface of the UPF3b ribonucleoprotein (RNP) domain, which is generally used by these domains to bind nucleic acids. In UPF3b the RNP domain does not bind RNA, whereas the UPF2 construct and the complex do. It is clear that some RNP domains have evolved for specific protein-protein interactions rather than as nucleic acid binding modules.
Protein Domain
Name: Protein of unknown function DUF2854
Type: Family
Description: This family of proteins has no known function.
Protein Domain
Name: Domain of unknown function DUF4005
Type: Domain
Description: This domain is found towards the C terminus of a number of plant IQ domain-containing proteins. These proteins may be involved in cooperative interactions with calmodulins or calmodulin-like proteins, and may associate with nucleic acids and regulate gene expression at the transcriptional or post-transcriptional level.
Protein Domain
Name: Iron/zinc purple acid phosphatase-like C-terminal domain
Type: Domain
Description: This domain is found at the C terminus of purple acid phosphatase proteins.
Protein Domain
Name: Isocitrate/isopropylmalate dehydrogenase, conserved site
Type: Conserved_site
Description: Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent on either NAD+ or NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of a NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated. 3-isopropylmalate dehydrogenase (IMDH) catalyses the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase catalyses the reduction of tartrate to oxaloglycolate. These enzymes are evolutionarily related. The signature pattern of this entry is located in a conserved region, which contains a glycine-rich stretch of residues located in the C-terminal section.
Protein Domain
Name: Isocitrate dehydrogenase NADP-dependent
Type: Family
Description: Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent on either NAD+ or NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of a NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated. This family defines the eukaryotic NADP-dependent isocitrate dehydrogenases and includes cytosolic, mitochondrial, and chloroplast enzymes, as well as bacterial proteins. This family differs considerably from other isocitrate dehydrogenases that are included in a different group together with 3-isopropylmalate dehydrogenases and tartrate dehydrogenases.
Protein Domain
Name: Isopropylmalate dehydrogenase-like domain
Type: Domain
Description: The isocitrate and isopropylmalate dehydrogenases family includes isocitrate dehydrogenase (IDH), 3-isopropylmalate dehydrogenase (IMDH) and tartrate dehydrogenase. IDH is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent on either NAD+ or NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli, the activity of a NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated. IMDH catalyses the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase shows strong homology to prokaryotic isopropylmalate dehydrogenases and, to a lesser extent, isocitrate dehydrogenase. It catalyses the reduction of tartrate to oxaloglycolate. This entry represents a structural domain found in all types of isocitrate dehydrogenase, and in isopropylmalate dehydrogenase and tartrate dehydrogenase. The crystal structure of Escherichia coli isopropylmalate dehydrogenase has been described.
Protein Domain
Name: Peptidase C19, ubiquitin carboxyl-terminal hydrolase
Type: Domain
Description: Ubiquitin carboxyl-terminal hydrolases (UCH) are thiol proteases that recognise and hydrolyse the peptide bond at the C-terminal glycine of ubiquitin. These enzymes are involved in the processing of poly-ubiquitin precursors as well as that of ubiquitinated proteins. The deubiquitinating proteases can be split into two size ranges, 20-30 kDa and 100-200 kDa: the second class consists of large proteins (800 to 2000 residues) that belong to the peptidase family C19, and this group is currently represented by yeast UBP1. UCH thiol proteases contain an N-terminal catalytic domain sometimes followed by C-terminal extensions that mediate protein-protein interactions. This entry represents the catalytic domain of UCH proteins of the UBP1 group. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole.
Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys.
This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Ubiquitin specific protease, conserved site
Type: Conserved_site
Description: Protein ubiquitination is a reversible posttranslational modification, which affects a large number of cellular processes including protein degradation, trafficking, cell signaling and the DNA damage response. Ubiquitination is reversible, and dedicated deubiquitinases exist which hydrolyze isopeptide bonds. Ubiquitin specific proteases (USPs) are the largest family of deubiquitinating enzymes. USP domains consist of a common conserved catalytic core which is interspersed at five points with insertions, some of which are as large as the catalytic domain itself. The insertions can fold into independent domains that can be involved in the regulation of deubiquitinase activity. As commonly found in signaling proteins, many USP deubiquitinases have a modular architecture, and not only contain a catalytic domain but also additional protein-protein interaction and localization domains. Most USP domains cleave the isopeptide linkage between two ubiquitin molecules, and hence contain (at least) two ubiquitin-binding sites, one for the distal ubiquitin, the C terminus of which is linked to the Lys residue on the proximal ubiquitin in a second, proximal binding site. The USP domain forms the peptidase family C19. The USP catalytic core can be divided into six conserved boxes that are present in all USP domains. Box 1 contains the catalytic Cys residue, box 5 contains the catalytic His, and box 6 contains the catalytic Asp/Asn residue. All boxes show several additional conserved features and residues. Boxes 3 and 4 contain a Cys-X-X-Cys motif each, which have been shown to constitute a functional zinc-binding motif. Potentially, zinc-binding facilitates folding of the USP core, helping the interaction of sequence motifs some few hundred residues apart. USP domains share a common conserved fold. The USP domain resembles an open hand containing Thumb, Palm and Fingers subdomains.
The catalytic triad resides between the Thumb (Cys) and Palm subdomains (His/Asp). This entry represents two conserved sites for the USP domain. The first one is around the catalytic cysteine in box 1, and the second around the catalytic histidine in box 5.
Protein Domain
Name: Ubiquitin specific protease domain
Type: Domain
Description: Protein ubiquitination is a reversible posttranslational modification, which affects a large number of cellular processes including protein degradation, trafficking, cell signaling and the DNA damage response. Ubiquitination is reversible, and dedicated deubiquitinases exist which hydrolyze isopeptide bonds. Ubiquitin specific proteases (USPs) are the largest family of deubiquitinating enzymes. USP domains consist of a common conserved catalytic core which is interspersed at five points with insertions, some of which are as large as the catalytic domain itself. The insertions can fold into independent domains that can be involved in the regulation of deubiquitinase activity. As commonly found in signaling proteins, many USP deubiquitinases have a modular architecture, and not only contain a catalytic domain but also additional protein-protein interaction and localisation domains. Most USP domains cleave the isopeptide linkage between two ubiquitin molecules, and hence contain (at least) two ubiquitin-binding sites, one for the distal ubiquitin, the C terminus of which is linked to the Lys residue on the proximal ubiquitin in a second, proximal binding site. The USP domain forms the peptidase family C19. The USP catalytic core can be divided into six conserved boxes that are present in all USP domains. Box 1 contains the catalytic Cys residue, box 5 contains the catalytic His, and box 6 contains the catalytic Asp/Asn residue. All boxes show several additional conserved features and residues. Boxes 3 and 4 contain a Cys-X-X-Cys motif each, which have been shown to constitute a functional zinc-binding motif. Potentially, zinc-binding facilitates folding of the USP core, helping the interaction of sequence motifs some few hundred residues apart. USP domains share a common conserved fold. The USP domain resembles an open hand containing Thumb, Palm and Fingers subdomains.
The catalytic triad resides between the Thumb (Cys) and Palm subdomains (His/Asp). This entry represents the entire USP domain.
Protein Domain
Name: Sirohydrochlorin cobaltochelatase CbiX-like
Type: Family
Description: This entry represents sirohydrochlorin cobaltochelatase (also known as CbiX), which catalyses the insertion of Co2+ into sirohydrochlorin as part of the anaerobic pathway to cobalamin biosynthesis. The structure of CbiX from Archaeoglobus fulgidus consists of a central mixed β-sheet flanked by four α-helices, although it is about half the size of other Class II tetrapyrrole chelatases. The CbiX proteins found in archaea appear to be shorter than those found in eubacteria.
Protein Domain
Name: Pre-mRNA cleavage complex subunit Clp1, C-terminal
Type: Domain
Description: The yeast Clp1 is a subunit of cleavage factor IA (CF IA) and is involved in mRNA cleavage and polyadenylation. Clp1 also mediates interactions between CF IA and another complex of the yeast mRNA cleavage and polyadenylation machinery, the Cleavage-Polyadenylation Factor (CPF). It seems that human Clp1 and yeast Clp1 are not functional orthologues. Human Clp1 and its archaeal homologue, but not yeast Clp1, are 5'-OH polynucleotide kinases. In humans, Clp1 functions as an RNA kinase important in tRNA splicing, and is also implicated in mRNA and siRNA maturation. This entry represents the C-terminal domain of Clp1.
Protein Domain
Name: Histidine-tRNA ligase/ATP phosphoribosyltransferase regulatory subunit
Type: Family
Description: This entry represents histidine-tRNA ligase (HisS or HisRS) and its paralogue, ATP phosphoribosyltransferase regulatory subunit (HisZ). Despite their significant sequence and structural similarity, HisRS and HisZ have different functions. HisRS is a class IIa aminoacyl-tRNA synthetase (ligase), while HisZ is a regulatory subunit of the hetero-octameric ATP phosphoribosyltransferase that regulates the reactions initiating histidine biosynthesis. In phylogenetic analysis, HisZ proteins form a monophyletic group that attaches outside the predominant bacterial HisRS clade. HisZ proteins are represented in a highly divergent set of bacteria (including an aquificale, cyanobacteria, firmicutes, and proteobacteria), but are missing from other bacteria, including mycobacteria and certain proteobacteria. It has been suggested that the absence of HisZ from these bacteria is due to its loss during evolution.
Protein Domain
Name: Kelch repeat type 1
Type: Repeat
Description: Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one β-sheet blade, and several of these repeats can associate to form a β-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein (also known as ring canal kelch protein), creating a 6-bladed β-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded antiparallel β-sheet motif that forms the repeat unit in a super-barrel structural fold. The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase. This entry represents a type of kelch sequence motif that comprises one β-sheet blade.
Protein Domain
Name: Ribosomal protein S7e
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of Xenopus S8, and mammalian, insect and yeast S7. These proteins have about 200 amino acids.
Protein Domain
Name: Mannose-6-phosphate isomerase, type I
Type: Family
Description: Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined. Type I includes eukaryotic PMI and the enzyme encoded by the manA gene in enterobacteria. PMI has a bound zinc ion, which is essential for activity. A crystal structure of PMI from Candida albicans shows that the enzyme has three distinct domains. The active site lies in the central domain, contains a single essential zinc atom, and forms a deep, open cavity of suitable dimensions to contain M6P or F6P. The central domain is flanked by a helical domain on one side and a jelly-roll like domain on the other.
Protein Domain
Name: Phosphomannose isomerase, type I, conserved site
Type: Conserved_site
Description: Phosphomannose isomerase (PMI) is the enzyme that catalyzes the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes, it is involved in the synthesis of GDP-mannose which is a constituent of N- and O-linked glycans as well as GPI anchors. In prokaryotes, it is involved in a variety of pathways including capsular polysaccharide biosynthesis and D-mannose metabolism. Three classes of PMI have been defined on the basis of sequence similarities. The first class comprises all known eukaryotic PMI as well as the enzyme encoded by the manA gene in enterobacteria such as Escherichia coli. Class I PMIs are proteins of about 42 to 50 kDa which bind a zinc ion essential for their activity. Two conserved regions define class I PMI. The first one is located in the N-terminal section of the proteins, the second in the C-terminal half. Both patterns contain a residue involved in the binding of the zinc ion.
Protein Domain
Name: Mannose-6-phosphate isomerase
Type: Family
Description: Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined. This group represents a mannose-6-phosphate isomerase.
Protein Domain
Name: Ribulose-1,5-bisphosphate carboxylase small subunit, N-terminal
Type: Domain
Description: This domain is found in the N-terminal region of the small subunit of ribulose-1,5-bisphosphate carboxylase in plants. It contains a conserved APF sequence motif. There are also two completely conserved residues (L and P) that may be functionally important.
Protein Domain
Name: Ribulose bisphosphate carboxylase small subunit, domain
Type: Domain
Description: RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP), thus fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. The small subunits induce conformational changes in the large subunits, enhancing their catalytic rate. Studies in Oryza sativa demonstrate that the availability of the small subunit upregulates the transcript levels of the large subunit. While the large subunit is coded for by a single gene, the small subunit is coded for by several different genes, which are expressed in a tissue-specific manner. They are transcriptionally regulated by the light receptor phytochrome, which results in RuBisCO being more abundant during the day when it is required. The RuBisCO small subunit consists of a central four-stranded β-sheet, with two helices packed against it.
Protein Domain
Name: Ribulose bisphosphate carboxylase, small subunit
Type: Family
Description: RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP), thus fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. The small subunits induce conformational changes in the large subunits, enhancing their catalytic rate. Studies in Oryza sativa demonstrate that the availability of the small subunit upregulates the transcript levels of the large subunit. While the large subunit is coded for by a single gene, the small subunit is coded for by several different genes, which are expressed in a tissue-specific manner. They are transcriptionally regulated by the light receptor phytochrome, which results in RuBisCO being more abundant during the day when it is required.
Protein Domain
Name: Tim44-like domain
Type: Domain
Description: Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. This entry represents the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members is unknown, but transport seems likely. The crystal structure of the C terminus of Tim44 has revealed a large hydrophobic pocket which might play an important role in interacting with the acyl chains of lipid molecules in the mitochondrial membrane.
Protein Domain
Name: Galactose-binding-like domain superfamily
Type: Homologous_superfamily
Description: Proteins containing a galactose-binding-like domain fold can be found in several different protein families, in both eukaryotes and prokaryotes. The common function of these domains is to bind to specific ligands, such as cell-surface-attached carbohydrate substrates for galactose oxidase and sialidase, phospholipids on the outer side of the mammalian cell membrane for coagulation factor Va, membrane-anchored ephrin for the Eph family of receptor tyrosine kinases, and a complex of broken single-stranded DNA and DNA polymerase beta for XRCC1. The structure of the galactose-binding-like domain members consists of a β-sandwich, in which the strands making up the sheets exhibit a jelly roll fold. There is a high degree of similarity in the β-sandwich and in the loops between different family members, despite an often low level of sequence similarity.
Protein Domain
Name: SUN domain
Type: Domain
Description: Sad1/UNC-84 (SUN)-domain proteins are inner nuclear membrane (INM) proteins that are part of bridging complexes linking cytoskeletal elements with the nucleoskeleton. Originally identified based on an ~150-amino acid region of homology between the C terminus of the Schizosaccharomyces pombe Sad1 protein and the Caenorhabditis elegans UNC-84 protein, SUN proteins are present in the proteomes of most eukaryotes. In addition to the SUN domain, these proteins contain a transmembrane sequence and at least one coiled-coil domain and localise to the inner nuclear envelope. SUN proteins are anchored in the inner nuclear envelope by their transmembrane segment and oriented in the membrane such that the C-terminal SUN domain is located in the space between the inner and outer nuclear membrane. Here, the SUN domain can interact with the C-terminal tail of an outer nuclear envelope protein that binds to the cytoskeleton, including the centrosome. Some proteins known to contain a SUN domain are listed below:
  • Fission yeast spindle pole body-associated protein Sad1.
  • Yeast spindle pole body assembly component MPS3, essential for nuclear division and fusion.
  • Yeast uncharacterised protein SLP1.
  • Caenorhabditis nuclear migration and anchoring protein UNC-84.
  • Caenorhabditis SUN domain-containing protein 1 (sun-1), involved in centrosome attachment to the nucleus.
  • Mammalian sperm-associated antigen 4 protein (SPAG4), which may assist the organisation and assembly of outer dense fibres (ODFs), a specific structure of the sperm tail.
  • Mammalian sperm-associated antigen 4-like protein (SPAG4L).
  • Mammalian SUN1.
  • Mammalian SUN2.
  • Mammalian SUN3.
  • Klaroid protein from Drosophila melanogaster.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom