Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported, e.g. dros* for partial matches or fly AND NOT embryo to exclude a term.
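
The query forms listed above can be assembled programmatically. The following is a minimal sketch that only builds query strings; the helper names (quote_phrase, any_of, excluding, prefix) are hypothetical and are not part of this website or of any library, and nothing here contacts the search service.

```python
def quote_phrase(term):
    """Wrap a multi-word term in quotation marks so it is searched as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def any_of(*terms):
    """Join terms with OR, e.g. fly OR drosophila."""
    return " OR ".join(quote_phrase(t) for t in terms)


def excluding(wanted, unwanted):
    """Build an exclusion query, e.g. fly AND NOT embryo."""
    return f"{quote_phrase(wanted)} AND NOT {quote_phrase(unwanted)}"


def prefix(stem):
    """Build a partial-match query, e.g. dros*."""
    return f"{stem.strip()}*"


if __name__ == "__main__":
    print(any_of("fly", "drosophila"))
    print(quote_phrase("dna binding"))
    print(excluding("fly", "embryo"))
    print(prefix("dros"))
```

The resulting strings are meant to be pasted into the search box as they are.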

Search results 1110901 to 1111000 out of 1112510 for seed protein

Type Details Score
Protein Domain
Name: Transcription factor TFIIE beta subunit, DNA-binding domain
Type: Domain
Description: Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit and the small beta subunit. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE. The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface is located on the opposite side to the previously reported winged helix motif, forming a positively charged furrow. This entry represents the central core DNA-binding domain of the TFIIE beta subunit. The Transcription Factor IIE (TFIIE) beta winged-helix (or forkhead) domain is located at the central core region of TFIIE beta. The winged helix is a form of helix-turn-helix (HTH) domain which typically binds DNA with the 3rd helix. The winged-helix domain is distinguished by the presence of a C-terminal β-strand hairpin unit (the wing) that packs against the cleft of the tri-helical core. Although most winged-helix domains are multi-member families, the TFIIE beta winged-helix domain is typically found as a single orthologous group.
Protein Domain
Name: Glycosyl transferase, family 20
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence-based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 20 comprises enzymes with only one known activity: alpha,alpha-trehalose-phosphate synthase [UDP-forming]. Synthesis of trehalose in the yeast Saccharomyces cerevisiae is catalysed by the trehalose-6-phosphate (Tre6P) synthase/phosphatase complex, which is composed of at least three different subunits encoded by the genes TPS1, TPS2, and TSL1. Tps1 and Tps2 carry the catalytic activities of trehalose synthesis, namely Tre6P synthase (Tps1) and Tre6P phosphatase (Tps2), while Tsl1 has regulatory functions. There is some evidence that Tsl1 and Tps3 may share a common function with respect to regulation and/or structural stabilisation of the Tre6P synthase/phosphatase complex in exponentially growing, heat-shocked cells. OtsA (trehalose-6-phosphate synthase) from Escherichia coli has homology to the full-length TPS1, the N-terminal part of TPS2 and an internal region of TPS3 (TSL1) of yeast.
Protein Domain
Name: DNA topoisomerase VI, subunit B, transducer
Type: Domain
Description: DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils. This entry represents subunit B of topoisomerase VI, an ATP-dependent type IIB enzyme. Members of this family adopt a structure consisting of a four-stranded β-sheet backed by three α-helices, the last of which is over 50 amino acids long and extends from the body of the protein by several turns. This domain has been proposed to mediate intersubunit communication by structurally transducing signals from the ATP binding and hydrolysis domains to the DNA binding and cleavage domains of the gyrase holoenzyme.
Protein Domain
Name: Class I peroxidase
Type: Family
Description: Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. They are found in bacteria, fungi, plants and animals. On the basis of sequence similarity, fungal, plant and bacterial peroxidases can be viewed as members of a superfamily consisting of 3 major classes. Class I, the intracellular peroxidases, includes yeast cytochrome c peroxidase (CCP), ascorbate peroxidase (AP) and bacterial catalase-peroxidases. In chloroplasts of higher plants, oxygen consumption in the absence of electron acceptors is accompanied by production of H2O2 and activated forms of oxygen. Chloroplasts contain several protective systems (such as superoxide dismutase (SOD), alpha-tocopherol and carotenoids), which are effective against various forms of activated oxygen. However, they lack catalase, and the disposal of H2O2 is accomplished by other means. Ascorbic acid is a strong antioxidant that is effective in scavenging superoxide (O2•−) and hydroxyl (OH•) radicals and singlet oxygen. It can also remove H2O2 in the following reaction: Ascorbate + H2O2 → dehydroascorbate + 2 H2O. Ascorbate peroxidase (AP) is the main enzyme responsible for hydrogen peroxide removal in the chloroplasts and cytosol of higher plants. The 3D structure of pea cytosolic ascorbate peroxidase has an overall fold virtually identical to that of CCP. The protein consists of 2 all-alpha domains, between which is embedded the haem group. The most pronounced difference between the AP and CCP structures is the absence of an antiparallel β-hairpin between the G and H helices in the AP molecule.
Protein Domain
Name: Alpha-amylase-like
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Alpha-amylase is classified as family 13 of the glycosyl hydrolases and is present in archaea, bacteria, plants and animals. Alpha-amylase is an essential enzyme in alpha-glucan metabolism, acting to catalyse the hydrolysis of alpha-1,4-glucosidic bonds of glycogen, starch and related polysaccharides. Although all alpha-amylases possess the same catalytic function, they can vary with respect to sequence. In general, they are composed of three domains: a TIM barrel containing the active site residues and chloride ion-binding site (domain A), a long loop region inserted between the third beta strand and the α-helix of domain A that contains calcium-binding site(s) (domain B), and a C-terminal β-sheet domain that appears to show some variability in sequence and length between amylases (domain C). Amylases have at least one conserved calcium-binding site, as calcium is essential for the stability of the enzyme. Chloride binding activates the enzyme, which acts by a two-step mechanism involving a catalytic nucleophile base (usually an Asp) and a catalytic proton donor (usually a Glu) that are responsible for the formation of the beta-linked glycosyl-enzyme intermediate. This entry includes alpha-amylases and related proteins.
Protein Domain
Name: Pyruvate-flavodoxin oxidoreductase
Type: Family
Description: The oxidative decarboxylation of pyruvate to acetyl-CoA, a central step in energy metabolism, can occur by two different mechanisms. In mitochondria and aerobic bacteria this reaction is catalysed by the multienzyme complex pyruvate dehydrogenase using NAD as electron acceptor. In anaerobic organisms, however, this reaction is reversibly catalysed by a single enzyme using either ferredoxin or flavodoxin as the electron acceptor. Pyruvate:ferredoxin/flavodoxin reductases (PFORs) in this entry occur in both obligately and facultatively anaerobic bacteria and also some eukaryotic microorganisms. These proteins are single-chain enzymes containing a thiamin pyrophosphate cofactor for the cleavage of carbon-carbon bonds next to a carbonyl group, and iron-sulphur clusters for electron transfer. The Desulfovibrio africanus enzyme is currently the only PFOR whose three-dimensional structure is known. It is a homodimer where each subunit contains one thiamin pyrophosphate cofactor, two ferredoxin-like 4Fe-4S clusters and an atypical 4Fe-4S cluster. Each monomer is composed of seven domains: domains I, II and VI make intersubunit contacts, while domains III, IV and V are located at the surface of the dimer, and domain VII forms a long arm extending over the other subunit. The cofactor is bound at the interface of domains I and VI and is proximal to the atypical 4Fe-4S cluster bound by domain VI, while the ferredoxin-like 4Fe-4S clusters are bound by domain V. Comparison of this enzyme with the multi-chain PFORs shows a correspondence between the domains in this enzyme and the subunits of the multi-chain enzymes.
Protein Domain
Name: Streptothricin acetyltransferase
Type: Family
Description: A small number of bacterial pathogens are implicated in urinary tract infections (UTIs), amongst the most frequent infections in the developed world. The commonest bacterium isolated from UTIs is Escherichia coli, with streptococcal and staphylococcal species coming a close second. Virulent microbes that colonise the human urinary tract usually possess sets of virulence factors specific to the host environment. The most common are adhesins, molecules that allow an infection to become established; well-characterised E. coli type I pili are a good example. Aside from adhesins, other UTI-specific virulence moieties include toxins, such as Cnf1 and haemolysin, and host biocides that act against other microbes competing for the same niche. Streptothricin, an antibiotic synthesised and secreted by some Gram-negative pathogens, is an example of the latter; the antibiotic also has a toxic effect on host cells. The biocide is synthesised in a five-step process in the bacterial cytoplasm, and secreted to the cell exterior via the general secretory pathway. The last step in the synthesis process is the acetyl co-enzyme A-dependent acetylation of the streptothricin molecule to the mature antibiotic. This is catalysed by the streptothricin acetyltransferase protein, located adjacent to the inner face of the cytoplasmic membrane. Homologues of the original gene found in Streptomyces spp. have been found in Bacillus subtilis and Staphylococcus spp., as well as E. coli. More recently, the streptothricin biosynthesis enzymes were shown to be related to those that carry out non-ribosomal peptide bond formation.
Protein Domain
Name: 3-isopropylmalate dehydratase, large subunit, bacteria
Type: Family
Description: 3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of the enzyme are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydrogenases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), in several archaea and in one thermophilic species of bacteria, Thermus thermophilus. It is also found in the higher plant Arabidopsis thaliana, where it is targeted to the chloroplast. This entry represents the large subunit of 3-isopropylmalate dehydratase (LeuC) from prokaryotes. Homoaconitase, aconitase and 3-isopropylmalate dehydratase have similar overall structures and domain organisation. All are dehydratases that bind a [4Fe-4S] cluster.
Protein Domain
Name: Tumor necrosis factor receptor 13C/17
Type: Family
Description: The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain, a region that contains many cysteine residues arranged in a specific repetitive pattern. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease. This entry includes the TNF receptors 13C and 17. TNFR 17 acts as a receptor for both a proliferation-inducing ligand (APRIL) and B cell-activating factor (BAFF, also called BLyS or TALL-I). It is preferentially expressed by mature B-cells, suggesting that it is involved in cell survival and proliferation. It has been demonstrated that it acts through the activation of NF-kappa-B and JNK pathways. TNFR 13C is a B-cell receptor specific for BAFF and it promotes the survival and response of mature B-cells.
Protein Domain
Name: Mu2, C-terminal domain
Type: Domain
Description: This entry represents the C-terminal domain of the medium mu2 subunit of the heterotetrameric clathrin-associated adaptor protein complex 2 (AP-2). Mu2 is ubiquitously expressed in mammals. In higher eukaryotes, AP-2 plays a critical role in clathrin-mediated endocytosis from the plasma membrane in different cells. The membrane-anchored cargo molecules can be linked to the outer lattice of clathrin-coated vesicles (CCVs) by AP-2. Those cargo molecules interact with adaptors through short sorting signals in their cytosolic segments. Tyrosine-based endocytotic signals are one of the most important sorting signals. They are of the form Y-X-X-Phi, where Y is tyrosine, X is any amino acid and Phi is a bulky hydrophobic residue that can be Leu, Ile, Met, Phe, or Val. These kinds of sorting signals can be recognised by the C-terminal domain of the AP-2 mu2 subunit, also known as the Y-X-X-Phi signal-binding domain, which contains two hydrophobic pockets, one for tyrosine binding and one for binding the bulky hydrophobic residue. Since the Y-X-X-Phi binding site is buried in the core structure of AP-2, a phosphorylation-induced conformational change is required when the cargo molecules bind to AP-2. In addition, the C-terminal domain of the mu2 subunit has been shown to bind other molecules. For instance, it can bind phosphoinositides, in particular PtdIns4,5P2, which might be involved in the recognition process of the tyrosine-based signals. It can also interact with synaptotagmins, a family of important modulators of calcium-dependent neurosecretion within the synaptic vesicle (SV) membrane.
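
The Y-X-X-Phi pattern described above maps directly onto a regular expression over one-letter amino acid codes. The following is a minimal sketch, assuming plain one-letter protein sequences; the function name find_yxxphi is hypothetical, and the X positions are simplified to accept any letter rather than only the 20 standard residues. A sequence match says nothing about whether a site is actually recognised by AP-2.

```python
import re

# Y = tyrosine, X = any residue (simplified here to any letter),
# Phi = bulky hydrophobic residue: Leu (L), Ile (I), Met (M), Phe (F) or Val (V).
# The lookahead lets overlapping motifs be reported as well.
YXXPHI = re.compile(r"(?=(Y[A-Z]{2}[LIMFV]))")


def find_yxxphi(sequence):
    """Return (0-based start, motif) for every Y-X-X-Phi match in the sequence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in YXXPHI.finditer(seq)]


if __name__ == "__main__":
    for start, motif in find_yxxphi("AAYQRLAA"):
        print(start, motif)
```

Candidate motifs found this way would still need to lie in a cytosolic segment of the cargo protein, as the description notes.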
Protein Domain
Name: Platelet-derived growth factor, N-terminal
Type: Domain
Description: Platelet-derived growth factor (PDGF) is a potent mitogen for cells of mesenchymal origin, including smooth muscle cells and glial cells. In both mouse and human, the PDGF signalling network consists of four ligands, PDGFA-D, and two receptors, PDGFRalpha and PDGFRbeta. All PDGFs function as secreted, disulphide-linked homodimers, but only PDGFA and B can form functional heterodimers. PDGFRs also function as homo- and heterodimers. All known PDGFs have characteristic 'PDGF domains', which include eight conserved cysteines that are involved in inter- and intramolecular bonds. Alternate splicing of the A chain transcript can give rise to two different forms that differ only in their C-terminal extremity. The transforming protein of Woolly monkey sarcoma virus (WMSV) (Simian sarcoma virus), encoded by the v-sis oncogene, is derived from the B chain of PDGF. PDGFs are mitogenic during early developmental stages, driving the proliferation of undifferentiated mesenchyme and some progenitor populations. During later maturation stages, PDGF signalling has been implicated in tissue remodelling and cellular differentiation, and in inductive events involved in patterning and morphogenesis. In addition to driving mesenchymal proliferation, PDGFs have been shown to direct the migration, differentiation and function of a variety of specialised mesenchymal and migratory cell types, both during development and in the adult animal. PDGF is structurally related to a number of other growth factors which also form disulphide-linked homo- or heterodimers. This domain consists of the N-terminal regions of PDGF A and B.
Protein Domain
Name: Nuclear cap-binding complex subunit CBP66
Type: Family
Description: In protozoa of the family Trypanosomatidae, RNA polymerase II (Pol II) generates polycistronic pre-mRNAs which are then processed by trans-splicing and polyadenylation to produce monocistronic mature mRNAs. Trans-splicing transfers the 39-nucleotide (nt)-long capped spliced leader (SL) from the SL RNA to the 5' end of mRNAs. The mRNA cap in these organisms has the unusual feature of containing, in addition to 7-methylguanosine, four modified nucleotides making it by definition a cap 4 structure (m7equation M2AmpAmpCmpm3Um) which appears to be conserved across this family. This highly modified cap is essential for utilisation of the SL RNA during the trans-splicing process, a key event in RNA metabolism. In yeast and human cells, nuclear cap binding complexes (CBCs) consist of two subunits, cap binding proteins 20 and 80 (CBP20 and CBP80), the first being highly conserved from yeast to humans and containing an RNA binding motif. In the Trypanosomatidae family, this complex consists of five subunits: the highly conserved CBP20 subunit, an alpha-importin which imports the complex from the cytoplasm similar to the yeast and human counterparts, and three subunits that appear to be unique to this family of organisms, namely CBP30, CBP66 and CBP110. The CBC complex in trypanosomatids is essential for cell viability. This entry represents the CBP66 subunit of the trypanosome nuclear cap-binding complex, which appears to contain an unusual zinc finger motif (CCCH). CBP66 is part of the complex that recognises this cap.
Protein Domain
Name: 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase beta-1, EF hand motif
Type: Domain
Description: Phosphoinositide-specific phospholipase C (PI-PLC), also known as 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase, plays a role in inositol phospholipid signaling by hydrolysing phosphatidylinositol-4,5-bisphosphate to produce the second messengers inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). These cause the increase of intracellular calcium concentration and the activation of protein kinase C (PKC), respectively. The PLC family in murine or human species is comprised of multiple subtypes. On the basis of their structure, they have been divided into six classes: beta (beta-1, 2, 3 and 4), gamma (gamma-1 and 2), delta (delta-1, 3 and 4), epsilon, zeta, and eta types. PLC-beta-1 is the predominant PLC isoform in the brain. PLC-beta-1 knockout mice exhibit behavioral abnormalities as a result of disrupted cortical development and plasticity. PLC-beta-1 is involved in cell cycle control and in development and fertility. It has an important role in the control of mouse oocyte meiosis. PLC-beta-1 is first exclusively localised to the nucleus and then migrates to the cytoplasm when the oocyte is fully grown, this chronology being crucial for the production of competent oocytes. It regulates neuronal activity in the cerebral cortex and hippocampus, and has been implicated in diverse critical functions related to forebrain diseases such as schizophrenia and epileptic encephalopathy. PLC-beta-1 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs (represented by this entry), a PLC catalytic core, and a single C2 domain.
Protein Domain
Name: Glycoside hydrolase, family 23
Type: Family
Description: O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website. Anser sp. (goose)-type lysozyme (lysozyme G) hydrolyses 1,4-beta-linkages between N-acetyl-D-glucosamine and N-acetylmuramic acid in peptidoglycan hetero-polymers of prokaryote cell walls. Lysozyme G shows preference for N-acetylmuramic acid residues that are substituted with a peptide moiety; it acts only as a glycanohydrolase. The structure of goose egg-white lysozyme (GEWL) with a bound trisaccharide has been refined to 1.6 Å resolution. The trisaccharide occupies analogous sites to the B, C and D subsites of chicken (HEWL) and Bacteriophage T4 (T4L) lysozymes. All of these enzymes display the same characteristic hydrogen bonding pattern between protein and substrate. Glu73 of GEWL corresponds closely to Glu35 of HEWL (Glu11 of T4L), supporting the view that this group is critical to the catalytic mechanism. However, lysozyme G has no obvious counterpart to Asp52 of chicken lysozyme (Asp20 in T4L), suggesting that a second acidic residue is not essential for its catalytic activity, and may not be required for the activity of other lysozymes. The structure of GEWL belongs to the mainly alpha class, its sequence showing no discernible similarity to other lysozymes. The enzyme has been classified as belonging to family 23 of glycosyl hydrolases.
Protein Domain
Name: Aldo-keto reductase family 2E
Type: Family
Description: This entry represents aldo-keto reductase family 2E (AKR2E), including 3-dehydroecdysone reductase AKR2E4 from Bombyx mori. AKR2E4 is a NADP-dependent oxidoreductase with high 3-dehydroecdysone reductase activity. It may play a role in the regulation of molting and has lower activity with phenylglyoxal and isatin. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 3A
Type: Family
Description: This entry represents aldo-keto reductase family 3A (AKR3A), including Gcy1 and Ypr1 from Saccharomyces cerevisiae. Gcy1 is a glycerol dehydrogenase involved in glycerol catabolism under microaerobic conditions. It has mRNA binding activity. Ypr1 acts as a 2-methylbutyraldehyde reductase that displays high specific activity towards 2-methylbutyraldehyde, as well as other aldehydes such as hexanal. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 3D
Type: Family
Description: This entry represents aldo-keto reductase family 3D, including D-galacturonate reductase Gar1 from Hypocrea jecorina. Gar1 mediates the reduction of D-galacturonate to L-galactonate, the first step in the D-galacturonate catabolic process. It also has activity with D-glucuronate and DL-glyceraldehyde. Its activity is seen only with NADPH and not with NADH. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 4A/B
Type: Family
Description: This entry represents aldo-keto reductase family 4A/B. This is a group of plant aldo-keto reductases, including NAD(P)H-dependent 6'-deoxychalcone synthase from Glycine max (Soybean), Deoxymugineic acid synthase 1 from Zea mays, and NADPH-dependent codeinone reductase from Papaver somniferum (Opium poppy). Codeinone reductase catalyses the NADPH-dependent reduction of codeinone to codeine. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 4C
Type: Family
Description: This entry represents aldo-keto reductase family 4C (AKR4C). This is a group of plant aldo-keto reductases, including AKR4C8/9/AKR4C10/AKR4C11 from Arabidopsis thaliana and aldose reductase from Hordeum vulgare (Barley). Plant aldo-keto reductases of the AKR4C subfamily play key roles during stress and are attractive targets for developing stress-tolerant crops. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 5A
Type: Family
Description: This entry represents aldo-keto reductase family 5A (AKR5A), including PGFS from Leishmania major and Trypanosoma brucei [ ]. PGFS, also called 9,11-endoperoxide prostaglandin H2 reductase, catalyses the NADP-dependent formation of prostaglandin F2-alpha from prostaglandin H2. It has also aldo/ketoreductase activity toward the synthetic substrates 9,10-phenanthrenequinone and p-nitrobenzaldehyde [].In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as: sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products [ , ]. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases); and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons; while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening [].Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones []. They catalyse an ordered bi bi kinetic mechanism in which NAD(P)H cofactor binds first and leaves last []. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases [].
Protein Domain
Name: SETD2/Set2, SET domain
Type: Domain
Description: This entry represents the SET domain found in SETD2 from animals, ASHH2 from plants and Set2 from fungi. Proteins containing this domain are a group of histone methyltransferases that methylate histone H3 to form H3K36me [ , ]. Yeast Set2 is involved in transcription elongation as well as in transcription repression [ ]. The methyltransferase activity of budding yeast Set2 requires its recruitment to RNA polymerase II, which is CTK1-dependent [, , , , , , ]. Plant ASHH2 is required for the correct expression of genes essential to reproductive development []. SETD2 acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate [ , ]. SETD2 is also required for DNA double-strand break repair and activation of the p53-mediated checkpoint []. SETD2 inactivation has been linked to tumour development []. SETD2 also methylates alpha-tubulin at lysine 40, the same lysine that is marked by acetylation on microtubules. Methylation of microtubules occurs during mitosis and cytokinesis and can be ablated by SETD2 deletion, which causes mitotic spindle and cytokinesis defects, micronuclei, and polyploidy []. Moreover, SETD2 is involved in interferon-alpha-induced antiviral defense, both by mediating monomethylation of STAT1 at 'Lys-525' and by catalyzing H3K36me3 on promoters of some interferon-stimulated genes (ISGs) to activate gene transcription []. SETD2 has been linked to several human diseases, including renal cell carcinoma (RCC) [ ], Luscan-Lumish syndrome (LLS) [], acute lymphoblastic leukemia (ALL) [] and acute myelogenous leukemia (AML) [, ].
Protein Domain
Name: Xaa-Arg dipeptidase
Type: Family
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This family includes Xaa-Arg dipeptidases (also known as beta-Ala-Lys dipeptidases), metallopeptidases that belong to the M20A peptidase family, subfamily D (MEROPS M20.021). These proteins catalyse peptide bond hydrolysis in dipeptides that have a basic amino acid (lysine, ornithine or arginine) at the C terminus. They may function in a metabolite repair mechanism by eliminating alternate dipeptide by-products formed during carnosine synthesis [ ].
Protein Domain
Name: Integrin domain superfamily
Type: Homologous_superfamily
Description: Integrins are the major metazoan receptors for cell adhesion to extracellular matrix proteins and, in vertebrates, also play important roles in certain cell-cell adhesions, make transmembrane connections to the cytoskeleton and activate many intracellular signalling pathways [ , ]. An integrin receptor is a heterodimer composed of alpha and beta subunits. Each subunit crosses the membrane once, with most of the polypeptide residing in the extracellular space, and has two short cytoplasmic domains. Some members of this family have EGF repeats at the C terminus and also have a vWA domain inserted within the integrin domain at the N terminus. Most integrins recognise relatively short peptide motifs, and in general require an acidic amino acid to be present. Ligand specificity depends upon both the alpha and beta subunits [ ]. There are at least 18 types of alpha and 8 types of beta subunits recognised in humans []. Each alpha subunit tends to associate only with one type of beta subunit, but there are exceptions to this rule []. Each association of alpha and beta subunits has its own binding specificity and signalling properties. Many integrins require activation on the cell surface before they can bind ligands. Integrins frequently intercommunicate, and binding at one integrin receptor can activate or inhibit another. This superfamily represents the C-terminal domain of integrin alpha (which can be further subdivided into the thigh, calf-1 and calf-2 domains) and the central region of integrin beta known as the hybrid domain [ ].
Protein Domain
Name: DG-type SEA domain
Type: Domain
Description: This entry represents the DG-type SEA domain. Dystroglycan (DG) is an integral membrane receptor linking the extracellular matrix (ECM) and cytoskeleton. Through widespread expression in a variety of cell types, including muscle, neural and epithelial cells, DG plays diverse and important roles in cell functions from basement membrane assembly to tissue morphogenesis and structural integrity. DG is encoded by a single gene and posttranslationally cleaved into two noncovalently associated subunits by autoproteolysis within a distinctive protein motif called a sea urchin-enterokinase-agrin (SEA) domain. The resulting heterodimer is composed of a transmembrane subunit that tethers to the cell surface an extracellular subunit bearing extensive O-linked glycosylation. O-linked glycosylation of the extracellular DG subunit (alpha-DG) mediates binding to several ECM ligands, including laminins and perlecan. The cleavage of DG elicits a conspicuous change in its ligand-binding activity. Extensive work has demonstrated the importance of alpha-DG glycosylation for DG functions and how altered alpha-DG glycosylation leads to receptor dysfunction with direct implications for human diseases. However, functions contained within the DG transmembrane subunit (beta-DG), and the roles of this subunit in human disease, are poorly understood [, ]. The DG-type SEA domain forms the peptidase S72 family. The ~120-residue DG-type SEA domain is predicted to display a four-stranded antiparallel beta sheet (beta1-beta4) backed by alpha helices (alpha1-alpha4). The cleavage occurs at a bend between the beta2 and beta3 strands. The cleavage of the DG precursor requires the sequence GSIVV, where cleavage occurs between the glycine and serine [ , ].
Protein Domain
Name: DNA topoisomerase VI, subunit B
Type: Family
Description: DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks [ ]. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis [, ]. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA [ ]. Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils [ ]. This entry represents subunit B of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea, but also in a few eukaryotes, such as the plant Arabidopsis thaliana [ ]. This enzyme assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein that mediates double-strand DNA breaks during meiotic recombination in eukaryotes []. Therefore, though related to type IIA topoisomerases, topoisomerase VI may have a distinctive mechanism of action.
Protein Domain
Name: Transketolase-like, pyrimidine-binding domain
Type: Domain
Description: Transketolase (TK) catalyses the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences from a variety of eukaryotic and prokaryotic sources [, ] show that the enzyme has been evolutionarily conserved. In the peroxisomes of the methylotrophic yeast Pichia angusta (Yeast) (Hansenula polymorpha), there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates. 1-deoxyxylulose-5-phosphate synthase (DXP synthase) [] is an enzyme so far found in bacteria (gene dxs) and plants (gene CLA1) which catalyses the thiamine pyrophosphate-dependent acyloin condensation reaction between carbon atoms 2 and 3 of pyruvate and glyceraldehyde 3-phosphate to yield 1-deoxy-D-xylulose-5-phosphate (DXP), a precursor in the biosynthetic pathway to isoprenoids, thiamine (vitamin B1), and pyridoxol (vitamin B6). DXP synthase is evolutionarily related to TK. The N-terminal section contains a histidine residue which appears to function in proton transfer during catalysis [ ]. This entry represents the central section, in which there are conserved acidic residues that are part of the active cleft and may participate in substrate-binding []. This group of proteins includes transketolase enzymes and 2-oxoisovalerate dehydrogenase beta subunit. Both these enzymes utilise thiamine pyrophosphate as a cofactor, suggesting there may be common aspects in their mechanism of catalysis.
Protein Domain
Name: Phosphoglycerate kinase
Type: Family
Description: Phosphoglycerate kinase (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants. PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase [ ]. At the core of each domain is a 6-stranded parallel β-sheet surrounded by alpha helices. Domain 1 has a parallel β-sheet of six strands with an order of 342156, while domain 2 has a parallel β-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded []. Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man [ ].
Protein Domain
Name: Cys/Met metabolism, pyridoxal phosphate-dependent enzyme
Type: Family
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination [ , , ]. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors []. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy []. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic [ ]. A number of pyridoxal-dependent enzymes involved in the metabolism of cysteine, homocysteine and methionine have been shown [ , ] to be evolutionarily related. These enzymes are proteins of about 400 amino-acid residues. The pyridoxal-P group is attached to a lysine residue located in the central section of these enzymes. One of these enzymes is the sulfhydrylase FUB7 from fungi such as Gibberella and Fusarium. The gene is part of a cluster that mediates the biosynthesis of fusaric acid, a mycotoxin with low to moderate toxicity to animals and humans, but with high phytotoxic properties [ ].
Protein Domain
Name: Peptidase S28
Type: Family
Description: This group of serine peptidases belongs to MEROPS peptidase family S28 (clan SC). The predicted active site residues for members of this family and family S10 occur in the same order in the sequence: S, D, H. These serine proteases include several eukaryotic enzymes such as lysosomal Pro-X carboxypeptidase, dipeptidyl-peptidase II, and thymus-specific serine peptidase [ , , , ]. Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes [ ]. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base [ ]. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ].
Protein Domain
Name: DhaL domain
Type: Domain
Description: Dihydroxyacetone (Dha) kinases are a family of sequence-conserved enzymes that phosphorylate dihydroxyacetone, glyceraldehyde and other short-chain ketoses and aldoses. They can be divided into two groups according to the source of high-energy phosphate that they utilise, either ATP or phosphoenolpyruvate (PEP). The ATP-dependent forms are the two-domain Dha kinases (DAK), which occur in animals, plants and eubacteria. They consist of a Dha binding (K) and an ATP binding (L) domain. The PEP-dependent forms occur only in eubacteria and a few archaebacteria and consist of three subunits. Two subunits, DhaK and DhaL, are homologous to the K and L domains. Intriguingly, the ADP moiety is not exchanged for ATP but remains permanently bound to the DhaL subunit where it is rephosphorylated in situ by the third subunit, DhaM, which is homologous to the IIA domain of the mannose transporter of the bacterial PEP:sugar phosphotransferase system (PTS) [ , ].The DhaL domain consists of eight antiparallel α-helices arranged in an up-and-down geometry and aligned on a circle. This results in the formation of a helix barrel enclosing a deep pocket. The helices are amphipathic with the hydrophobic side chains directed into the pocket of the barrel and with the polar residues exposed. The nucleotide is bound on the top of the barrel [, ].The DhaL alpha helix barrel fold appears not only as a C-terminal domain in Dha kinases but also as an N-terminal domain in a family of two-domain proteins with unknown function. One representative example is YfhG of Lactococcus lactis [].
Protein Domain
Name: Nucleotidyltransferase, class I-like, C-terminal
Type: Homologous_superfamily
Description: Nucleotidyltransferases can be divided into two classes based on highly conserved features of the nucleotidyltransferase motif [ ]. Class I enzymes include eukaryotic poly(A) polymerase (PAP), archaeal tRNA CCA-adding enzyme and possibly DNA polymerase beta, while class II enzymes include eukaryotic and eubacterial tRNA CCA-adding enzymes. This superfamily represents the C-terminal domain of class I nucleotidyltransferases. The C-terminal domain has an alpha/beta sandwich fold, although the archaeal tRNA CCA-adding enzyme has a large insertion; this fold is reminiscent of the RNA-recognition motif fold. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. The catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase [ ]. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase. The archaeal CCA-adding enzyme builds and repairs the 3' end of tRNA. A single active site (nucleotidyltransferase motif) adds both CTP and ATP [ ]. This enzyme is the only RNA polymerase that can build or rebuild a specific nucleic acid sequence without using a nucleic acid template.
Protein Domain
Name: Carotenoid oxygenase
Type: Family
Description: Carotenoids such as beta-carotene, lycopene, lutein and beta-cryptoxanthin are produced in plants and certain bacteria, algae and fungi, where they function as accessory photosynthetic pigments and as scavengers of oxygen radicals for photoprotection. They are also essential dietary nutrients in animals. Carotenoid oxygenases cleave a variety of carotenoids into a range of biologically important products, including apocarotenoids in plants that function as hormones, pigments, flavours, floral scents and defence compounds, and retinoids in animals that function as vitamins, visual pigments and signalling molecules [ ]. Examples of carotenoid oxygenases include: Beta,beta-carotene 15,15'-dioxygenase (BCDO1) from animals, which cleaves beta-carotene symmetrically at the central double bond to yield two molecules of retinal [ ]. Carotenoid-cleaving dioxygenase, mitochondrial (BCDO2) from animals, which cleaves beta-carotene asymmetrically to apo-10'-beta-carotenal and beta-ionone, the latter being converted to retinoic acid. Lycopene is also oxidatively cleaved [ , , ]. Carotenoid 9,10(9',10')-cleavage dioxygenase (CCD) from plants, which cleaves a variety of carotenoids symmetrically at both the 9-10 and 9'-10' double bonds and catalyzes the formation of 4,9-dimethyldodeca-2,4,6,8,10-pentaene-1,12-dialdehyde from zeaxanthin [ ]. 9-cis-epoxycarotenoid dioxygenase (NCED1/2) from plants, which cleaves 9-cis xanthophylls to xanthoxin, a precursor of the hormone abscisic acid [ ]. Apocarotenoid-15,15'-oxygenase (ACOX) from bacteria and cyanobacteria, which converts beta-apocarotenals rather than beta-carotene into retinal. This protein has a seven-bladed β-propeller structure with four histidines that hold the iron active centre [ ]. Retinoid isomerohydrolase (RPE65) from animals, which in its soluble form binds all-trans retinol, and in its membrane-bound form binds all-trans retinyl esters. RPE65 is important for the production of 11-cis retinal during visual pigment regeneration [ , , ].
Protein Domain
Name: RNA 3'-terminal phosphate cyclase domain
Type: Domain
Description: RNA cyclases are a family of RNA-modifying enzymes that are conserved in eukaryotes, bacteria and archaea. Type 1 RNA 3'-terminal phosphate cyclases [ , ] catalyse the conversion of 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of RNA: ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate. The physiological function of the cyclase is not known, but the enzyme could be involved in the maintenance of cyclic ends in tRNA splicing intermediates or in the cyclisation of the 3' end of U6 snRNA [ ]. A second subfamily of RNA 3'-terminal phosphate cyclases (type 2), which do not have cyclase activity, has been identified in eukaryotes. They are localised to the nucleolus and are involved in ribosomal modification [ ]. The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin [ ]. Although the active site of this enzyme has not been unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources [].
Protein Domain
Name: tRNA endonuclease-like domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain found in three types of endonucleases: TnsA endonuclease (N-terminal) [ ], Hjc-type resolvase [], and tRNA-intron endonuclease (C-terminal) [ ]. These domains have a 3-layer α/β/α topology, which is similar in structure to a motif found in several restriction endonucleases. TnsA endonuclease is a catalytic component of the Tn7 transposition system. Tn7 transposase is composed of four proteins: TnsA, TnsB, TnsC and TnsD. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. TnsC is the molecular switch that regulates transposition. The N-terminal domain of TnsA is catalytic. Hjc is a type of Holliday junction resolvase. The Holliday junction is an essential intermediate of homologous recombination, comprising four-stranded DNA complexes that are formed during recombination and related DNA repair events. During homologous recombination, genetic information is physically exchanged between parental DNAs via crossing single strands of the same polarity within the four-way Holliday structure. Hjc is an archaeal endonuclease, which specifically resolves the junction DNA to produce two separate recombinant DNA duplexes. This process is terminated by the endonucleolytic activity of resolvases, which convert the four-way DNA back to two double strands. tRNA-intron endonucleases cleave pre-tRNA, producing 5'-hydroxyl and 2',3'-cyclic phosphate termini and specifically removing the intron. The splicing of transfer RNA precursors is similar in Eukarya and Archaea. In both kingdoms an endonuclease recognises the splice sites and releases the intron, but the mechanism of splice site recognition is different in each kingdom.
Protein Domain
Name: RNA polymerase, subunit omega/Rpo6/RPB6
Type: Family
Description: In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. A component of 14 to 18 kDa shared by all three forms of eukaryotic RNA polymerases and which has been sequenced in budding yeast (gene RPB6 or RPO26), in Schizosaccharomyces pombe (Fission yeast) (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV) is evolutionarily related to the archaebacterial subunit Rpo6 (also known as subunit K). The archaebacterial protein is colinear with the C-terminal part of the eukaryotic subunit. The structures of the omega subunit and RPB6, and the structures of the omega/beta' and RPB6/RPB1 interfaces, suggest a molecular mechanism for the function of omega and RPB6 in promoting RNAP assembly and/or stability. The conserved regions of omega and RPB6 form a compact structural domain that interacts simultaneously with conserved regions of the largest RNAP subunit and with the C-terminal tail following a conserved region of the largest RNAP subunit. The second half of the conserved region of omega and RPB6 forms an arc that projects away from the remainder of the structural domain and wraps over and around the C-terminal tail of the largest RNAP subunit, clamping it in a crevice, and threading the C-terminal tail of the largest RNAP subunit through the narrow gap between omega and RPB6 [ ].
Protein Domain
Name: Alpha-amylase, plant
Type: Family
Description: O-Glycosyl hydrolases ( ) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families [ , ]. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website.Alpha-amylase is classified as family 13 of the glycosyl hydrolases and is present in archaea, bacteria, plants and animals. Alpha-amylase is an essential enzyme in alpha-glucan metabolism, acting to catalyse the hydrolysis of alpha-1,4-glucosidic bonds of glycogen, starch and related polysaccharides. Although all alpha-amylases possess the same catalytic function, they can vary with respect to sequence. In general, they are composed of three domains: a TIM barrel containing the active site residues and chloride ion-binding site (domain A), a long loop region inserted between the third beta strand and the α-helix of domain A that contains calcium-binding site(s) (domain B), and a C-terminal β-sheet domain that appears to show some variability in sequence and length between amylases (domain C) [ ]. Amylases have at least one conserved calcium-binding site, as calcium is essential for the stability of the enzyme. The chloride-binding functions to activate the enzyme, which acts by a two-step mechanism involving a catalytic nucleophile base (usually an Asp) and a catalytic proton donor (usually a Glu) that are responsible for the formation of the beta-linked glycosyl-enzyme intermediate. This entry represents a subfamily of alpha-amylase proteins that are found in plants.
Protein Domain
Name: Valyl-tRNA synthetase, tRNA-binding arm
Type: Domain
Description: This entry represents the C-terminal domain of valyl-tRNA synthetase, which consists of two helices in a long alpha-hairpin [ ]. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction [ , ]. These proteins differ widely in size and oligomeric state, and have limited sequence homology []. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric []. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices [], and are mostly dimeric or multimeric, containing at least three conserved regions [, , ]. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c [].
Protein Domain
Name: Glycine-tRNA synthetase, heterodimeric
Type: Family
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. Glycyl-tRNA synthetase exhibits different oligomeric structures in different organisms (alpha2 beta2 and alpha2). This entry represents the alpha and beta subunits of heterodimeric glycine-tRNA synthetases.
Protein Domain
Name: Dihydroorotate dehydrogenase, class 2
Type: Family
Description: This entry represents the enzyme dihydroorotate dehydrogenase (quinone), exclusively for class 2. It includes members from bacteria, yeast and plants, among others. The subfamilies 1 and 2 share extensive homology, particularly toward the C terminus. This subfamily has a longer N-terminal region. Dihydroorotate dehydrogenase (DHOD), also known as dihydroorotate oxidase, catalyses the fourth step in de novo pyrimidine biosynthesis, the stereospecific oxidation of (S)-dihydroorotate to orotate, which is the only redox reaction in this pathway. DHODs can be divided into two main classes: class 1 cytosolic enzymes found primarily in Gram-positive bacteria, and class 2 membrane-associated enzymes found primarily in eukaryotic mitochondria and Gram-negative bacteria. The class 1 DHODs can be further divided into subclasses 1A and 1B, which differ in their structural organisation and use of electron acceptors. The 1A enzyme is a homodimer of two PyrD subunits where each subunit forms a TIM barrel fold with a bound FMN cofactor located near the top of the barrel. Fumarate is the natural electron acceptor for this enzyme. The 1B enzyme, in contrast, is a heterotetramer composed of a central, FMN-containing PyrD homodimer resembling the 1A homodimer, and two additional PyrK subunits which contain FAD and a 2Fe-2S cluster. These additional groups allow the enzyme to use NAD(+) as its natural electron acceptor. The class 2 membrane-associated enzymes are monomers which have the FMN-containing TIM barrel domain found in the class 1 PyrD subunit, and an additional N-terminal alpha helical domain. These enzymes use respiratory quinones as the physiological electron acceptor.
Protein Domain
Name: Arginine-tRNA ligase
Type: Family
Description: Arginine-tRNA ligase has been crystallized and preliminary X-ray crystallographic analysis of yeast arginine-tRNA ligase-yeast tRNAArg complexes is available. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: DNA polymerase family X, beta-like
Type: Family
Description: DNA carries the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the cell life cycle. This function is mediated by DNA-directed DNA-polymerases, which add nucleotide triphosphate (dNTP) residues to the 3'-end of the growing DNA chain, using a complementary DNA as template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used. Three motifs, A, B and C, are seen to be conserved across all DNA-polymerases, with motifs A and C also seen in RNA-polymerases. They are centred on invariant residues, and their structural significance was implied from the Klenow (Escherichia coli) structure: motif A contains a strictly-conserved aspartate at the junction of a β-strand and an α-helix; motif B contains an α-helix with positive charges; and motif C has a doublet of negative charges, located in a β-turn-beta secondary structure. DNA polymerases can be classified, on the basis of sequence similarity, into at least four different groups: A, B, C and X. Members of family X are small (about 40kDa) compared with other polymerases and encompass two distinct polymerase enzymes that have similar functionality: vertebrate polymerase beta (same as yeast pol 4), and terminal deoxynucleotidyl-transferase (TdT). The former functions in DNA repair, while the latter terminally adds single nucleotides to polydeoxynucleotide chains. Both enzymes catalyse addition of nucleotides in a distributive manner, i.e. they dissociate from the template-primer after addition of each nucleotide. DNA-polymerases show a degree of structural similarity with RNA-polymerases.
Protein Domain
Name: Leucyl-tRNA synthetase, class Ia, archaeal/eukaryotic cytosolic
Type: Family
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. Leucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. There are two different families of leucyl-tRNA synthetases. This family includes both archaeal and cytosolic eukaryotic leucyl-tRNA synthetases.
Protein Domain
Name: Acetate-CoA ligase
Type: Family
Description: Acetyl-CoA synthetase (also known as acetate-CoA ligase and acetyl-activating enzyme) is a ubiquitous enzyme, found in both prokaryotes and eukaryotes, which catalyses the formation of acetyl-CoA from acetate, coenzyme A (CoA) and ATP in the reaction: ATP + acetate + CoA = AMP + diphosphate + acetyl-CoA. The activity of this enzyme is crucial for maintaining the required levels of acetyl-CoA, a key intermediate in many important biosynthetic and catabolic processes. It is especially important in eukaryotic species as it is the only route for the activation of acetate to acetyl-CoA in these organisms (some prokaryotic species can also activate acetate by either acetate kinase/phosphotransacetylase or by ADP-forming acetyl-CoA synthase). Eukaryotes typically have two isoforms of acetyl-CoA synthase, a cytosolic form involved in biosynthetic processes and a mitochondrial form primarily involved in energy generation. The crystal structures of a eukaryotic (from yeast) and a bacterial (from Salmonella) form of this enzyme have been determined. The yeast enzyme is trimeric, while the bacterial enzyme is monomeric. The trimeric state of the yeast protein may be unique to this organism, however, as the residues involved in the trimer interface are poorly conserved in other sequences. Despite differences in the oligomeric state of the two enzymes, the structures of the monomers are almost identical. A large N-terminal domain (~500 residues) containing two parallel beta sheets is followed by a small (~110 residues) C-terminal domain containing a three-stranded beta sheet with helices. The active site occurs at the domain interface, with its contents determining the orientation of the C-terminal domain.
Protein Domain
Name: Chorismate synthase, conserved site
Type: Conserved_site
Description: Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway, which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate, which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues. Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two groups: enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment, are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fungi and the ciliated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapies against infectious diseases such as tuberculosis (TB). This entry represents conserved regions from chorismate synthase that are rich in basic residues.
Protein Domain
Name: Plant peroxidase
Type: Family
Description: Peroxidases are haem-containing enzymes that use hydrogen peroxide as the electron acceptor to catalyse a number of oxidative reactions. Most haem peroxidases follow the reaction scheme: Fe3+ + H2O2 --> [Fe4+=O]R' (Compound I) + H2O; [Fe4+=O]R' + substrate --> [Fe4+=O]R (Compound II) + oxidised substrate; [Fe4+=O]R + substrate --> Fe3+ + H2O + oxidised substrate. In this mechanism, the enzyme reacts with one equivalent of H2O2 to give [Fe4+=O]R' (compound I). This is a two-electron oxidation/reduction reaction where H2O2 is reduced to water and the enzyme is oxidised. One oxidising equivalent resides on iron, giving the oxyferryl intermediate, while in many peroxidases the porphyrin (R) is oxidised to the porphyrin pi-cation radical (R'). Compound I then oxidises an organic substrate to give a substrate radical. Peroxidases are found in bacteria, fungi, plants and animals and can be viewed as members of a superfamily consisting of 3 major classes. Class III comprises the secretory plant peroxidases, which have multiple tissue-specific functions, e.g. removal of hydrogen peroxide from chloroplasts and cytosol; oxidation of toxic compounds; biosynthesis of the cell wall; defence responses towards wounding; indole-3-acetic acid (IAA) catabolism; ethylene biosynthesis; and so on. The wide spectrum of peroxidase activity, coupled with the participation in various physiological processes, is in keeping with its relative lack of specificity for substrates and the occurrence of a variety of isozymes. Plant peroxidases are monomeric glycoproteins containing 4 conserved disulphide bridges and 2 calcium ions. The 3D structure of peanut peroxidase has been shown to possess the same helical fold as class I and II peroxidases.
Protein Domain
Name: Valine-tRNA ligase
Type: Family
Description: Valine-tRNA ligase (also known as valyl-tRNA synthetase) is an alpha monomer that belongs to the class Ia aminoacyl-tRNA ligases. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: Alanine-tRNA ligase, eukaryota/bacteria
Type: Family
Description: Alanine-tRNA ligase (also known as alanyl-tRNA synthetase) is an alpha4 tetramer that belongs to class IIc. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: GAT domain
Type: Domain
Description: The GAT domain is a region of homology of ~130 residues, which is found in eukaryotic GGAs (for Golgi-localized, gamma ear-containing ADP ribosylation factor (ARF)-binding proteins) and vertebrate TOMs (for target of myb). The GAT domain is found in its entirety only in GGAs, although at the C terminus it shares partial sequence similarity with a short region of TOMs. The GAT domain is found in association with other domains, such as VHS and GAE. The GAT domain of GGAs serves as a molecular anchor of GGA to trans-Golgi network (TGN) membranes via its interaction with the GTP-bound form of a member of the ARF family of small GTPases and can bind specifically to the Rab GTPase effector rabaptin5 and to ubiquitin. The GGA-GAT domain possesses an all α-helical structure, composed of four helices arranged in a somewhat unusual topology, which has been called the helical paper clip. The overall structure shows that the GAT domain has an elongated shape, in which the longest helix participates in two small independent subdomains: an N-terminal helix-loop-helix hook and a C-terminal three-helix bundle. The hook subdomain has been shown to be both necessary and sufficient for ARF-GTP binding and Golgi targeting of GGAs. The N-terminal hook subdomain contains a hydrophobic patch, which is found to interact directly with ARF. It has been proposed that this interaction might stabilise the hook subdomain. The C-terminal three-helix bundle is involved in the binding with rabaptin5 and ubiquitin.
Protein Domain
Name: NADP transhydrogenase, beta subunit
Type: Family
Description: NAD(P) transhydrogenase catalyses the transfer of reducing equivalents between NAD(H) and NADP(H), coupled to the translocation of protons across a membrane. It is an integral membrane protein found in the inner membrane of animal mitochondria and in the bacterial cytoplasmic membrane. Under most physiological conditions this enzyme synthesises NADPH, driven by consumption of the proton electrochemical gradient. The resulting NADPH is subsequently used for biosynthetic reactions or the reduction of glutathione. The global structure of this enzyme is similar in all organisms, consisting of three distinct domains, though the polypeptide composition can vary. Domain I binds NAD(+)/NADH, domain II is a hydrophobic membrane-spanning domain, and domain III binds NADP(+)/NADPH. Domain I is composed of two subdomains, both of which form a Rossmann fold, while domain III consists of a single Rossmann fold where the NADP(+) is flipped relative to the normal orientation of bound nucleotides within the Rossmann fold. Several residues within these domains are thought to make functionally important interdomain contacts for hydride transfer between these domains. Proton translocation occurs through domain II and is thought to induce conformational changes which are transmitted across domain III to the site of hydride transfer between domains I and III. NAD(P) transhydrogenase from Escherichia coli contains an alpha subunit with the NAD(H)-binding domain I and a beta subunit with the NADP(H)-binding domain III. The membrane domain (domain II) harbors the proton channel and is made up of the hydrophobic parts of the alpha and beta subunits. This entry represents the beta subunit found in bacterial two-subunit NADP(H) transhydrogenases.
Protein Domain
Name: DNA-binding HTH domain, TetR-type, conserved site
Type: Conserved_site
Description: The TetR-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of about 60 residues present in the TetR family of prokaryotic transcription regulators. Several of these bacterial regulators are repressors of genes and operons for membrane transport and cell envelope permeability. The family is named after the TetRacycline repressor TetR of enterobacteria found on Tn10 and other transposons and plasmids. The 'helix-turn-helix' DNA-binding motif is located in the N-terminal extremity of these transcriptional regulators. The C-terminal part of TetR-type regulators contains several regions that can be involved in (1) binding of inducers, which can be drugs, and (2) oligomerisation. The TetR and camR proteins are dimers, whilst qacR binds its operator as a pair of dimers and ethR seems to bind as an octamer. TetR-type transcription regulators include several bacterial regulators of drug export systems that protect pathogenic bacteria against antibiotics, antiseptics, disinfectants and host-encoded antimicrobials. Several crystal structures of TetR-type transcription regulators have been resolved and their DNA-binding domains are formed by a three-helix bundle (H1-H3) and the N-terminal part of the following helix 4, which contributes to the hydrophobic centre of the DNA-binding domain and links it to the regulatory domain. The helix-turn-helix motif comprises the second and third helices, the third being called the recognition helix as it binds into the DNA major groove. The recognition helix of the TetR-type HTH is shorter than in most HTHs. The signature pattern of this entry covers a conserved region that starts six residues before the HTH motif and ends seven residues after the HTH motif.
Protein Domain
Name: Transcriptional regulator MarR-type, conserved site
Type: Conserved_site
Description: The MarR-type HTH domain is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 135 amino acids present in transcription regulators of the MarR/SlyA family, involved in the development of antibiotic resistance. The MarR family of transcription regulators is named after Escherichia coli MarR, a repressor of genes which activate the multiple antibiotic resistance and oxidative stress regulons, and after slyA from Salmonella typhimurium and E. coli, a transcription regulator that is required for virulence and survival in the macrophage environment. Regulators with the MarR-type HTH domain are present in bacteria and archaea and control a variety of biological functions, including resistance to multiple antibiotics, household disinfectants, organic solvents, oxidative stress agents and regulation of the virulence factor synthesis in pathogens of humans and plants. Many of the MarR-like regulators respond to aromatic compounds. The crystal structures of MarR, MexR and SlyA have been determined and show a winged HTH DNA-binding core flanked by helices involved in dimerisation. The DNA-binding domains are ascribed to the superfamily of winged helix proteins, containing a three (four)-helix (H) bundle and a three-stranded antiparallel β-sheet (B) in the topology: H1-(H1')-H2-B1-H3-H4-B2-B3-H5-H6. Helices 3 and 4 comprise the helix-turn-helix motif and the β-sheet is called the wing. Helix 4 is termed the recognition helix, as in other HTHs where it binds the DNA major groove. Helices 1, 5 and 6 are involved in dimerisation, as most MarR-like transcription regulators form dimers. This entry represents a 34-residue conserved site showing high conservation within this family.
Protein Domain
Name: Phosphotransferase system, sugar-specific permease EIIA type 1
Type: Domain
Description: The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) is a major carbohydrate transport system in bacteria. The PTS catalyses the phosphorylation of incoming sugar substrates coupled with their translocation across the cell membrane, making the PTS a link between the uptake and metabolism of sugars. The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred, via a signal transduction pathway, to enzyme I (EI), which in turn transfers it to a phosphoryl carrier, the histidine protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease, a membrane-bound complex known as enzyme 2 (EII), which transports the sugar into the cell. EII consists of at least three structurally distinct domains IIA, IIB and IIC. These can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII). The first domain (IIA or EIIA) carries the first permease-specific phosphorylation site, a histidine which is phosphorylated by phospho-HPr. The second domain (IIB or EIIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the sugar transported. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate concomitantly with the sugar uptake processed by the IIC domain. This third domain (IIC or EIIC) forms the translocation channel and the specific substrate-binding site. An additional transmembrane domain IID, homologous to IIC, can be found in some PTSs, e.g. for mannose.
Protein Domain
Name: Valine-tRNA ligase, type 2
Type: Family
Description: This entry represents the type 2 subfamily of the valine-tRNA ligases. Valine-tRNA ligase (also known as valyl-tRNA synthetase) is an alpha monomer that belongs to the class Ia aminoacyl-tRNA ligases. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: BsuBI/PstI restriction endonuclease, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four β-strands and one α-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin. However, there is still considerable diversity amongst restriction endonucleases. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone. This superfamily represents the C terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI. The enzymes of the BsuBI restriction/modification (R/M) system recognise the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system.
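The palindromic nature of the recognition site described above can be checked directly: a site is palindromic in this sense when it equals its own reverse complement. The following Python snippet is a minimal sketch under stated assumptions; the function names and the example DNA string are hypothetical and are not taken from this database, and only the unambiguous bases A, C, G and T are handled.

```python
# Illustrative sketch: helper names and the example sequence are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic_site(site):
    """True if the site reads the same on both strands (5' to 3')."""
    return site.upper() == reverse_complement(site)


def find_sites(dna, site="CTGCAG"):
    """Return 0-based start positions of every occurrence of the site."""
    dna = dna.upper()
    site = site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        # Step one base forward so overlapping occurrences are not missed.
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(is_palindromic_site("CTGCAG"))
    print(find_sites("AACTGCAGTTCTGCAGG"))
```

Because the site is its own reverse complement, scanning one strand is enough to locate every occurrence on both strands.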
Protein Domain
Name: BsuBI/PstI restriction endonuclease, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four β-strands and one α-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin. However, there is still considerable diversity amongst restriction endonucleases. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone. This superfamily represents the N terminus of bacterial enzymes similar to type II restriction endonucleases BsuBI and PstI. The structure contains a winged-helix-turn-helix DNA binding structure. The enzymes of the BsuBI restriction/modification (R/M) system recognise the target sequence 5'CTGCAG and are functionally identical with those of the PstI R/M system.
Protein Domain
Name: Carbonic anhydrase, prokaryotic-like
Type: Family
Description: Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyse the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidines []. This subfamily includes bacterial carbonic anhydrase alpha [ ], as well as plant enzymes such as tobacco nectarin III and yam dioscorin, and carbonic anhydrases from molluscs, such as nacrein, which are part of the organic matrix layer in shells. Other members of this family may be involved in maintaining pH balance, in facilitating transport of carbon dioxide or carbonic acid, or in sensing carbon dioxide levels in the environment. Dioscorin is the major storage protein of yam tubers and may play a role as an antioxidant [, , ]. Tobacco nectarin may play a role in the maintenance of pH and oxidative balance in nectar []. Mollusc nacrein may participate in calcium carbonate crystal formation of the nacreous layer [, ]. This subfamily also includes three alpha carbonic anhydrases from Chlamydomonas reinhardtii (CAH 1-3). CAH1/2 are localized in the periplasmic space. CAH1 facilitates the movement of carbon dioxide across the plasma membrane when the medium is alkaline. CAH3 is localized to the thylakoid lumen and provides CO2 to Rubisco [].
Protein Domain
Name: PuuE allantoinase/chitin deacetylase 1
Type: Domain
Description: Allantoinase (EC 3.5.2.5) can hydrolyze allantoin ((2,5-dioxoimidazolidin-4-yl)urea), one of the most important nitrogen carriers for some plants, soil animals, and microorganisms, to allantoate. The DAL1 gene from Saccharomyces cerevisiae encodes an allantoinase. However, some organisms possess allantoinase activity but lack DAL1 allantoinase. In those organisms, a defective allantoinase gene, named puuE (purine utilization E), encodes an allantoinase that specifically catalyzes the hydrolysis of (S)-allantoin into allantoic acid. PuuE allantoinase is related to polysaccharide deacetylase (DCA), one member of the carbohydrate esterase 4 (CE4) superfamily, that removes N-linked or O-linked acetyl groups of cell wall polysaccharides, and lacks sequence similarity with the known DAL1 allantoinase that belongs to the amidohydrolase superfamily. PuuE allantoinase functions as a homotetramer []. Its monomer is composed of a 7-stranded barrel with detectable sequence similarity to the 6-stranded barrel NodB homology domain of DCAs. It appears to be metal-independent and acts on a small substrate molecule, which is distinct from the common features of DCAs that are normally metal ion dependent and recognize multimeric substrates. This domain is also found in chitin deacetylase 1 encoded by the Schizosaccharomyces pombe cda1 gene (SpCDA1). Although the general function of chitin deacetylase (CDA) is the synthesis of chitosan from chitin, a polymer of N-acetyl glucosamine, to build up the proper ascospore wall, the actual function of SpCDA1 might involve allantoin hydrolysis. It is likely orthologous to PuuE allantoinase, whereas it is more distantly related to the CDAs found in other fungi, such as Saccharomyces cerevisiae and Mucor rouxii. Those CDAs are similar to the rhizobial NodB protein and are not included in this family [ ].
Protein Domain
Name: GAT domain superfamily
Type: Homologous_superfamily
Description: The GAT domain is a region of homology of ~130 residues, which is found in eukaryotic GGAs (for Golgi-localized, gamma ear-containing ADP ribosylation factor (ARF)-binding proteins) and vertebrate TOMs (for target of myb). The GAT domain is found in its entirety only in GGAs, although at the C terminus it shares partial sequence similarity with a short region of TOMs. The GAT domain is found in association with other domains, such as VHS and GAE. The GAT domain of GGAs serves as a molecular anchor of GGA to trans-Golgi network (TGN) membranes via its interaction with the GTP-bound form of a member of the ARF family of small GTPases and can bind specifically to the Rab GTPase effector rabaptin5 and to ubiquitin [ , , , ]. The GGA-GAT domain possesses an all α-helical structure, composed of four helices arranged in a somewhat unusual topology, which has been called the helical paper clip. The overall structure shows that the GAT domain has an elongated shape, in which the longest helix participates in two small independent subdomains: an N-terminal helix-loop-helix hook and a C-terminal three-helix bundle. The hook subdomain has been shown to be both necessary and sufficient for ARF-GTP binding and Golgi targeting of GGAs. The N-terminal hook subdomain contains a hydrophobic patch, which is found to interact directly with ARF [ ]. It has been proposed that this interaction might stabilise the hook subdomain []. The C-terminal three-helix bundle is involved in the binding with Rabaptin5 and ubiquitin [].
Protein Domain
Name: Nitric-oxide synthase, eukaryote
Type: Family
Description: Nitric oxide synthase ( ) (NOS) enzymes produce nitric oxide (NO) by catalyzing a five-electron oxidation of a guanidino nitrogen of L-arginine (L-Arg). Oxidation of L-Arg to L-citrulline occurs via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine as an intermediate. 2 mol of O(2) and 1.5 mol of NADPH are consumed per mole of NO formed []. Arginine-derived NO synthesis has been identified in mammals, fish, birds, invertebrates, plants, and bacteria [ ]. Best studied are mammals, where three distinct genes encode NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3) []. iNOS and nNOS are soluble and found predominantly in the cytosol, while eNOS is membrane associated. The enzymes exist as homodimers, each monomer consisting of two major domains: an N-terminal oxygenase domain, which belongs to the class of haem-thiolate proteins, and a C-terminal reductase domain, which is homologous to NADPH:P450 reductase (). The interdomain linker between the oxygenase and reductase domains contains a calmodulin (CaM)-binding sequence. NOSs are the only enzymes known to simultaneously require five bound cofactors; animal NOS isozymes are catalytically self-sufficient. The electron flow in the NO synthase reaction is: NADPH --> FAD --> FMN --> haem --> O(2). eNOS localisation to endothelial membranes is mediated by cotranslational N-terminal myristoylation and post-translational palmitoylation [ ]. The subcellular localisation of nNOS in skeletal muscle is mediated by anchoring of nNOS to dystrophin. nNOS contains an additional N-terminal domain, the PDZ domain []. This entry represents all forms of NOS from eukaryotes. For further information see [ , , , ].
Protein Domain
Name: DhaL domain superfamily
Type: Homologous_superfamily
Description: Dihydroxyacetone (Dha) kinases are a family of sequence-conserved enzymes that phosphorylate dihydroxyacetone, glyceraldehyde and other short-chain ketoses and aldoses. They can be divided into two groups according to the source of high-energy phosphate that they utilise, either ATP or phosphoenolpyruvate (PEP). The ATP-dependent forms are the two-domain Dha kinases (DAK), which occur in animals, plants and eubacteria. They consist of a Dha binding (K) and an ATP binding (L) domain. The PEP-dependent forms occur only in eubacteria and a few archaebacteria and consist of three subunits. Two subunits, DhaK and DhaL, are homologous to the K and L domains. Intriguingly, the ADP moiety is not exchanged for ATP but remains permanently bound to the DhaL subunit where it is rephosphorylated in situ by the third subunit, DhaM, which is homologous to the IIA domain of the mannose transporter of the bacterial PEP:sugar phosphotransferase system (PTS) [ , ]. The DhaL domain consists of eight antiparallel α-helices arranged in an up-and-down geometry and aligned on a circle. This results in the formation of a helix barrel enclosing a deep pocket. The helices are amphipathic with the hydrophobic side chains directed into the pocket of the barrel and with the polar residues exposed. The nucleotide is bound on the top of the barrel [, ]. The DhaL alpha helix barrel fold appears not only as a C-terminal domain in Dha kinases but also as an N-terminal domain in a family of two-domain proteins with unknown function. One representative example is YfhG of Lactococcus lactis [].
Protein Domain
Name: RPB6/omega subunit-like superfamily
Type: Homologous_superfamily
Description: In eukaryotes, there are three different forms of DNA-dependent RNA polymerases () transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. A component of 14 to 18 kDa shared by all three forms of eukaryotic RNA polymerases and which has been sequenced in budding yeast (gene RPB6 or RPO26), in Schizosaccharomyces pombe (Fission yeast) (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV) is evolutionarily related to the archaebacterial subunit Rpo6 (also known as subunit K). The archaebacterial protein is colinear with the C-terminal part of the eukaryotic subunit. The structures of the omega subunit and RPB6, and the structures of the omega/beta' and RPB6/RPB1 interfaces, suggest a molecular mechanism for the function of omega and RPB6 in promoting RNAP assembly and/or stability. The conserved regions of omega and RPB6 form a compact structural domain that interacts simultaneously with conserved regions of the largest RNAP subunit and with the C-terminal tail following a conserved region of the largest RNAP subunit. The second half of the conserved region of omega and RPB6 forms an arc that projects away from the remainder of the structural domain and wraps over and around the C-terminal tail of the largest RNAP subunit, clamping it in a crevice, and threading the C-terminal tail of the largest RNAP subunit through the narrow gap between omega and RPB6 [ ].
Protein Domain
Name: Phosphoglycerate kinase superfamily
Type: Homologous_superfamily
Description: Phosphoglycerate kinase ( ) (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants. PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase [ ]. At the core of each domain is a 6-stranded parallel β-sheet surrounded by alpha helices. Domain 1 has a parallel β-sheet of six strands with an order of 342156, while domain 2 has a parallel β-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded []. Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man [ ].
Protein Domain
Name: Vitamin D receptor
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed "orphan" receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The vitamin D receptor (VDR) mediates the signal of 1-alpha,25-dihydroxyvitamin D3 by binding to vitamin D responsive elements; it functions either as a homodimer, or as a heterodimer of vitamin D and retinoid X receptor subunits. Deficiency of VDR causes type IIA rickets [].
Protein Domain
Name: Assassin bug toxin-like
Type: Family
Description: Assassin bugs (Arthropoda:Insecta:Hemiptera:Reduviidae), sometimes known as conenose or kissing bugs, are one of the largest and most morphologically diverse families of true bugs, feeding on crickets, caterpillars and other insects. Some assassin bug species are bloodsucking parasites of mammals, including humans. They can be commonly found throughout most of the world and their size varies from a few millimetres to as much as 3 or 4 centimetres. The toxic saliva of the predatory assassin bugs contains a complex mixture of small and large peptides for diverse uses such as immobilizing and pre-digesting their prey, and defence against competitors and predators. Assassin bug toxins are small peptides with disulfide connectivity that target ion-channels. They are relatively homologous to the calcium channel blockers omega-conotoxins from marine cone snails and belong to the four-loop cysteine scaffold structural class [ ], []. One of these small proteins, Ptu1, blocks reversibly the N-type calcium channels, but at the same time is less specific for the L- or P/Q-type calcium channels [ ]. Ptu1 is 34 amino acid residues long and is cross-linked by three disulphide bridges. Ptu1 contains a β-sheet region made of two antiparallel β-strands and consists of a compact disulphide-bonded core from which four loops emerge as well as N- and C-termini []. Some assassin bug toxins are listed below: Agriosphodrus dohrni (Assassin bug) toxin Ado1; Isyndus obscurus (Assassin bug) toxin Iob1; Peirates turpis (Assassin bug) toxin Ptu1. This entry also includes U-limacoditoxin(3)-Dv21 from mottled cup moth [ ], a probable toxin with moderate antiparasitic activity against the nematode Haemonchus contortus.
Protein Domain
Name: Ethanolamine/propanediol utilisation protein, EutP/PduV
Type: Family
Description: Members of this family function in ethanolamine [ , ] and propanediol [] degradation pathways. Both pathways require coenzyme B12 (adenosylcobalamin, AdoCbl). Bacteria that harbour these pathways can use ethanolamine as a source of carbon and nitrogen, or propanediol as a sole carbon and energy source, respectively. The exact roles of the EutP and PduV proteins in these respective pathways are not yet determined; however, EutP is a putative bidirectional acetate kinase that may drive flux through the ethanolamine degradation pathway under anoxic conditions found when Salmonella typhimurium infects host intestine. It may generate ATP that can be used by other enzymes (EutA and EutT) in the eut pathway. It can use GTP instead of ATP with reduced efficiency [ ]. Members of this family contain P-loop consensus motifs in the N-terminal part, and are distantly related to various GTPases and ATPases, including ATPase components of transport systems. Propanediol degradation is thought to be important for the natural Salmonella populations, since propanediol is produced by the fermentation of the common plant sugars rhamnose and fucose [ , ]. More than 1% of the Salmonella enterica genome is devoted to the utilisation of propanediol and cobalamin biosynthesis. In vivo expression technology has indicated that propanediol utilisation (pdu) genes may be important for growth in host tissues, and competitive index studies with mice have shown that pdu mutations confer a virulence defect [ , ]. The pdu operon is contiguous and co-regulated with the cobalamin (B12) biosynthesis cob operon, indicating that propanediol catabolism may be the primary reason for de novo B12 synthesis in Salmonella [ , , ].
Protein Domain
Name: Photosystem I reaction centre subunit VIII superfamily
Type: Homologous_superfamily
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. This entry represents subunit VIII (PsaI) of the photosystem I (PSI) reaction centre. PSI is located, along with photosystem II (PSII), in the thylakoid photosynthetic membranes of plants, green algae and cyanobacteria. The crystal structure of PSI from the thermophilic cyanobacterium Synechococcus elongatus (Thermosynechococcus elongatus) has 12 protein subunits and 127 cofactors comprising 96 chlorophylls, 2 phylloquinones, 3 4Fe4S clusters, 22 carotenoids, 4 lipids, and a putative calcium ion [ ]. PsaI consists of a single transmembrane helix, and has a crucial role in aiding normal structural organisation of PsaL within the PSI complex; the absence of PsaI alters PsaL organisation, leading to a small, but physiologically significant, defect in PSI function []. PsaL encodes a subunit of PSI and is necessary for trimerisation of PSI. PsaL may constitute the trimer-forming domain in the structure of PSI [].
Protein Domain
Name: D-lactate dehydrogenase
Type: Family
Description: D-lactate dehydrogenases catalyse the reversible oxidation of D-lactate to pyruvate. This entry represents the NADH-independent D-lactate dehydrogenases found in many bacterial species. These are membrane-associated respiratory enzymes which contain an FAD cofactor and transfer electrons derived from substrate oxidation to the electron transfer chain. The energy derived from this reaction can be coupled either to ATP generation or the active uptake of solutes. The Escherichia coli enzyme ( ) is a peripheral membrane enzyme located on the cytoplasmic side of the inner membrane, which passes the electrons derived from substrate oxidation to the quinone component of the electron transfer chain [ ]. It is composed of three domains: an FAD-binding domain, a cap domain and a membrane-binding domain. The FAD-binding domain has a similar fold to that of other FAD-linked enzymes, being composed of two α-β subdomains, with the FAD cofactor being accommodated between them. The cap domain forms an α-β-alpha sandwich, while the membrane-binding region is composed of four alpha helices which show an excess of basic residues over acidic ones. The enzyme is thought to be anchored to the membrane by electrostatic interactions between these basic residues and the negatively charged phospholipid head groups of the membrane. The active site of the enzyme is not known, but is thought to be located close to the FAD-binding site, at the interface of all three domains. Proteins in this entry also include quinone-dependent D-lactate dehydrogenase Dld from Corynebacterium glutamicum. It is essential for growth with D-lactate as sole carbon source [ ].
Protein Domain
Name: MFS transporter superfamily
Type: Homologous_superfamily
Description: Transporters can be grouped in two classes, primary and secondary carriers. The primary active transporters drive solute accumulation or extrusion by using ATP hydrolysis, photon absorption, electron flow, substrate decarboxylation or methyl transfer. If charged molecules are unidirectionally pumped as a consequence of the consumption of a primary cellular energy source, an electrochemical potential results. This potential can then be used to drive the active transport of additional solutes via secondary carriers. Among the different transporters, the two largest families that occur ubiquitously in all classifications of organisms are the ATP-Binding Cassette (ABC) primary transporter superfamily (see ) and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients [ , ]. They function as uniporters, symporters or antiporters. In addition, their solute specificities are diverse. MFS proteins contain 12 transmembrane regions (with some variations). The 3D structure of human GLUT1, an archetype of the major facilitator superfamily, has been solved [ ]. Helices 1-5, 8, 10-12 are arranged in a 9-member barrel-like manner, delimiting a hydrophilic central channel. Helix 7 is located in the centre of the channel, suggesting a role in regulating transport of solutes through the channel. This entry represents the MFS transporter superfamily, characterized by twelve transmembrane helices. This superfamily includes, among others, the glycerol-3-phosphate transporter from Escherichia coli, which transports glycerol-3-phosphate into the cytoplasm and inorganic phosphate into the periplasm [ ], and the E. coli proton/sugar transporter lactose permease (LacY), which acts to couple lactose and H+ translocation [, ].
Protein Domain
Name: Transcription regulators MarR/SlyA-like
Type: Family
Description: Transcription regulators of the MarR/SlyA family, involved in the development of antibiotic resistance, contain the MarR-type HTH domain, which is a DNA-binding, winged helix-turn-helix (wHTH) domain of about 135 amino acids. The MarR family of transcription regulators is named after Escherichia coli MarR, a repressor of genes which activate the multiple antibiotic resistance and oxidative stress regulons, and after slyA from Salmonella typhimurium and E. coli, a transcription regulator that is required for virulence and survival in the macrophage environment. Regulators with the MarR-type HTH domain are present in bacteria and archaea and control a variety of biological functions, including resistance to multiple antibiotics, household disinfectants, organic solvents, oxidative stress agents and regulation of the virulence factor synthesis in pathogens of humans and plants. Many of the MarR-like regulators respond to aromatic compounds [ , , ]. The crystal structures of MarR, MexR and SlyA have been determined and show a winged HTH DNA-binding core flanked by helices involved in dimerisation. The DNA-binding domains are ascribed to the superfamily of winged helix proteins, containing a three (four)-helix (H) bundle and a three-stranded antiparallel β-sheet (B) in the topology: H1-(H1')-H2-B1-H3-H4-B2-B3-H5-H6. Helices 3 and 4 comprise the helix-turn-helix motif and the β-sheet is called the wing. Helix 4 is termed the recognition helix, as in other HTHs where it binds the DNA major groove. Helices 1, 5 and 6 are involved in dimerisation, as most MarR-like transcription regulators form dimers [ , ]. This is a group of HTH-type transcriptional regulators including SlyA, Hpr and MarR.
Protein Domain
Name: Tumour necrosis factor receptor 27
Type: Family
Description: The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain, a region that contains many cysteine residues arranged in a specific repetitive pattern [ ]. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences [ ]. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved [, ]. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis [ ]. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 27 (also known as ectodysplasin A2 receptor (EDA2R) and ectodysplasin receptor, X-linked (XEDAR)) is highly expressed during embryogenesis [ ], and has been implicated in the development of ectodermal appendages, such as hair follicles, teeth and sweat glands []. Although it lacks a death domain, the receptor can nevertheless induce cell death via activation of caspase 8, and may play a role in the induction of apoptosis during embryonic development and adult life []. A single partial match was also found, , a translated human cDNA sequence that fails to match motif 1.
Protein Domain
Name: Bombyxin
Type: Family
Description: The insulin family of proteins groups together several evolutionarily related active peptides [ ]: these include insulin [, ], relaxin [, ], insect prothoracicotropic hormone (bombyxin) [], insulin-like growth factors (IGF1 and IGF2) [, ], mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP) (gene INSL4), locust insulin-related peptide (LIRP), molluscan insulin-related peptides (MIP), and Caenorhabditis elegans insulin-like peptides. The 3D structures of a number of family members have been determined [, , ]. The fold comprises two polypeptide chains (A and B) linked by two disulphide bonds: all share a conserved arrangement of 4 cysteines in their A chain, the first of which is linked by a disulphide bond to the third, while the second and fourth are linked by interchain disulphide bonds to cysteines in the B chain. This group represents bombyxins. Bombyxin is a brain peptide responsible for activation of prothoracic glands to produce ecdysone in insects [ ]. When cleaved, bombyxins produce both subunits of the biologically active heterodimer. Several molecular species of bombyxin have now been identified in Bombyx, Agrius and Samia moth species. Both bombyxin and insulin contain six cysteine residues with identical distributions and are predicted to have similar tertiary structures, especially in their core regions []. They are not functionally equivalent, however. The similarity to insulin has prompted speculation that it may play a role in the regulation of growth, and it has been shown to act as a growth factor for wing imaginal discs and midgut stem cells in some lepidopteran species [, ].
Protein Domain
Name: Tumour necrosis factor receptor 16
Type: Family
Description: The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain, a region that contains many cysteine residues arranged in a specific repetitive pattern [ ]. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences [ ]. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved [ , ]. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis [ ]. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 16 (also known as nerve growth factor receptor (NGFR) and p75NTR) acts as a low affinity receptor for neurotrophins. The receptor mediates a variety of contradictory cellular functions, including cell survival or apoptosis, promotion or inhibition of axonal growth, and facilitation or attenuation of proliferation, depending on the cellular context [ ]. The receptor may also play a role in inflammation, and has been implicated in the pathogenesis of asthma []. A single partial match was also found, , a translated human cDNA sequence that fails to match motifs 1 and 2.
Protein Domain
Name: Glucose permease domain IIB
Type: Homologous_superfamily
Description: The phosphoenolpyruvate-dependent sugar phosphotransferase system (PTS) [ , ] is a major carbohydrate transport system in bacteria. The PTS catalyzes the phosphorylation of incoming sugar substrates concomitant with their translocation across the cell membrane. The general mechanism of the PTS is the following: a phosphoryl group from phosphoenolpyruvate (PEP) is transferred to enzyme-I (EI) of PTS which in turn transfers it to a phosphoryl carrier protein (HPr). Phospho-HPr then transfers the phosphoryl group to a sugar-specific permease which consists of at least three structurally distinct domains (IIA, IIB, and IIC) [] which can either be fused together in a single polypeptide chain or exist as two or three interactive chains, formerly called enzymes II (EII) and III (EIII). The first domain (IIA) carries the first permease-specific phosphorylation site, a histidine, which is phosphorylated by phospho-HPr. The second domain (IIB) is phosphorylated by phospho-IIA on a cysteinyl or histidyl residue, depending on the permease. Finally, the phosphoryl group is transferred from the IIB domain to the sugar substrate in a process catalyzed by the IIC domain; this process is coupled to the transmembrane transport of the sugar. Several PTS permease families are currently recognised, namely, the (i) glucose (including glucoside), (ii) fructose (including mannitol), (iii) lactose (including N,N-diacetylchitobiose), (iv) galactitol, (v) glucitol, (vi) mannose, and (vii) l-ascorbate families [ ]. This entry represents the component IIB of the glucose family of PTS systems (type 1). The structure of this domain has a homing endonuclease-like fold, which is composed of an α-β(2)-α-β(2)-alpha fold arranged into two layers (alpha/beta) with antiparallel sheet.
Protein Domain
Name: Relaxin
Type: Family
Description: The insulin family of proteins groups together several evolutionarily related active peptides: these include insulin, relaxin, insect prothoracicotropic hormone (bombyxin), insulin-like growth factors (IGF1 and IGF2), mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP) (gene INSL4), locust insulin-related peptide (LIRP), molluscan insulin-related peptides (MIP), and Caenorhabditis elegans insulin-like peptides. The 3D structures of a number of family members have been determined. The fold comprises two polypeptide chains (A and B) linked by two disulphide bonds: all share a conserved arrangement of 4 cysteines in their A chain, the first of which is linked by a disulphide bond to the third, while the second and fourth are linked by interchain disulphide bonds to cysteines in the B chain. Relaxin is encoded by two non-allelic genes in humans and great apes, and by a single gene in all other species studied to date. The expression of human relaxin genes (H1 and H2) has been characterised in placenta, decidua, prostate and ovary: H2 relaxin mRNA was detected in the ovary, term placenta, decidua, and prostate gland; by contrast, H1 gene expression was detected only in the prostate gland. Synthesised in the corpora lutea of ovaries during pregnancy, relaxin is released into the blood stream prior to parturition. With oestrogen, it acts to produce dilation of the birth canal in many mammals, its major biological role being to remodel the reproductive tract to facilitate the birth process.
Protein Domain
Name: L-carnitine CoA-transferase CaiB
Type: Family
Description: CoA-transferases are found in organisms from all kingdoms of life. They catalyse reversible transfer reactions of coenzyme A groups from CoA-thioesters to free acids. There are at least three families of CoA-transferases, which differ in sequence and reaction mechanism. Family I consists of CoA-transferases for 3-oxoacids, short-chain fatty acids and glutaconate. Most use succinyl-CoA or acetyl-CoA as CoA donors. Family II consists of the homodimeric alpha-subunits of citrate lyase and citramalate lyase. These enzymes catalyse the transfer of acyl carrier protein (ACP) with a covalently bound CoA derivative, but can accept free CoA thioesters as well. Family III consists of formyl-CoA:oxalate CoA-transferase, succinyl-CoA:(R)-benzylsuccinate CoA-transferase, (E)-cinnamoyl-CoA:(R)-phenyllactate CoA-transferase, succinyl-CoA:mesaconate CoA-transferase and butyrobetainyl-CoA:(R)-carnitine CoA-transferase. These CoA-transferases occur in prokaryotes and eukaryotes, and catalyse CoA-transfer reactions in a highly substrate- and stereo-specific manner. CaiB is a family III CoA-transferase that catalyzes the reversible transfer of the CoA moiety from gamma-butyrobetainyl-CoA to L-carnitine to generate L-carnitinyl-CoA and gamma-butyrobetaine. This enzyme can also catalyse the reversible transfer of the CoA moiety from gamma-butyrobetainyl-CoA or L-carnitinyl-CoA to crotonobetaine to generate crotonobetainyl-CoA. In Escherichia coli, CaiB is encoded by the caiTABCDE operon. The adjacent but divergent fixABCD operon also appears to be necessary for carnitine metabolism. CaiB is composed of two identical circular chains that together form an intertwined dimer. Each monomer consists of a large domain, containing a Rossmann fold, and a small domain.
Protein Domain
Name: Huwentoxin, conserved site-1
Type: Conserved_site
Description: Spider venoms often contain many active peptides, such as neurotoxins, lectins and enzyme inhibitors. These peptides are very important for the spider's hunting and defence. During the long history of spider evolution, the peptides evolved into different structures and functions. Despite their different biological functions, the following peptides appear to have evolved from the same ancestors and belong to the huwentoxin-1 family: Ornithoctonus huwena (Chinese bird spider) (Selenocosmia huwena) huwentoxin-I (HWTX-I), a 33 amino acid peptide, which can block the N-type high-voltage activated calcium channels; O. huwena huwentoxin-IIIa (HWTX-IIIa); O. huwena huwentoxin-IV (HWTX-IV), a 35 amino acid peptide, which is an inhibitor of tetrodotoxin (TTX) sensitive voltage-gated sodium channels; O. huwena huwentoxin-V (HWTX-V), a 35 amino acid insecticidal toxin which can reversibly paralyse insects; O. huwena huwenlectin-I (SHL-I), a 32 amino acid peptide with haemagglutination activity but almost no neurotoxin activity; Selenocosmia hainana (Chinese bird spider) Hainantoxin-I (HNTX-I), Hainantoxin-III (HNTX-III), Hainantoxin-IV (HNTX-IV) and Hainantoxin-V (HNTX-V); Brachypelma smithii (Mexican red knee tarantula) Venom protein 5; and Grammostola rosea (Chilean rose tarantula) (Grammostola spatulata) voltage sensor toxin 1. Peptides of the huwentoxin type I family contain 6 cysteine residues involved in three disulphide bonds. The three disulphide bridges have been assigned as C1-C4, C2-C5 and C3-C6. HWTX-I adopts a compact structure consisting of a small triple-stranded antiparallel β-sheet and five β-turns.
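The C1-C4, C2-C5 and C3-C6 assignment can be read as a mapping from cysteine rank to sequence position. The following is a minimal sketch of that mapping, not part of the database entry: the function name `disulphide_pairs` and the example peptide are hypothetical and chosen only for illustration.

```python
def disulphide_pairs(seq, connectivity=((1, 4), (2, 5), (3, 6))):
    """Map a cysteine connectivity pattern onto 1-based sequence positions.

    `connectivity` lists pairs of cysteine ranks (1 = first Cys from the
    N terminus). The default encodes C1-C4, C2-C5 and C3-C6.
    """
    # 1-based positions of every cysteine in the peptide
    cys = [i + 1 for i, aa in enumerate(seq.upper()) if aa == "C"]
    needed = max(rank for pair in connectivity for rank in pair)
    if len(cys) != needed:
        raise ValueError(f"expected {needed} cysteines, found {len(cys)}")
    return [(cys[a - 1], cys[b - 1]) for a, b in connectivity]


# Hypothetical peptide with six cysteines (not a real toxin sequence)
example = "AACAAACAACCAAACAAAC"
print(disulphide_pairs(example))
```

For the hypothetical peptide above, the cysteines sit at positions 3, 7, 10, 11, 15 and 19, so the pairs are (3, 11), (7, 15) and (10, 19).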
Protein Domain
Name: RNA 3'-terminal phosphate cyclase domain superfamily
Type: Homologous_superfamily
Description: RNA cyclases are a family of RNA-modifying enzymes that are conserved in eukaryotes, bacteria and archaea. Type 1 RNA 3'-terminal phosphate cyclases catalyse the conversion of 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of RNA: ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate. The physiological function of the cyclase is not known, but the enzyme could be involved in the maintenance of cyclic ends in tRNA splicing intermediates or in the cyclisation of the 3' end of U6 snRNA. A second subfamily of RNA 3'-terminal phosphate cyclases (type 2) that do not have cyclase activity has been identified in eukaryotes. They are localised to the nucleolus and are involved in ribosomal modification. The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin. Although the active site of this enzyme has not been unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources.
Protein Domain
Name: Photosystem I reaction centre subunit VIII
Type: Family
Description: Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. This entry represents subunit VIII (PsaI) of the photosystem I (PSI) reaction centre. PSI is located, along with photosystem II (PSII), in the thylakoid photosynthetic membranes of plants, green algae and cyanobacteria. The crystal structure of PSI from the thermophilic cyanobacterium Synechococcus elongatus (Thermosynechococcus elongatus) has 12 protein subunits and 127 cofactors comprising 96 chlorophylls, 2 phylloquinones, 3 4Fe4S clusters, 22 carotenoids, 4 lipids, and a putative calcium ion. PsaI consists of a single transmembrane helix, and has a crucial role in aiding normal structural organisation of PsaL within the PSI complex. The absence of PsaI alters PsaL organisation, leading to a small, but physiologically significant, defect in PSI function. PsaL encodes a subunit of PSI and is necessary for trimerisation of PSI. PsaL may constitute the trimer-forming domain in the structure of PSI.
Protein Domain
Name: Peptidase M36, fungalysin
Type: Family
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belong to MEROPS peptidase family M36 (fungalysin family, clan MA(E)). The predicted active site residues for members of this family and thermolysin, the type example for clan MA, occur in the motif HEXXH. Fungalysin is produced by fungi, Aspergillus and other species, to aid degradation of host lung cell walls on infection. The enzyme is a 42 kDa single chain protein, with a pH optimum of 7.5-8.0 and an optimal temperature of 60 °C.
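The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be illustrated with two regular expressions. This is a sketch under stated assumptions, not a curated signature: the text does not enumerate the residue classes, so the sets chosen below for 'b' (uncharged) and 'c' (hydrophobic), the restriction of 'a' to valine or threonine, and the exclusion of proline from every variable position are illustrative choices. The peptide strings are hypothetical.

```python
import re

# Loose motif: His-Glu-any-any-His
HEXXH = re.compile(r"HE..H")

# Residue classes below are ASSUMPTIONS made for illustration only.
A = "[VT]"                    # 'a': most often valine or threonine
B = "[ACFGILMNQSTVWY]"        # 'b': uncharged (D, E, K, R, H excluded), no proline
C = "[AVLIMFWY]"              # 'c': hydrophobic
X = "[ACDEFGHIKLMNQRSTVWY]"   # 'X': any residue except proline

# Stricter ten-residue pattern a-b-X-H-E-b-b-H-b-c
STRICT = re.compile(f"{A}{B}{X}HE{B}{B}H{B}{C}")


def find_motifs(seq, pattern):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical peptides, used only to exercise the patterns
print(find_motifs("MKGDVVAHELTHAVTDYK", HEXXH))
print(find_motifs("MKGDVVAHELTHAVTDYK", STRICT))
print(find_motifs("AAHEPPHAA", HEXXH))
print(find_motifs("AAHEPPHAA", STRICT))
```

The second peptide matches the loose motif but not the strict one, which shows how the stricter definition reduces chance hits to a common five-residue pattern.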
Protein Domain
Name: Ketol-acid reductoisomerase, N-terminal
Type: Domain
Description: Ketol-acid reductoisomerase (KARI), also known as acetohydroxy acid isomeroreductase (AHIR or AHAIR), catalyzes the conversion of acetohydroxy acids into dihydroxy valerates in the second step of the biosynthetic pathway for the essential branched-chain amino acids valine, leucine, and isoleucine. KARI catalyzes an unusual two-step reaction consisting of an alkyl migration in which the substrate, either 2-acetolactate (AL) or 2-aceto-2-hydroxybutyrate (AHB), is converted to 3-hydroxy-3-methyl-2-oxobutyrate or 3-hydroxy-3-methyl-2-oxopentanoate, followed by a NADPH-dependent reduction to give 2,3-dihydroxy-3-isovalerate or 2,3-dihydroxy-3-methylvalerate respectively. KARI is present in bacteria, fungi, and plants, but not in animals. KARIs are divided into two classes on the basis of sequence length and oligomerization state. Class I KARIs are ~340 amino acid residues in length and include all fungal KARIs, whereas class II KARIs are ~490 residues long and include all plant KARIs. Bacterial KARIs can be either class I or class II. KARIs are composed of two types of domains, an N-terminal Rossmann fold domain and one or two C-terminal knotted domains. Two intertwined knotted domains are required for function, and in the short-chain or class I KARIs, each polypeptide chain has one knotted domain. As a result, dimerization of two monomers forms two complete KARI active sites. In the long-chain or class II KARIs, a duplication of the knotted domain has occurred and, as a result, the protein does not require dimerization to complete its active site. The alpha/beta KARI N-terminal Rossmann fold domain consists of a nine-stranded mixed β-sheet with flanking α-helices on both sides of the β-sheet.
Protein Domain
Name: Acetylglutamate kinase ArgB, GNAT domain-containing
Type: Family
Description: N-Acetylglutamate (NAG) fulfils distinct biological roles in lower and higher organisms. In prokaryotes, lower eukaryotes and plants it is the first intermediate in the biosynthesis of arginine, whereas in ureotelic (excreting nitrogen mostly in the form of urea) vertebrates, it is an essential allosteric cofactor for carbamyl phosphate synthetase I (CPSI), the first enzyme of the urea cycle. The pathway that leads from glutamate to arginine in lower organisms employs eight steps, starting with the acetylation of glutamate to form NAG. In these species, NAG can be produced by two enzymatic reactions: one catalysed by NAG synthase (NAGS) and the other by ornithine acetyltransferase (OAT). In ureotelic species, NAG is produced exclusively by NAGS. In lower organisms, NAGS is feedback-inhibited by L-arginine, whereas mammalian NAGS activity is significantly enhanced by this amino acid. The NAGS genes of bacteria, fungi and mammals are more diverse than other arginine-biosynthesis and urea-cycle genes. The evolutionary relationship between the distinctly different roles of NAG and its metabolism in lower and higher organisms remains to be determined. The pathway from glutamate to arginine is: NAGS, N-acetylglutamate synthase (glutamate to N-acetylglutamate); NAGK, N-acetylglutamate kinase (N-acetylglutamate to N-acetylglutamate-5P); N-acetyl-gamma-glutamyl-phosphate reductase (N-acetylglutamate-5P to N-acetylglutamate semialdehyde); Acetylornithine aminotransferase (N-acetylglutamate semialdehyde to N-acetylornithine); Acetylornithine deacetylase (N-acetylornithine to ornithine); Arginase (ornithine to arginine). This entry represents N-acetylglutamate kinase (NAGK) with a C-terminal GNAT domain. The majority of proteins in this entry are from bacteria, including argB from Xylella fastidiosa (UniProt:Q9PEM7).
Protein Domain
Name: Conotoxin, mu-type
Type: Family
Description: This entry represents Mu-type conotoxins. Cone snail toxins, conotoxins, are small peptides with disulphide connectivity that target ion channels or G-protein coupled receptors. Based on the number and pattern of disulphide bonds and biological activities, conotoxins can be classified into several families. Omega, delta and kappa families of conotoxins have a knottin or inhibitor cystine knot scaffold. The knottin scaffold is a very special disulphide-through-disulphide knot, in which the III-VI disulphide bond crosses the macrocycle formed by two other disulphide bonds (I-IV and II-V) and the interconnecting backbone segments, where I-VI indicates the six cysteine residues starting from the N terminus. The disulphide bonding network, as well as specific amino acids in inter-cysteine loops, provides the specificity of conotoxins. The cysteine arrangement is the same for omega, delta and kappa families, but omega conotoxins are calcium channel blockers, whereas delta conotoxins delay the inactivation of sodium channels and kappa conotoxins are potassium channel blockers. Mu conotoxins have two types of cysteine arrangement, but the knottin scaffold is not observed. Conotoxin gm9a, a putative 27-residue polypeptide encoded by Conus gloriamaris, has been shown to adopt an inhibitory cystine knot motif constrained by three disulphide bonds. Mu conotoxins target the voltage-gated sodium channels, preferentially those of skeletal muscle, and are useful probes for investigating voltage-dependent sodium channels of excitable tissues. They differ at the level of primary structure, which also affects their three-dimensional structures, explaining their sodium channel subtype specificities.
Protein Domain
Name: Quinohemoprotein amine dehydrogenase, alpha subunit
Type: Family
Description: Quinohemoprotein amine dehydrogenases (QHNDH) are enzymes produced in the periplasmic space of certain Gram-negative bacteria, such as Paracoccus denitrificans and Pseudomonas putida, in response to primary amines, including n-butylamine and benzylamine. QHNDH catalyses the oxidative deamination of a wide range of aliphatic and aromatic amines through formation of a Schiff-base intermediate involving one of the quinone O atoms. Catalysis requires the presence of a novel redox cofactor, cysteine tryptophylquinone (CTQ). CTQ is derived from the post-translational modification of specific residues, which involves the oxidation of the indole ring of a tryptophan residue to form tryptophylquinone, followed by covalent cross-linking with a cysteine residue. There is one CTQ per subunit in QHNDH. In addition to CTQ, two haem c cofactors are present in QHNDH that mediate the transfer of the substrate-derived electrons from CTQ to an external electron acceptor, cytochrome c-550. QHNDH is a heterotrimer of alpha, beta and gamma subunits. The alpha and beta subunits contain signal peptides necessary for the translocation of QHNDH to the periplasm. The alpha subunit is composed of four domains: domain 1 forms a dihaem cytochrome, and domains 2-4 form antiparallel β-barrel structures. The beta subunit is a 7-bladed β-propeller that provides part of the active site. The small, catalytic gamma subunit contains the novel cross-linked CTQ cofactor, as well as additional thioester cross-links between Cys and Asp/Glu residues that encage CTQ. The gamma subunit assumes a globular secondary structure with two short α-helices having many turns and bends. This entry represents the QHNDH alpha subunit.
Protein Domain
Name: Quinohemoprotein amine dehydrogenase, beta subunit
Type: Family
Description: Quinohemoprotein amine dehydrogenases (QHNDH) are enzymes produced in the periplasmic space of certain Gram-negative bacteria, such as Paracoccus denitrificans and Pseudomonas putida, in response to primary amines, including n-butylamine and benzylamine. QHNDH catalyses the oxidative deamination of a wide range of aliphatic and aromatic amines through formation of a Schiff-base intermediate involving one of the quinone O atoms. Catalysis requires the presence of a novel redox cofactor, cysteine tryptophylquinone (CTQ). CTQ is derived from the post-translational modification of specific residues, which involves the oxidation of the indole ring of a tryptophan residue to form tryptophylquinone, followed by covalent cross-linking with a cysteine residue. There is one CTQ per subunit in QHNDH. In addition to CTQ, two haem c cofactors are present in QHNDH that mediate the transfer of the substrate-derived electrons from CTQ to an external electron acceptor, cytochrome c-550. QHNDH is a heterotrimer of alpha, beta and gamma subunits. The alpha and beta subunits contain signal peptides necessary for the translocation of QHNDH to the periplasm. The alpha subunit is composed of four domains: domain 1 forms a dihaem cytochrome, and domains 2-4 form antiparallel β-barrel structures. The beta subunit is a 7-bladed β-propeller that provides part of the active site. The small, catalytic gamma subunit contains the novel cross-linked CTQ cofactor, as well as additional thioester cross-links between Cys and Asp/Glu residues that encage CTQ. The gamma subunit assumes a globular secondary structure with two short α-helices having many turns and bends. This entry represents the QHNDH beta subunit.
Protein Domain
Name: Chorismate synthase AroC superfamily
Type: Homologous_superfamily
Description: The chorismate synthase AroC consists of two DCoH-like β(2)-α-β(2)-α structural repeats. Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; E.C. 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway, which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate, which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues. Depending on the capacity of these enzymes to regenerate the reduced form of FMN, chorismate synthases are divided into two groups: enzymes, mostly from plants and eubacteria, that sequester CS from the cellular environment, are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as found in fungi and the ciliated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapy against infectious diseases such as tuberculosis (TB).
Protein Domain
Name: Homoaconitase/3-isopropylmalate dehydratase, large subunit, prokaryotic
Type: Family
Description: 3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus. It is also found in the higher plant Arabidopsis thaliana, where it is targeted to the chloroplast. This entry represents the large subunit of 3-isopropylmalate dehydratase (LeuC), as well as homoaconitase enzymes, from prokaryotes. Homoaconitase, aconitase and 3-isopropylmalate dehydratase have similar overall structures and domain organisation. All are dehydratases that bind a [4Fe-4S] cluster.
Protein Domain
Name: Tetrapyrrole methylase superfamily
Type: Homologous_superfamily
Description: Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include: Uroporphyrinogen III methyltransferase (SUMT), which catalyses the conversion of uroporphyrinogen III to precorrin-2 at the first branch-point of the tetrapyrrole synthesis pathway, directing the pathway towards cobalamin or sirohaem synthesis; Precorrin-2 C20-methyltransferase CobI/CbiL, which introduces a methyl group at C-20 on precorrin-2 to produce precorrin-3A during cobalamin biosynthesis (this reaction is key to the conversion of a porphyrin-type tetrapyrrole ring to a corrin ring, and in some species this enzyme is part of a bifunctional protein); Precorrin-4 C11-methyltransferase CobM/CbiF, which introduces a methyl group at C-11 on precorrin-4 to produce precorrin-5 during cobalamin biosynthesis; Sirohaem synthase CysG, domains 4 and 5, which synthesizes sirohaem from uroporphyrinogen III, at the first branch-point in the tetrapyrrole biosynthetic pathway, directing the pathway towards sirohaem synthesis; and Diphthine synthase, which carries out the methylation step during the modification of a specific histidine residue of elongation factor 2 (EF-2) during diphthine synthesis. This entry represents a tetrapyrrole methylase domain, which consists of two non-similar subdomains. The first domain has a parallel sheet of five strands, while the second domain has a mixed sheet of five strands, with strands 4 and 5 antiparallel to the rest.
Protein Domain
Name: Nitric oxide synthase, N-terminal
Type: Domain
Description: This entry represents the N-terminal region of the nitric oxide synthases. Nitric oxide synthase (NOS) enzymes produce nitric oxide (NO) by catalysing a five-electron oxidation of a guanidino nitrogen of L-arginine (L-Arg). Oxidation of L-Arg to L-citrulline occurs via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine as an intermediate. 2 mol of O(2) and 1.5 mol of NADPH are consumed per mole of NO formed. Arginine-derived NO synthesis has been identified in mammals, fish, birds, invertebrates, plants, and bacteria. Best studied are mammals, where three distinct genes encode NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). iNOS and nNOS are soluble and found predominantly in the cytosol, while eNOS is membrane associated. The enzymes exist as homodimers, each monomer consisting of two major domains: an N-terminal oxygenase domain, which belongs to the class of haem-thiolate proteins, and a C-terminal reductase domain, which is homologous to NADPH:P450 reductase. The interdomain linker between the oxygenase and reductase domains contains a calmodulin (CaM)-binding sequence. NOSs are the only enzymes known to simultaneously require five bound cofactors; animal NOS isozymes are catalytically self-sufficient. The electron flow in the NO synthase reaction is: NADPH --> FAD --> FMN --> haem --> O(2). eNOS localisation to endothelial membranes is mediated by cotranslational N-terminal myristoylation and post-translational palmitoylation. The subcellular localisation of nNOS in skeletal muscle is mediated by anchoring of nNOS to dystrophin. nNOS contains an additional N-terminal domain, the PDZ domain. Some bacteria, like Bacillus halodurans, Bacillus subtilis or Deinococcus radiodurans, contain homologues of the NOS oxygenase domain.
Protein Domain
Name: Phenylalanine-tRNA ligase alpha chain 1, bacterial
Type: Family
Description: Phenylalanine-tRNA ligase is a tetramer of two alpha and two beta chains. This family represents the type 1 alpha chain from bacteria. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
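The class assignments listed in the description can be held in a small lookup table. The sketch below simply transcribes those lists; the names `CLASS_I`, `CLASS_II`, `BOTH_CLASSES` and `synthetase_classes` are hypothetical, and lysine is kept separate because the text places some lysine synthetases in each class.

```python
# Class assignments transcribed from the description above
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "phenylalanine", "proline", "serine", "threonine",
}
# Some lysine synthetases fall in class I and others in class II
BOTH_CLASSES = {"lysine"}


def synthetase_classes(amino_acid):
    """Return the set of synthetase classes reported for an amino acid."""
    name = amino_acid.strip().lower()
    if name in BOTH_CLASSES:
        return {"I", "II"}
    if name in CLASS_I:
        return {"I"}
    if name in CLASS_II:
        return {"II"}
    raise ValueError(f"unknown amino acid: {amino_acid!r}")


print(synthetase_classes("Phenylalanine"))
print(synthetase_classes("lysine"))
```

Together the three sets cover all 20 amino acids, with phenylalanine falling in class II.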
Protein Domain
Name: Phenylalanine-tRNA ligase alpha subunit, bacterial/archaeal
Type: Family
Description: Phenylalanine-tRNA ligase is a tetramer of two alpha and two beta subunits. This entry represents the type 2 alpha subunit from bacteria and archaea. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: Phenylalanine-tRNA ligase beta chain 2, archaeal type
Type: Family
Description: Phenylalanine-tRNA ligase (also known as Phenylalanyl-tRNA synthetase) is a tetramer of two alpha and two beta chains. This entry represents the type 2 beta chain from bacteria and archaea. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group), belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: RNA polymerase sigma factor 54
Type: Family
Description: Sigma factors are bacterial transcription initiation factors that promote the attachment of the core RNA polymerase to specific initiation sites and are then released. They alter the specificity of promoter recognition. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA), direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes: the sigma-54 and sigma-70 families. The sigma-70 family has many different sigma factors (see the relevant entry). The sigma-54 family consists exclusively of the sigma-54 factor, which is required for the transcription of promoters that have a characteristic -24 and -12 consensus recognition element but which are devoid of the typical -10, -35 sequences recognised by the major sigma factors. The sigma-54 factor is also characterised by its interaction with ATP-dependent positive regulatory proteins that bind to upstream activating sequences. Structurally, sigma-54 factors consist of three distinct regions: (1) a relatively well conserved N-terminal glutamine-rich region of about 50 residues that contains a potential leucine zipper motif; (2) a region of variable length which is not well conserved; and (3) a well conserved C-terminal region of about 350 residues that contains a second potential leucine zipper, a potential DNA-binding 'helix-turn-helix' motif and a perfectly conserved octapeptide whose function is not known.
Protein Domain
Name: (+) RNA virus helicase core domain
Type: Domain
Description: Helicases have been classified in 6 superfamilies (SF1-SF6). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. The two largest superfamilies, commonly referred to as SF1 and SF2, share similar patterns of seven conserved sequence motifs, some of which are separated by long poorly conserved spacers. Helicase motifs appear to be organised in a core domain which provides the catalytic function, whereas optional inserts and amino- and carboxy-terminal sequences may comprise distinct domains with diverse accessory roles. The helicase core contains two structural domains, an N-terminal ATP-binding domain and a C-terminal domain. Putative SF1 helicases are extremely widespread among positive-stranded (+)RNA viruses. They have been identified in a variety of plant virus families, as well as alpha-, rubi-, arteri-, hepatitis E, and coronaviruses. A number of these viral enzymes have been implicated in diverse aspects of transcription and replication but also in RNA stability and cell-to-cell movement. The (+)RNA virus helicase core contains two RecA-like α/β domains. The N-terminal ATP-binding domain contains a parallel six-stranded β-sheet surrounded by four helices on one side and two helices on the other. The C-terminal domain contains a parallel four-stranded β-sheet sandwiched between two helices on each of its sides. The (+)RNA virus helicase core is likely to bind NTP in the cleft between the N terminus of the ATP-binding domain and the beginning of the C-terminal domain. This entry represents the (+)RNA virus helicase core domain.
Protein Domain
Name: Leucine-tRNA ligase, archaeal
Type: Family
Description: Leucine-tRNA ligase is an alpha monomer that belongs to class Ia. This entry represents a group of leucine-tRNA ligases found only in Archaea. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: Fructose-1,6-bisphosphatase class 2/Sedoheptulose-1,7-bisphosphatase
Type: Family
Description: Gluconeogenesis is an important metabolic pathway, which produces glucose from noncarbohydrate precursors such as organic acids, fatty acids, amino acids, or glycerol. Fructose-1,6-bisphosphatase, a key enzyme of gluconeogenesis, is found in all organisms, and five different classes of these enzymes have been identified. This entry represents the class 2 fructose-1,6-bisphosphatases, which include GlpX and YggF of Escherichia coli (strain K12), which show different catalytic properties. The crystal structure of GlpX has been determined in a free state and in complex with a substrate (fructose 1,6-bisphosphate) or inhibitor (phosphate). The crystal structure of the ligand-free GlpX revealed a compact, globular shape with two α/β-sandwich domains. The core fold of GlpX is structurally similar to that of Li+-sensitive phosphatases, suggesting that they have a common evolutionary origin and catalytic mechanism. The structure of the GlpX complex with fructose 1,6-bisphosphate revealed that the active site is located between two domains and accommodates several conserved residues coordinating two metal ions and the substrate. A third metal ion is bound to phosphate 6 of the substrate. Inorganic phosphate strongly inhibited the activity of both GlpX and YggF, and the crystal structure of the GlpX complex with phosphate demonstrated that the inhibitor molecule binds to the active site. Alanine replacement mutagenesis of GlpX identified 12 conserved residues important for activity and suggested that Thr(90) is the primary catalytic residue. A number of the proteins in this entry, particularly those from algae, are bifunctional and can catalyse the hydrolysis of fructose 1,6-bisphosphate and sedoheptulose 1,7-bisphosphate to fructose 6-phosphate and sedoheptulose 7-phosphate, respectively.
Protein Domain
Name: Signal peptide peptidase A-like, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of Signal peptide peptidase A (SppA) from Escherichia coli and other members of the Peptidase S49 family (protease IV family, clan S in MEROPS) from cellular organisms. Signal peptide peptidase A is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. This entry includes S49 family members with either a single domain (sometimes referred to as 36K type), such as Escherichia coli SohB peptidase and archaeal protease MJ0651, or an N-terminal domain in addition to the C-terminal protease domain that is conserved in all the S49 family members (sometimes referred to as 67K type), similar to E. coli and Arabidopsis thaliana SppA peptidases. Unlike the eukaryotic functional homologues that are proposed to be aspartic proteases, site-directed mutagenesis and sequence analysis have shown that members in this subfamily, mostly bacterial, are serine proteases. The predicted active site serine for members in this entry occurs in a transmembrane domain. Mutagenesis studies also suggest that the catalytic centre comprises a Ser-Lys dyad (both residues absolutely conserved within bacteria, chloroplast and mitochondrial signal peptidase family members) and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. Interestingly, the single membrane spanning E. coli SppA carries out catalysis using a Ser-Lys dyad with the serine located in the C-terminal protease domain and the lysine in the N-terminal domain. The N- and C-terminal halves of SppA in E. coli are tandem repeats.
Protein Domain
Name: ADP-specific phosphofructokinase/glucokinase
Type: Family
Description: Although ATP is the most common phosphoryl group donor for kinases, certain hyperthermophilic archaea, such as Thermococcus litoralis and Pyrococcus furiosus, utilise unusual ADP-dependent glucokinases (ADPGKs) and phosphofructokinases (ADPPFKs) in their glycolytic pathways. ADPGKs and ADPPFKs exhibit significant similarity, and form an ADP-dependent kinase (ADPK) family, which was tentatively named the PFKC family. A ~460-residue ADPK domain is also found in a bifunctional ADP-dependent gluco/phosphofructo-kinase (ADP-GK/PFK) from Methanocaldococcus jannaschii (Methanococcus jannaschii) as well as in homologous proteins present in several eukaryotes. Structure determination for eukaryotic ADPGK revealed an overall similar fold to archaeal orthologues with some differences in secondary structural elements. In the nucleotide-binding loop of eukaryotic ADPGK there is a disulfide bond between conserved cysteines; one of the cysteines coordinating the AMP defines an apparent nucleotide-binding motif unique to eukaryotic ADPGKs. Mammalian enzymes are specific for glucose. The whole structure of the ADPK domain can be divided into large and small α/β subdomains. The larger subdomain, which carries the ADP binding site, consists of a twisted 12-stranded β-sheet flanked on both faces by 13 α-helices and three 3(10) helices, forming an α/β three-layer sandwich. The smaller subdomain, which covers the active site, forms an α/β two-layer structure containing five β-strands and four α-helices. The ADP molecule is buried in a shallow pocket in the large subdomain. The binding of substrate sugar induces a structural change, the small domain closing to form a complete substrate sugar binding site.
Protein Domain
Name: Aldo-keto reductase family 1 member C
Type: Family
Description: This entry represents aldo-keto reductase family 1 member C (AKR1C), including aldehyde reductase (ALR) AKR1C1-5 from mammals. The AKR1C family belongs to the aldo-keto reductase (AKR) family. ALR, also known as ALDR1, catalyses the NADPH-dependent reduction of a variety of aromatic and aliphatic aldehydes to their corresponding alcohols. In vitro substrates include succinic semialdehyde, 4-nitrobenzaldehyde, 1,2-naphthoquinone, methylglyoxal, and D-glucuronic acid. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi-bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 1 member D1
Type: Family
Description: This entry represents aldo-keto reductase family 1 member D1 (AKR1D1). It catalyses the stereospecific NADPH-dependent reduction of the C4-C5 double bond of bile acid intermediates and steroid hormones carrying a delta-4-3-one structure to yield an A/B cis-ring junction. This cis-configuration is crucial for bile acid biosynthesis and plays important roles in steroid metabolism. It is capable of reducing a broad range of delta-4-3-ketosteroids from C18 (such as 17beta-hydroxyestr-4-en-3-one) to C27 (such as 7alpha-hydroxycholest-4-en-3-one). In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi-bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
Protein Domain
Name: Aldo-keto reductase family 3B
Type: Family
Description: This entry represents aldo-keto reductase family 3B (AKR3B), including NADPH-dependent aldehyde reductase 1 (ARI) from Sporidiobolus salmonicolor and D/L-glyceraldehyde reductase (GLD1) from Hypocrea jecorina. Sporidiobolus salmonicolor ARI catalyses the asymmetric reduction of aliphatic and aromatic aldehydes and ketones to an R-enantiomer. It reduces ethyl 4-chloro-3-oxobutanoate to ethyl (R)-4-chloro-3-hydroxybutanoate. GLD1 mediates the conversion of L-glyceraldehyde to glycerol in the D-galacturonate catabolic process. In general, the aldo-keto reductase (AKR) protein superfamily members reduce carbonyl substrates such as sugar aldehydes, keto-steroids, keto-prostaglandins, retinals, quinones, and lipid peroxidation by-products. However, there are some exceptions, such as the reduction of steroid double bonds catalysed by AKR1D enzymes (5beta-reductases) and the oxidation of proximate carcinogen trans-dihydrodiol polycyclic aromatic hydrocarbons, while the beta-subunits of potassium gated ion channels (AKR6 family) control Kv channel opening. Structurally, they contain an (alpha/beta)8-barrel motif, display large loops at the back of the barrel which govern substrate specificity, and have a conserved cofactor binding domain. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones. They follow an ordered bi-bi kinetic mechanism in which the NAD(P)H cofactor binds first and leaves last. Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom