Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported: e.g. dros* for partial matches or fly AND NOT embryo to exclude a term

Search results 2301 to 2400 out of 38750 for *

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: LanC-like protein, eukaryotic
Type: Family
Description: The LanC-like protein superfamily encompasses a highly divergent group of peptide-modifying enzymes, including the eukaryotic and bacterial lanthionine synthetase C-like proteins (LanC); subtilin biosynthesis protein SpaC from Bacillus subtilis; epidermin biosynthesis protein EpiC from Staphylococcus epidermidis; nisin biosynthesis protein NisC from Lactococcus lactis; GCR2 from Arabidopsis thaliana (Mouse-ear cress); and many others. The 3D structure of the lantibiotic cyclase from L. lactis has been determined by X-ray crystallography to 2.5 Å resolution. The globular structure is characterised by an all-α fold, in which an outer ring of helices envelops an inner toroid composed of 7 shorter, hydrophobic helices. This 7-fold hydrophobic periodicity has led several authors to claim various members of the family, including eukaryotic LanC-1 and GCR2, to be novel G protein-coupled receptors; some of these claims have since been corrected. The eukaryotic lanthionine synthetase C-like proteins 1-3 are relatives of the bacterial lanthionine synthetase components C (LanC). They are ubiquitous in nature, being variously expressed in brain, spinal cord, pituitary gland, kidney, heart, skeletal muscle, pancreas, ovary and testis. LanC-like protein 1 is a glutathione-binding protein. LanC-like protein 2 is encoded by a bystander gene that is co-amplified and overexpressed with epidermal growth factor receptor (EGFR) in 20% of glioblastomas; its exogenous expression in a sarcoma cell line decreases the expression of ABCB1 (P-glycoprotein 1) and increases cellular sensitivity to the anticancer drug adriamycin.
Protein Domain
Name: Plastid lipid-associated protein/fibrillin conserved domain
Type: Domain
Description: This entry represents a conserved domain found in a number of plastid lipid-associated proteins (PAPs), including Fibrillin-5 (FBN5) from Arabidopsis and CHRC from Oncidium hybrid cultivar. Fibrillins accumulate in chromoplasts and sequester carotenoids during the development of flowers and fruits. FBN5 has been shown to be involved in PQ-9 (Plastoquinone-9) biosynthesis in Arabidopsis and rice. CHRC is a corolla-specific carotenoid-associated protein and a major component of carotenoid-lipoprotein complexes in Cucumis sativus chromoplasts.
Protein Domain
Name: Phospholipase A2
Type: Family
Description: Phospholipase A2 (PLA2) is a small lipolytic enzyme that releases fatty acids from the second carbon group of glycerol. It is involved in a number of physiologically important cellular processes, such as the liberation of arachidonic acid from membrane phospholipids. It plays a pivotal role in the biosynthesis of prostaglandin and other mediators of inflammation. PLA2 has four to seven disulphide bonds and binds a calcium ion that is essential for activity. Within the active enzyme, the alpha amino group is involved in a conserved hydrogen-bonding network linking the N-terminal region to the active site. The side chains of two conserved residues, His and Asp, participate in the catalytic network. Many PLA2s are widely distributed in snakes, lizards, bees and mammals. In mammals, there are at least four forms: pancreatic, membrane-associated, and two less well characterised forms. The venom of most snakes contains multiple forms of PLA2. Some of them are presynaptic neurotoxins that inhibit neuromuscular transmission by blocking acetylcholine release from the nerve termini. Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Api m 1.
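The allergen designation rule described above (three genus letters, a space, one species letter, a space, an Arabic number) can be illustrated with a minimal Python sketch. The function name and the capitalisation handling are assumptions made for illustration, modelled on the "Api m 1" example; the disambiguation step for clashing designations is not implemented.

```python
def allergen_designation(genus, species, number):
    """Build a designation: first three genus letters, first species letter, number.

    Hypothetical helper for illustration only. It does not handle the case
    where two species would receive identical designations.
    """
    if len(genus) < 3 or not species:
        raise ValueError("need at least three genus letters and a species name")
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


# Honeybee (Apis mellifera) allergen number 1.
print(allergen_designation("Apis", "mellifera", 1))
```

For the honeybee, Apis mellifera, this reproduces the designation listed for this family.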
Protein Domain
Name: Peptidase M17, leucyl aminopeptidase, C-terminal
Type: Domain
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases. This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine). Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids. The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded β-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 α-helices. An α-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped β-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded β-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain.
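The basic HEXXH motif mentioned in the description above (His, Glu, any two residues, His) can be located in a one-letter protein sequence with a regular expression. The following is a minimal sketch: the function name and the toy sequence are hypothetical, and the stricter 'abXHEbbHbc' form is not attempted because it would require choosing residue classes that the text does not spell out.

```python
import re

# HEXXH in one-letter amino acid codes: H, E, any two residues, H.
# The lookahead allows overlapping occurrences to be reported too.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(sequence):
    """Return (0-based start, matched residues) for each HEXXH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy = "MKTAVHELTHAVGL"
for start, hit in find_hexxh(toy):
    print(start, hit)
```

A hit from this pattern only flags a candidate site; as the description notes, the motif is relatively common, so a match alone does not show that a protein is a metalloprotease.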
Protein Domain
Name: Phosphatidylinositol 3-/4-kinase, catalytic domain
Type: Domain
Description: Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase (PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3)) function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases. The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3/PI4KK has the typical bilobal structure that is seen in other ATP-dependent kinases, divided into an N-lobe and a C-lobe. The core of this domain is the most conserved region of the PI3Ks. The N-lobe contains a five-stranded antiparallel β-sheet core flanked by a helical hairpin on one side and a helix on the other. The C-lobe contains helices, which form a helical bundle together with the N-lobe helix. The helical bundle is flanked by three β-strands and a helix. Three loops are related to kinase activity: the glycine-rich G-loop, the catalytic loop and the activation loop. The G-loop has been reported to bind to the phosphate group of nucleotides. This domain is also found in a number of pseudokinases, where a lack of typical motifs at the catalytic site suggests a lack of kinase activity.
Protein Domain
Name: PIK-related kinase
Type: Domain
Description: Phosphatidylinositol kinase (PIK)-related kinases participate in meiotic and V(D)J recombination, chromosome maintenance and repair, cell cycle progression, and cell cycle checkpoints, and their dysfunction can result in a range of diseases, including immunodeficiency, neurological disorder and cancer. The catalytic kinase domain is highly homologous to that of phosphatidylinositol 3- and 4-kinases. Nevertheless, members of the PIK-related family appear functionally distinct, as none of them has been shown to phosphorylate lipids, such as phosphatidylinositol; instead, many have Ser/Thr protein kinase activity. The PI-kinase domain of members of the PIK-related family is wedged between the ~550-amino acid-long FAT (FRAP, ATM, TRRAP) domain and the ~35 residue C-terminal FATC domain. It has been proposed that the FAT domain could be of importance as a structural scaffold or as a protein-binding domain, or both.
Protein Domain
Name: NADH-ubiquinone oxidoreductase, subunit 10
Type: Family
Description: NADH-ubiquinone oxidoreductase subunit 10 (NDUFB10) is a member of a family of conserved proteins of up to 180 residues. It is one of the 41 protein subunits within the hydrophobic fraction of the NADH:ubiquinone oxidoreductase (complex I), a multiprotein complex located in the inner mitochondrial membrane whose main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. NDUFB10 is encoded in the nucleus.
Protein Domain
Name: Rhodanese-like domain
Type: Domain
Description: Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins, including: the Cdc25 phosphatase catalytic domain; non-catalytic domains of eukaryotic dual-specificity MAPK-phosphatases; non-catalytic domains of yeast PTP-type MAPK-phosphatases; non-catalytic domains of yeast Ubp4, Ubp5 and Ubp7; non-catalytic domains of mammalian Ubp-Y; Drosophila heat shock protein HSP-67BB; several bacterial cold-shock and phage shock proteins; plant senescence-associated proteins; and catalytic and non-catalytic domains of rhodanese. Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.
Protein Domain
Name: Endo-1,3(4)-beta-glucanase
Type: Family
Description: This is a family of endo-beta-1,3(4)-glucanases belonging to glycoside hydrolase family 81 that are also known as glucan endo-1,3-beta-D-glucosidases. Proteins in this entry include fission yeast Eng1/Eng2 and budding yeast Dse4 (also known as Eng1). They have been shown to hydrolyse linear beta-1,3-glucan chains. Eng1 is required for the degradation of the primary septum after completion of cytokinesis. Eng2 may couple the endocytic coat to the actin module. This entry also includes some uncharacterised proteins from plants and bacteria. O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) website.
Protein Domain
Name: Cys/Met metabolism, pyridoxal phosphate-dependent enzyme
Type: Family
Description: Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). Pyridoxal 5'-phosphate (PLP) is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy. PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the ε-amino group of an active site lysine residue on the enzyme. The α-amino group of the substrate displaces the lysine ε-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic. A number of pyridoxal-dependent enzymes involved in the metabolism of cysteine, homocysteine and methionine have been shown to be evolutionarily related. These enzymes are proteins of about 400 amino-acid residues. The pyridoxal-P group is attached to a lysine residue located in the central section of these enzymes. One of these enzymes is the sulfhydrylase FUB7 from fungi such as Gibberella and Fusarium. The gene is part of a cluster that mediates the biosynthesis of fusaric acid, a mycotoxin with low to moderate toxicity to animals and humans, but with high phytotoxic properties.
Protein Domain
Name: Eukaryotic molybdopterin oxidoreductase
Type: Family
Description: A number of different eukaryotic oxidoreductases that require and bind a molybdopterin cofactor have been shown to share a few regions of sequence similarity. These enzymes include xanthine dehydrogenase, aldehyde oxidase, nitrate reductase, and sulphite oxidase. The multidomain redox enzyme NAD(P)H:nitrate reductase (NR) catalyses the reduction of nitrate to nitrite in a single polypeptide electron transport chain with electron flow from NAD(P)H-FAD-cytochrome b5-molybdopterin-NO(3). Three forms of NR are known: an NADH-specific enzyme found in higher plants and algae; an NAD(P)H-bispecific enzyme found in higher plants, algae and fungi; and an NADPH-specific enzyme found only in fungi. The mitochondrial enzyme sulphite oxidase (sulphite:ferricytochrome c oxidoreductase) catalyses oxidation of sulphite to sulphate, using cytochrome c as the physiological electron acceptor. Sulphite oxidase consists of two structure/function domains, an N-terminal haem domain, similar to cytochrome b5; and a C-terminal molybdopterin domain. Despite functional parallels, members of the family show no sequence similarity to the C-terminal molybdopterin domain of xanthine dehydrogenase, although xanthine dehydrogenase, nitrate reductases and sulphite oxidase all contain the eukaryotic molybdopterin oxidoreductases signature. Sequence comparison suggests that only a single Cys residue (Cys186 in chicken sulphite oxidase) is invariant in all these enzymes, indicating that it may play a role in binding molybdopterin to the protein.
Protein Domain
Name: Oxidoreductase, molybdopterin-binding domain
Type: Domain
Description: A number of different eukaryotic oxidoreductases that require and bind a molybdopterin cofactor have been shown to share a few regions of sequence similarity. These enzymes include xanthine dehydrogenase, aldehyde oxidase, nitrate reductase, and sulphite oxidase. The multidomain redox enzyme NAD(P)H:nitrate reductase (NR) catalyses the reduction of nitrate to nitrite in a single polypeptide electron transport chain with electron flow from NAD(P)H-FAD-cytochrome b5-molybdopterin-NO(3). Three forms of NR are known: an NADH-specific enzyme found in higher plants and algae; an NAD(P)H-bispecific enzyme found in higher plants, algae and fungi; and an NADPH-specific enzyme found only in fungi. The mitochondrial enzyme sulphite oxidase (sulphite:ferricytochrome c oxidoreductase) catalyses oxidation of sulphite to sulphate, using cytochrome c as the physiological electron acceptor. Sulphite oxidase consists of two structure/function domains, an N-terminal haem domain, similar to cytochrome b5; and a C-terminal molybdopterin domain.
Protein Domain
Name: Molybdenum cofactor oxidoreductase, dimerisation
Type: Domain
Description: The majority of molybdenum-containing enzymes utilise a molybdenum cofactor (MoCF or Moco) consisting of a Mo atom coordinated via a cis-dithiolene moiety to molybdopterin (MPT). MoCF is ubiquitous in nature, and the pathway for MoCF biosynthesis is conserved in all three domains of life. MoCF-containing enzymes function as oxidoreductases in carbon, nitrogen, and sulphur metabolism. In Escherichia coli, biosynthesis of MoCF is a three-stage process. It begins with the MoaA and MoaC conversion of GTP to the meta-stable pterin intermediate precursor Z. The second stage involves MPT synthase (MoaD and MoaE), which converts precursor Z to MPT; MoeB is involved in the recycling of MPT synthase. The final step in MoCF synthesis is the attachment of mononuclear Mo to MPT, a process that requires MoeA and which is enhanced by MogA in an Mg2+-ATP-dependent manner. MoCF is the active co-factor in eukaryotic and some prokaryotic molybdo-enzymes, but the majority of bacterial enzymes requiring MoCF need a modification of MPT for it to be active; MobA is involved in the attachment of a nucleotide monophosphate to MPT, resulting in the MGD co-factor, the active co-factor for most prokaryotic molybdo-enzymes. Bacterial two-hybrid studies have revealed the close interactions between MoeA, MogA, and MobA in the synthesis of MoCF. Moreover, the close functional association of MoeA and MogA in the synthesis of MoCF is supported by the fact that the known eukaryotic homologues to MoeA and MogA exist as fusion proteins: CNX1 of Arabidopsis thaliana (Mouse-ear cress), mammalian Gephyrin and Drosophila melanogaster (Fruit fly) Cinnamon. This domain is found in molybdopterin cofactor oxidoreductases, such as in the C-terminal of Mo-containing sulphite oxidase, which catalyses the conversion of sulphite to sulphate, the terminal step in the oxidative degradation of cysteine and methionine. This domain is involved in dimer formation, and has an Ig-fold structure.
Protein Domain
Name: Suppressor of white apricot, N-terminal domain
Type: Domain
Description: This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) splice factor proteins. This region contains two highly conserved motifs, DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing.
Protein Domain
Name: 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn1 subunit
Type: Family
Description: Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides. The 26S proteasome can be divided into two subcomplexes: the 19S regulatory particle (RP) and the 20S core particle (CP). The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation. This group represents a 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn1 (regulatory-particle non-ATPase subunit 1). This subunit is essential for embryogenesis in Arabidopsis thaliana.
Protein Domain
Name: Proteasome/cyclosome repeat
Type: Repeat
Description: A weakly conserved repeat module of unknown function, which occurs in two regulatory subunits of the 26S proteasome and in one subunit of the anaphase-promoting complex.
Protein Domain
Name: OTU domain
Type: Domain
Description: A homology region containing four conserved motifs has been identified in proteins from eukaryotes, several groups of viruses and the pathogenic bacterium Chlamydia pneumoniae. None of these proteins has a known biochemical function, but low sequence similarity with the polyprotein regions of arteriviruses has led to the suggestion that it could possess cysteine protease activity. In this case, the conserved cysteine and aspartate in motif I and the histidine in motif IV could be the catalytic residues. Motifs II and III have a more limited sequence conservation and could be involved in substrate recognition. It has been proposed that the eukaryotic proteins containing an OTU domain could mediate proteolytic events involved in signalling associated with the modification of chromatin structure and control of cell proliferation. In viruses, proteins containing this domain are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65). A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One domain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: GTP-binding protein Era
Type: Family
Description: Era is an essential G-protein in Escherichia coli, originally identified as a homologue of Ras (E. coli Ras-like protein). It binds GTP/GDP and has a low intrinsic GTPase activity. Its function remains elusive, although it may be associated with cell division, energy metabolism, and the cell-cycle checkpoint. The protein has recently been shown to bind specifically to 16S rRNA and the 30S ribosomal subunit. Involvement of Era in protein synthesis is suggested by the fact that Era depletion results in a translation defect both in vitro and in vivo. A type 2 KH domain is found near the C terminus.
Protein Domain
Name: K Homology domain, type 2
Type: Domain
Description: The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently. According to structural analyses, the KH domain can be separated into two groups. The first group, or type-1, contains a β-α-α-β-β-α structure, whereas in type-2 the two last β-sheets are located in the N-terminal part of the domain (α-β-β-α-α-β). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-2 KH domain include the eukaryotic and prokaryotic S3 family of ribosomal proteins, and the prokaryotic GTP-binding protein Era.
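The short conserved region VIGXXGXXI quoted above is written in the usual shorthand where X stands for any residue. A minimal sketch of turning such a shorthand into a searchable pattern follows; the helper name and the toy sequence are hypothetical, and treating X as "any uppercase letter" is an assumption of this example.

```python
import re


def motif_to_regex(motif):
    """Translate a motif such as 'VIGXXGXXI' into a regex (X = any residue)."""
    parts = ("[A-Z]" if c == "X" else re.escape(c) for c in motif.upper())
    return re.compile("".join(parts))


KH_CORE = motif_to_regex("VIGXXGXXI")

# Hypothetical toy sequence, not a real KH domain.
toy = "MKVIGKGGSVIKE"
match = KH_CORE.search(toy)
if match:
    print(match.start(), match.group())
```

Because this region is the only sequence similarity shared by the two folds, a match cannot by itself tell a type-1 domain from a type-2 domain; that distinction rests on the structural topology.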
Protein Domain
Name: K homology domain-like, alpha/beta
Type: Homologous_superfamily
Description: The K homology domain is a common RNA-binding motif present in one or multiple copies in both prokaryotic and eukaryotic regulatory proteins. The KH motifs may act cooperatively to bind RNA in the case of multiple motifs, or independently in the case of single KH motif proteins. Prokaryotic (pKH) and eukaryotic (eKH) KH domains share a KH-motif, but have different topologies. The pKH domain has been found in a number of proteins, including the N-terminal domain of the S3 ribosomal protein, the C-terminal domain of Era GTPase and the two C-terminal domains of the NusA transcription factor. The structure of the pKH domain consists of a two-layer α/β fold in the arrangement α/β(2)/α/β. This entry represents K homology domains, as well as related domains that share the same 2-layer α/β structure.
Protein Domain
Name: Protein of unknown function DUF639
Type: Family
Description: The sequences in this family are plant proteins of unknown function.
Protein Domain
Name: H/ACA ribonucleoprotein complex, subunit Nhp2-like
Type: Family
Description: H/ACA ribonucleoprotein particles (RNPs) are a family of RNA pseudouridine synthases that specify modification sites through guide RNAs. The function of these H/ACA RNPs is essential for biogenesis of the ribosome, splicing of precursor mRNAs (pre-mRNAs), maintenance of telomeres and probably for additional cellular processes. All H/ACA RNPs contain a specific RNA component (snoRNA or scaRNA) and at least four proteins common to all such particles: Cbf5, Gar1, Nhp2 and Nop10. These proteins are highly conserved from yeast to mammals and homologues are also present in archaea. The H/ACA protein complex contains a stable core composed of Cbf5 and Nop10, to which Gar1 and Nhp2 subsequently bind. This entry represents H/ACA ribonucleoprotein complex subunit NHP2 and similar proteins from eukaryotes, including NHP2-like protein 1 from mammals (SNU13 homologue) and 13 kDa ribonucleoprotein-associated protein (SNU13) from yeast. Nhp2 is part of a complex which catalyses pseudouridylation of rRNA and is required for rRNA biogenesis. This involves the isomerisation of uridine such that the ribose is subsequently attached to C5, instead of the normal N1. Pseudouridine ("psi") residues may serve to stabilise the conformation of rRNAs. Nhp2 associates non-specifically with RNA secondary structures instead of directly binding to a specific RNA motif. This protein seems to have evolved from the archaeal ribosomal L7Ae protein family. Human SNU13 homologue is involved in pre-mRNA splicing as a component of the spliceosome. The protein undergoes a conformational change upon RNA binding. SNU13 from Saccharomyces cerevisiae (Baker's yeast) is also a component of the spliceosome and rRNA processing machinery, required for splicing of pre-mRNA and essential for the accumulation and stability of U4 snRNA, U6 snRNA, and box C/D snoRNAs.
Protein Domain
Name: Ribosomal protein L7Ae/L8/Nhp2 family
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. The genomic structure and sequence of the human ribosomal protein L7a has been determined and shown to resemble other mammalian ribosomal protein genes. The sequence of a gene for ribosomal protein L4 of yeast has also been determined; its single open reading frame is highly similar to mammalian ribosomal protein L7a. Several other ribosomal proteins have been found to share sequence similarity with L7a, including Saccharomyces cerevisiae NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui Hs6, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1203.
Protein Domain
Name: DDHD domain
Type: Domain
Description: The Nir/rdgB (N-terminal domain-interacting receptor/Drosophila retinal degeneration B proteins) family has been identified in a variety of eukaryotic organisms, ranging from worms to mammals. Members of this family are implicated in regulation of lipid trafficking, metabolism, and signaling. The Nir/rdgB proteins contain a 180-amino-acid-long conserved region in the central part of the protein. This domain contains four conserved residues, DDHD, which may form a metal-binding site. This domain is named DDHD after these four residues. This pattern of conservation of metal-binding residues is often seen in phosphoesterase domains. The DDHD domain is found in the central part of Nir/rdgB proteins, as well as the C-terminal part of the phosphatidic acid-preferring phospholipase A1. The function of the DDHD domain is not currently known, but it may be implicated in phospholipid metabolism, membrane turnover, or intracellular trafficking.
Protein Domain
Name: CYTH domain
Type: Domain
Description: This entry represents the CYTH domain. The bacterial CyaB-like adenylyl cyclase and the mammalian thiamine triphosphatases (ThTPases) define a superfamily of catalytic domains called the CYTH (CyaB, thiamine triphosphatase) domain that is present in all three superkingdoms of life. Proteins containing this domain act on triphosphorylated substrates and require at least one divalent metal cation for catalysis. The catalytic core of the CYTH domain is predicted to contain an alpha+beta scaffold with 6 conserved β-strands and 6 conserved α-helices. The CYTH domain contains several nearly universally conserved charged residues that are likely to form the active site. The most prominent of these are an EXEXK motif associated with strand 1 of the domain, two basic residues in helix 2, a K at the end of strand 3, an E in strand 4, a basic residue in helix 4, a D at the end of strand 5 and two acidic residues (typically glutamates) in strand 6. The presence of around 6 conserved acidic positions in the majority of the CYTH domains suggests that it coordinates two divalent metal ions. Both CyaB and ThTPase have been shown to require Mg(2+) ions for their nucleotide cyclase and phosphatase activities. The four conserved basic residues in the CYTH domain are most probably involved in the binding of acidic phosphate moieties of their substrates. The conservation of these two sets of residues in the majority of CYTH domains suggests that most members of this group are likely to possess an activity dependent on two metal ions, with a preference for nucleotides or related phosphate-moiety-bearing substrates. The proposed biochemical activity and the arrangement of predicted strands in the primary structure of the CYTH domain imply that it may adopt a barrel or sandwich-like configuration, with metal ions and the substrates bound in the central cavity.
Protein Domain
Name: Protein of unknown function DUF1666
Type: Family
Description: These sequences are derived from hypothetical plant proteins of unknown function. The region in question is approximately 250 residues long.
Protein Domain
Name: Peptidyl-arginine deiminase, Porphyromonas-type
Type: Family
Description: Peptidyl-arginine deiminase (PAD) enzymes catalyse the deimination of the guanidino group from carboxy-terminal arginine residues of various peptides to produce ammonia. PAD from Porphyromonas gingivalis (Bacteroides gingivalis) (PPAD) appears to be evolutionarily unrelated to mammalian PAD, which is a metalloenzyme. PPAD is thought to belong to the same superfamily as aminotransferase and arginine deiminase, and to form an alpha/beta propeller structure. This family has previously been named PPADH (Porphyromonas peptidyl-arginine deiminase homologues). The predicted catalytic residues in PPAD are Asp130, Asp187, His236, Asp238 and Cys351. These are absolutely conserved with the exception of Asp187, which is absent in two family members. PPAD is also able to catalyse the deimination of free L-arginine, but has primarily peptidyl-arginine specificity. It may have an FMN cofactor.
Protein Domain
Name: Agmatine deiminase
Type: Family
Description: Agmatine deiminase catalyses the conversion of agmatine into N-carbamoylputrescine. This enzyme usually functions as part of polyamine biosynthesis, though in some bacterial species, such as Enterococcus faecalis, it can also function in the generation of ATP from agmatine. The E. faecalis enzyme is a homotetramer where each monomer adopts a similar fold to that of the arginine deiminase catalytic domain, a five-bladed fan-like structure where the repeating unit is formed from three beta sheets and an alpha helix. This entry represents known and putative agmatine deiminases from bacteria and plants. Related deiminases not in this entry include the peptidyl-arginine deiminase as found in Porphyromonas gingivalis.
Protein Domain
Name: Annexin
Type: Family
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists). Annexins are absent from yeasts and prokaryotes. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long. Each individual annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication.
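The 'type 2' calcium-binding motif quoted in the annexin description can be written as a pattern search. The following is a minimal sketch only, assuming that "x" stands for any single residue and that the gap is exactly 38 residues; the function name and the toy sequence are hypothetical and do not come from any database record.

```python
import re

# Type 2 calcium-binding motif as quoted in the description:
# GxGT-[38 residues]-D/E. Assumptions: "x" is any single residue and the
# gap is exactly 38 residues long. A lookahead is used so that
# overlapping candidate sites are all reported.
TYPE2_MOTIF = re.compile(r"(?=(G.GT.{38}[DE]))")


def find_type2_motifs(sequence):
    """Return 0-based start positions of every match in a protein sequence."""
    return [m.start() for m in TYPE2_MOTIF.finditer(sequence.upper())]


# Synthetic sequence built only to exercise the pattern; not a real annexin.
toy = "MK" + "GAGT" + "A" * 38 + "D" + "LL"
print(find_type2_motifs(toy))
```

A hit from such a pattern is only a candidate site; the description notes that repeats "usually" contain the motif, so absence of a match does not rule out an annexin repeat.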
Protein Domain
Name: Phenylalanyl-tRNA synthetase-like, B3/B4
Type: Homologous_superfamily
Description: This entry represents the B3/B4 domain found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins. This domain has a 3-layer structure and contains a β-sandwich fold of unusual topology and a putative tRNA-binding structural motif. In Thermus thermophilus, both the catalytic alpha- and the non-catalytic beta-subunits comprise the characteristic fold of the class II active-site domains. The presence of an RNA-binding domain, similar to that of the U1A spliceosomal protein, in the beta-subunit of tRNA synthetase indicates structural relationships among different families of RNA-binding proteins. Aminoacyl-tRNA synthetases can catalyse editing reactions to correct errors produced during amino acid activation and tRNA esterification, in order to prevent the attachment of incorrect amino acids to tRNA. The B3/B4 domain of the beta subunit contains an editing site, which lies close to the active site on the alpha subunit. Disruption of this site abolished tRNA editing, a process that is essential for faithful translation of the genetic code.
Protein Domain
Name: Putative DNA-binding domain superfamily
Type: Homologous_superfamily
Description: A putative DNA-binding domain with a conserved structure is found in several different protein families. The core structure of the domain consists of a three-helical fold that is architecturally similar to that of the "winged-helix" fold, but is topologically distinct. Representatives of this domain can be found in domains B1 and B5 from the beta subunit of phenylalanine-tRNA synthetases, the C-terminal region of the DNA/RPA-binding domain of the DNA excision repair factor XPA, the N-terminal domain of the transcriptional activators BmrR and MtaN, the most conserved domain of the retinal development protein Dachshund, and the DNA-binding domain of the gpNU1 subunit from the bacteriophage lambda viral packing protein terminase.
Protein Domain
Name: B3/B4 tRNA-binding domain
Type: Domain
Description: This entry represents the B3/B4 domain found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins. This domain has a 3-layer structure and contains a β-sandwich fold of unusual topology and a putative tRNA-binding structural motif. In Thermus thermophilus, both the catalytic alpha- and the non-catalytic beta-subunits comprise the characteristic fold of the class II active-site domains. The presence of an RNA-binding domain, similar to that of the U1A spliceosomal protein, in the beta-subunit of tRNA synthetase indicates structural relationships among different families of RNA-binding proteins. Aminoacyl-tRNA synthetases can catalyse editing reactions to correct errors produced during amino acid activation and tRNA esterification, in order to prevent the attachment of incorrect amino acids to tRNA. The B3/B4 domain of the beta subunit contains an editing site, which lies close to the active site on the alpha subunit. Disruption of this site abolished tRNA editing, a process that is essential for faithful translation of the genetic code.
Protein Domain
Name: Mot1, central domain
Type: Domain
Description: This domain is found in Mot1 and related proteins. The TATA-binding protein (TBP) is a major target for transcriptional regulation. Mot1, a Swi2/Snf2-related ATPase, dissociates TBP from DNA in an ATP-dependent process. The N-terminal domain of Mot1 has been shown to bind TBP, NC2 and DNA. Its ATPase domain is at the C terminus.
Protein Domain
Name: Peptidase S26A, signal peptidase I
Type: Family
Description: This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A. At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane. Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric. Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space. The catalytic residues of a number of these peptidases have been identified as a serine/lysine dyad.
Protein Domain
Name: Protein CHAPERONE-LIKE PROTEIN OF POR1-like
Type: Family
Description: This entry includes proteins from plants and bacteria. The plant members CHAPERONE-LIKE PROTEIN OF POR1 (CPP1) and Protein CHLOROPLAST J-LIKE DOMAIN 1 (CJD1) have a J-like domain and three transmembrane domains. CPP1 is an essential protein for chloroplast development and plays a role in the regulation of POR (light-dependent protochlorophyllide oxidoreductase) stability and function. CJD1 may be involved in the regulation of the fatty acid metabolic process in chloroplasts, especially that of the chloroplastic galactolipids monogalactosyldiacylglycerol (MGDG) and digalactosyldiacylglycerol (DGDG).
Protein Domain
Name: Ribosomal protein L9, N-terminal
Type: Domain
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities. The crystal structure of Bacillus stearothermophilus L9 shows that the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an α-helix and a three-stranded mixed parallel, anti-parallel β-sheet packed against the central α-helix. The long central α-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
Protein Domain
Name: Ribosomal protein L9/RNase H1, N-terminal
Type: Homologous_superfamily
Description: The N-terminal domain of the ribosomal protein L9 is a regulatory RNA-binding module that binds to 23S rRNA. L9 is composed of two domains and functions as a structural protein in the large subunit of the ribosome. The N-terminal domain of eukaryotic RNase HI, which is lacking in retroviral and prokaryotic enzymes, shows a striking structural similarity to the L9 N-terminal domain, and may also function as a regulatory RNA-binding module. Eukaryotic RNases HI possess either one or two copies of the small N-terminal domain, in addition to the well-conserved catalytic RNase H domain. RNase HI belongs to the family of ribonuclease H enzymes that recognise RNA:DNA hybrids and degrade the RNA component. The structures of both the L9 and the RNase HI N-terminal domains consist of a three-stranded antiparallel β-sheet sandwiched between two short α-helices. The hydrophobic core of the domain is formed by the conserved residues that are involved in the packing of the α-helices onto the β-sheet. The (beta)2/alpha/beta/alpha topology of the domain differs from the structures of known RNA-binding domains such as the double-stranded RNA binding domain (dsRBD), the hnRNP K homology (KH) domain and the RNP motif.
Protein Domain
Name: Ribosomal protein L9
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities. The crystal structure of Bacillus stearothermophilus L9 shows that the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an α-helix and a three-stranded mixed parallel, anti-parallel β-sheet packed against the central α-helix. The long central α-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
Protein Domain
Name: Ribosomal protein L9, C-terminal
Type: Domain
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities. The crystal structure of Bacillus stearothermophilus L9 shows that the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an α-helix and a three-stranded mixed parallel, anti-parallel β-sheet packed against the central α-helix. The long central α-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
Protein Domain
Name: Protein DMP
Type: Family
Description: This entry includes plant protein DMP, including Arabidopsis AtDMP1-10. DMP1 is a membrane protein that may be involved in membrane fission during breakdown of the ER and the tonoplast during leaf senescence and in membrane fusion during vacuole biogenesis in roots. DMP8 and DMP9 have been shown to facilitate gamete fusion during double fertilization in flowering plants.
Protein Domain
Name: Anthranilate synthase/para-aminobenzoate synthase like domain
Type: Domain
Description: This entry represents the anthranilate synthase/para-aminobenzoate synthase domain, which shares sequence similarity with the glutamine amidotransferase domain. Anthranilate synthase plays a role in the tryptophan biosynthetic pathway, while para-aminobenzoate synthase is involved in the folate biosynthetic pathway. In at least one case, a single polypeptide from Bacillus subtilis was shown to have both functions. This entry contains proteins similar to para-aminobenzoate (PABA) synthase and anthranilate synthase (ASase). These enzymes catalyze similar reactions and produce similar products, PABA and ortho-aminobenzoate (anthranilate). Each enzyme is composed of non-identical subunits: a glutamine amidotransferase subunit (component II) and a subunit that produces the aminobenzoate product (component I). ASase catalyses the synthesis of anthranilate from chorismate and glutamine and is a tetrameric protein comprising two copies each of components I and II. Component II of ASase belongs to the family of triad GTases which hydrolyze glutamine and transfer nascent ammonia between the active sites. In some bacteria, such as Escherichia coli, component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. PRTase catalyses the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate. In E. coli, the first step in the conversion of chorismate to PABA involves two proteins, PabA and PabB, which co-operate to transfer the amide nitrogen of glutamine to chorismate, forming 4-amino-4-deoxychorismate (ADC). PabA acts as a glutamine amidotransferase, supplying an amino group to PabB, which carries out the amination reaction. A third protein, PabC, then mediates elimination of pyruvate and aromatization to give PABA.
Several organisms have bipartite proteins containing fused domains homologous to PabA and PabB, commonly called PABA synthases. These hybrid PABA synthases may produce ADC and not PABA.
Protein Domain
Name: Eukaryotic translation initiation factor 3 subunit I
Type: Family
Description: Eukaryotic translation initiation factor 3 subunit I is a component of the eukaryotic translation initiation factor 3 (eIF-3) complex, which is involved in protein synthesis and, together with other initiation factors, stimulates binding of mRNA and methionyl-tRNAi to the 40S ribosome.
Protein Domain
Name: Methionyl/Valyl/Leucyl/Isoleucyl-tRNA synthetase, anticodon-binding
Type: Domain
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. This domain is found in methionyl, valyl, leucyl and isoleucyl tRNA synthetases. It binds to the anticodon of the tRNA.
Protein Domain
Name: Valine-tRNA ligase
Type: Family
Description: Valine-tRNA ligase (also known as valyl-tRNA synthetase) is an alpha monomer that belongs to the class Ia aminoacyl-tRNA ligases. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c.
Protein Domain
Name: Aminoacyl-tRNA synthetase, class Ia
Type: Domain
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. The class Ia aminoacyl-tRNA synthetases consist of the isoleucyl, methionyl, valyl, leucyl, cysteinyl, and arginyl-tRNA synthetases; the class Ib include the glutamyl and glutaminyl-tRNA synthetases, and the class Ic are the tyrosyl and tryptophanyl-tRNA synthetases.
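The class and subclass assignments listed in the description above can be captured as a small lookup. This is an illustrative sketch only: the three-letter codes, the set names and the function classify_synthetase are assumptions introduced here, and lysine is reported as belonging to either class because the description places different lysyl-tRNA synthetases in both.

```python
# Class membership as listed in the description (three-letter amino acid codes).
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Phe", "Pro", "Ser", "Thr"}
BOTH_CLASSES = {"Lys"}  # some lysyl-tRNA synthetases are class I, others class II

# Class I subclasses as given in the description; class II subclass
# membership is not listed there, so it is left out of this sketch.
CLASS_I_SUBCLASSES = {
    "Ia": {"Ile", "Met", "Val", "Leu", "Cys", "Arg"},
    "Ib": {"Glu", "Gln"},
    "Ic": {"Tyr", "Trp"},
}


def classify_synthetase(amino_acid):
    """Return (class, subclass) for the synthetase charging this amino acid."""
    if amino_acid in BOTH_CLASSES:
        return ("I or II", None)
    if amino_acid in CLASS_I:
        for subclass, members in CLASS_I_SUBCLASSES.items():
            if amino_acid in members:
                return ("I", subclass)
    if amino_acid in CLASS_II:
        return ("II", None)
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


print(classify_synthetase("Val"))
print(classify_synthetase("Phe"))
```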
Protein Domain
Name: Valyl/Leucyl/Isoleucyl-tRNA synthetase, editing domain
Type: Homologous_superfamily
Description: Certain aminoacyl-tRNA synthetases prevent potential errors in protein synthesis through deacylation of mischarged tRNAs. The close homologues isoleucyl-tRNA synthetase (IleRS) and valyl-tRNA synthetase (ValRS) deacylate Val-tRNA(Ile) and Thr-tRNA(Val), respectively. These reactions strictly require the presence of the cognate tRNA. In the absence of tRNA, the enzymatically generated misactivated adenylates remain in the active site, sequestered from hydrolysis. Upon addition of cognate tRNA the misactivated amino acids are hydrolysed, regenerating the free tRNA and amino acid, while converting 1 equivalent of ATP to AMP. A prominent mechanism for editing misactivated amino acids is the rapid hydrolysis of transiently mischarged tRNA. This reaction is catalysed at a second active site on IleRS and ValRS. This site is located within a large insertion (termed CP1) into the canonical class I aminoacyl-tRNA synthetase active-site fold [ , ]. The CP1 domain as an isolated polypeptide hydrolyses its cognate mischarged tRNA [].
Protein Domain
Name: Oleosin
Type: Family
Description: Oleosins [ ] are the proteinaceous components of plants' lipid storage bodies called oil bodies. Oil bodies are small droplets (0.2 to 1.5 µm in diameter) containing mostly triacylglycerol that are surrounded by a phospholipid/oleosin annulus. Oleosins may have a structural role in stabilising the lipid body during desiccation of the seed, by preventing coalescence of the oil. They may also provide recognition signals for specific lipase anchorage in lipolysis during seedling growth. Oleosins are found in the monolayer lipid/water interface of oil bodies and probably interact with both the lipid and phospholipid moieties. Oleosins are proteins of 16 kDa to 24 kDa and are composed of three domains: an N-terminal hydrophilic region of variable length (from 30 to 60 residues); a central hydrophobic domain of about 70 residues; and a C-terminal amphipathic region of variable length (from 60 to 100 residues). The central hydrophobic domain is proposed to be made up of β-strand structure and to interact with the lipids []. It is the only domain whose sequence is conserved.
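The domain lengths quoted in this description can be checked against the quoted molecular mass with a little arithmetic. The Python sketch below is only a back-of-the-envelope estimate: the average residue mass of 110 Da is a common rule of thumb assumed here, not a figure from the entry, and "about 70 residues" is treated as exactly 70. The function and constant names are made up for the example.

```python
# Domain length ranges in residues, as quoted in the description.
N_TERMINAL = (30, 60)
CENTRAL = (70, 70)        # "about 70 residues", treated as a fixed value here
C_TERMINAL = (60, 100)

# Assumption: rule-of-thumb average mass per residue, not taken from the entry.
AVG_RESIDUE_MASS_DA = 110


def total_length_range():
    """Return (shortest, longest) total length implied by the three domains."""
    shortest = N_TERMINAL[0] + CENTRAL[0] + C_TERMINAL[0]
    longest = N_TERMINAL[1] + CENTRAL[1] + C_TERMINAL[1]
    return shortest, longest


def mass_range_kda():
    """Return the estimated (lowest, highest) mass in kDa."""
    shortest, longest = total_length_range()
    return (shortest * AVG_RESIDUE_MASS_DA / 1000,
            longest * AVG_RESIDUE_MASS_DA / 1000)


if __name__ == "__main__":
    print("length range (residues):", total_length_range())
    print("estimated mass range (kDa):", mass_range_kda())
```

Under these assumptions the three domains add up to 160 to 230 residues, or very roughly 18 to 25 kDa, which is in the same range as the 16 kDa to 24 kDa stated for oleosins.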
Protein Domain
Name: Polyribonucleotide nucleotidyltransferase
Type: Family
Description: The eukaryotic exosome and the prokaryotic degradosome are important protein complexes involved in RNA processing and maintaining appropriate RNA levels within the cell [ ]. Both of these complexes contain exoribonucleases (exoRNases) which degrade RNA from the 3' end. The hydrolytic exoRNases produce nucleoside monophosphates, while the phosphorolytic exoRNases add orthophosphate at the cleaved bond to produce nucleoside diphosphates. This entry represents polyribonucleotide nucleotidyltransferase ( ), also known as polynucleotide phosphorylase (PNPase), found in bacterial and eukaryotic organelle degradosomes. This enzyme can process single-stranded RNA, but is stalled by double-stranded structures such as stem-loops. Structural studies show that PNPase is a trimeric multidomain protein with a central channel [ ]. Each subunit contains duplicated RNase PH-like domains which, though structurally homologous, are thought to be functionally distinct. The first domain is more divergent in sequence than the second domain and is thought to be involved in the flexible binding of RNA substrate and the formation of the trimer channel structure. The second domain is thought to contain the catalytic site and show exoRNase activity. The catalytic mechanism of the enzyme is not yet known but it seems likely that single-stranded RNA would be threaded through the channel to be processed by the three active sites within the trimer, which would thus be restricted to a single substrate molecule per trimer. PNPase activity would thus be tightly regulated by secondary structural elements within the RNA [].
Protein Domain
Name: Prephenate dehydrogenase
Type: Domain
Description: This entry represents prephenate dehydrogenase (PDHG), an enzyme involved in tyrosine biosynthesis [ ], and related proteins. Three enzymes catalyse the conversion of chorismate to hydroxyphenylpyruvate or phenylpyruvate in the aromatic amino acid biosynthesis pathway. In this pathway, chorismate is a branch point intermediate that is converted to tryptophan, phenylalanine (Phe), and tyrosine (Tyr). In bacteria the enzymes chorismate mutase (CM), prephenate dehydratase (PDT), and prephenate dehydrogenase (PDHG) are either present as distinct proteins or as fusions combining two activities [ ]. In the archaeon Archaeoglobus fulgidus a single protein (AroQ) contains all three enzymatic domains []. This entry also matches 3-phosphoshikimate 1-carboxyvinyltransferase.
Protein Domain
Name: USP8 dimerisation domain
Type: Domain
Description: This domain is found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It forms a five helical bundle that dimerises [ ]. It is also found in other proteins, including AMSH-like protease and STAM-binding protein.
Protein Domain
Name: Endoribonuclease YbeY, conserved site
Type: Conserved_site
Description: YbeY is a single strand-specific metallo-endoribonuclease involved in late-stage 70S ribosome quality control and in maturation of the 3' terminus of the 16S rRNA. It acts together with RNase R to eliminate defective 70S ribosomes, but not properly matured 70S ribosomes or individual subunits, by a process mediated specifically by the 30S ribosomal subunit. It is involved in the processing of 16S, 23S and 5S rRNAs, with a particularly strong effect on maturation at both the 5'- and 3'-ends of 16S rRNA as well as maturation of the 5'-end of 23S and 5S rRNAs [ , , , ]. YbeY contains a conserved region with three histidines at the C terminus. This entry represents this region.
Protein Domain
Name: Domain of unknown function DUF4094
Type: Domain
Description: This domain is found in a number of proteins, including eleven beta-1,3-galactosyltransferase paralogues from Arabidopsis thaliana. Beta-1,3-galactosyltransferase transfers galactose from UDP-galactose to substrates with a terminal glycosyl residue [ ]. This domain is found in the N-terminal region of the beta-1,3-galactosyltransferase.
Protein Domain
Name: Transcription factor TFE/TFIIEalpha HTH domain
Type: Domain
Description: Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit ( ) and the small beta ( ). TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE [ ]. The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow [ ]. Archaea contain a TFIIE homologue, called TFE, which corresponds to the N-terminal half of TFIIEalpha. It appears that archaeal TFE corresponds to the minimal essential region of eukaryotic TFIIEalpha. In archaea TFE contains an N-terminal, weakly conserved, helix-turn-helix (HTH) motif within a leucine-rich region and a C-terminal zinc ribbon [ , , ]. It has been proposed that the TFE/IIEalpha-type HTH domain acts as a bridging factor or adapter between the TATA box-binding protein, the polymerase, and possibly promoter DNA []. The TFE/IIEalpha-type HTH domain adopts a winged HTH (winged helix) fold, comprising three α-helices and three β-strands in the canonical order α1-β1-α2-α3-β2-β3. Conserved residues within helices α1-α3 form the tightly packed hydrophobic core of the winged helix domain. A specific feature of the structure is the extension of the canonical winged helix fold at the N and C termini by the additional helices α0 and α4, respectively. 
Hydrophobic residues from the additional helix α0 extend the hydrophobic core of the winged helix domain, and helix α0 is tightly packed against the canonical winged helix fold. Helix α4 comprises only one turn [ ].
Protein Domain
Name: Transcription initiation factor IIE subunit alpha, N-terminal
Type: Domain
Description: Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit ( ) and the small beta ( ). TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE [ ]. The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow [ ]. This entry represents the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria (TFE) that are also presumed to be TFIIE-alpha subunits [ ].
Protein Domain
Name: TFIIEalpha/SarR/Rpc3 HTH domain
Type: Domain
Description: The general transcription factor TFIIE has an essential role in eukaryotic transcription initiation, together with RNA polymerase II and other general factors. Human TFIIE consists of two subunits, TFIIE-alpha and TFIIE-beta, and joins the preinitiation complex after RNA polymerase II and TFIIF [ ]. This entry represents a helix-turn-helix (HTH) domain found in eukaryotic TFIIE-alpha []. It is also found in proteins from archaebacteria that are presumed to be TFIIE-alpha subunits [], the transcriptional regulator SarR, and also DNA-directed RNA polymerase III subunit Rpc3.
Protein Domain
Name: Domain of unknown function DUF676, lipase-like
Type: Domain
Description: This domain, whose function is unknown, is found within a group of putative lipases. Proteins containing this domain include YOR059C (Lpl1) from budding yeasts. Lpl1 has been identified as a phospholipase B [ ].
Protein Domain
Name: Peptidase M1, membrane alanine aminopeptidase
Type: Domain
Description: Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation which is usually zinc, but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belong to the MEROPS peptidase family M1 (clan MA(E)), the type example being aminopeptidase N from Homo sapiens (Human). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA. Membrane alanine aminopeptidase () is part of the HEXXH+E group; it consists entirely of aminopeptidases, spread across a wide variety of species [ ]. Functional studies show that CD13/APN catalyses the removal of single amino acids from the amino terminus of small peptides and probably plays a role in their final digestion; one family member (leukotriene-A4 hydrolase) is known to hydrolyse the epoxide leukotriene-A4 to form an inflammatory mediator [ ]. 
This hydrolase has been shown to have aminopeptidase activity [ ], and the zinc ligands of the M1 family were identified by site-directed mutagenesis on this enzyme [ ]. CD13 participates in trimming peptides bound to MHC class II molecules [] and cleaves MIP-1 chemokine, which alters target cell specificity from basophils to eosinophils []. CD13 acts as a receptor for specific strains of RNA viruses (coronaviruses) which cause a relatively large percentage of upper respiratory tract infections.
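The loose HEXXH motif and the more stringent 'abXHEbbHbc' definition quoted in this description lend themselves to a regular-expression sketch. The Python below is illustrative only. The description does not list which residues count as "uncharged" or "hydrophobic", so the residue sets are assumptions, as is the choice to exclude proline from every variable position of the stringent pattern. The test peptides and all names are made up for the example.

```python
import re

# Assumed residue sets (one-letter codes); not taken from the entry.
ANY_NON_PRO = "[ACDEFGHIKLMNQRSTVWY]"   # any standard residue except proline
UNCHARGED = "[ACFGILMNQSTVWY]"          # excludes D, E, K, R, H and proline
HYDROPHOBIC = "[ACFILMVWY]"             # one common hydrophobic grouping

# Loose motif: H-E-x-x-H. The lookahead lets overlapping hits be reported.
LOOSE = re.compile(r"(?=(HE..H))")

# Stringent motif 'abXHEbbHbc'; 'a' and 'X' are treated as any non-proline residue.
STRICT = re.compile(
    "(?=(" + ANY_NON_PRO + UNCHARGED + ANY_NON_PRO + "HE"
    + UNCHARGED + UNCHARGED + "H" + UNCHARGED + HYDROPHOBIC + "))"
)


def find_motifs(sequence, pattern):
    """Return (start, matched_text) for every hit, including overlapping ones."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    peptide = "GGVVAHELTHAVGG"        # made-up peptide that satisfies both patterns
    charged = "GGKDAHEKKHDVGG"        # made-up peptide with charged 'b' positions
    for seq in (peptide, charged):
        print(seq, find_motifs(seq, LOOSE), find_motifs(seq, STRICT))
```

The second peptide shows why the stringent form is useful: it still contains HEXXH, but the charged residues at the 'b' positions keep it from matching the stricter pattern under the assumed residue sets.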
Protein Domain
Name: DNA mismatch repair Msh2-type
Type: Family
Description: Mismatch repair (MMR) is one of five major DNA repair pathways, the others being homologous recombination repair, non-homologous end joining, nucleotide excision repair, and base excision repair. The mismatch repair system recognises and repairs mispaired or unpaired nucleotides that result from errors in DNA replication. The most extensively studied general MMR system is the MutHLS pathway of the bacterium Escherichia coli. In the first step of the MutHLS pathway, the MutS protein (in the form of a dimer) binds to the site of a mismatch in double-stranded DNA. Through a complex interaction between MutS, MutL and MutH, a section of the newly replicated DNA strand (and thus the strand with the replication error) at the location of the mismatch bound by MutS is targeted for removal []. Homologues of MutS have been found in many species including eukaryotes, Archaea and other bacteria, and together these proteins have been grouped into the MutS family. This entry represents a subset of the MutS family members, including Msh2. Msh2 (MutS homologue 2) has a dual role in DNA repair and apoptosis. Msh2 acts as a heterodimer with Msh6, which together function to recruit the Mlh (MutL homologue) - Pms (post-meiotic segregation) heterodimer, and to replace the mispaired base [ ].
Protein Domain
Name: DNA mismatch repair protein MutS, C-terminal
Type: Domain
Description: Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication [ ]. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. 
MutS is a modular protein with a complex structure [ ], and is composed of:
  • N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease.
  • Connector domain, which is similar in structure to Holliday junction resolvase ruvC.
  • Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA.
  • Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a β-sheet structure.
  • ATPase domain (connected to the core domain), which has a classical Walker A motif.
  • HTH (helix-turn-helix) domain, which is involved in dimer contacts.
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [ ]. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts [ ]. This entry represents the C-terminal domain found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS comprises the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. 
Yeast MSH3 [ ], bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity.
Protein Domain
Name: DNA mismatch repair protein MutS, connector domain
Type: Domain
Description: Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication [ ]. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. 
MutS is a modular protein with a complex structure [ ], and is composed of:
  • N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease.
  • Connector domain, which is similar in structure to Holliday junction resolvase ruvC.
  • Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA.
  • Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a β-sheet structure.
  • ATPase domain (connected to the core domain), which has a classical Walker A motif.
  • HTH (helix-turn-helix) domain, which is involved in dimer contacts.
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [ ]. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts []. This entry represents the connector domain (domain 2) found in proteins of the MutS family. The structure of the MutS connector domain consists of a parallel β-sheet surrounded by four alpha helices, which is similar to the structure of the Holliday junction resolvase ruvC.
Protein Domain
Name: DNA mismatch repair protein MutS-like, N-terminal
Type: Domain
Description: Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication [ ]. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. 
MutS is a modular protein with a complex structure [ ], and is composed of:
  • N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease.
  • Connector domain, which is similar in structure to Holliday junction resolvase ruvC.
  • Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA.
  • Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a β-sheet structure.
  • ATPase domain (connected to the core domain), which has a classical Walker A motif.
  • HTH (helix-turn-helix) domain, which is involved in dimer contacts.
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [ ]. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts []. This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins, as well as closely related proteins. 
The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed β-sheet surrounded by three α-helices, which is similar to the structure of tRNA endonuclease. Yeast MSH3 [ ], bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity.
Protein Domain
Name: DNA mismatch repair protein MutS, clamp
Type: Domain
Description: Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication [ ]. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. 
MutS is a modular protein with a complex structure [ ], and is composed of:
  • N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease.
  • Connector domain, which is similar in structure to Holliday junction resolvase ruvC.
  • Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA.
  • Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a β-sheet structure.
  • ATPase domain (connected to the core domain), which has a classical Walker A motif.
  • HTH (helix-turn-helix) domain, which is involved in dimer contacts.
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [ ]. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts []. This entry represents the clamp domain (domain 4) found in proteins of the MutS family. The clamp domain is inserted within the core domain at the top of the lever helices. It has a β-sheet structure [ ].
Protein Domain
Name: DNA mismatch repair protein MutS, core
Type: Domain
Description: Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication [ ]. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base []. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch []. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level []. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA. 
MutS is a modular protein with a complex structure [ ], and is composed of:
  • N-terminal mismatch-recognition domain, which is similar in structure to tRNA endonuclease.
  • Connector domain, which is similar in structure to Holliday junction resolvase ruvC.
  • Core domain, which is composed of two separate subdomains that join together to form a helical bundle; from within the core domain, two helices act as levers that extend towards (but do not touch) the DNA.
  • Clamp domain, which is inserted between the two subdomains of the core domain at the top of the lever helices; the clamp domain has a β-sheet structure.
  • ATPase domain (connected to the core domain), which has a classical Walker A motif.
  • HTH (helix-turn-helix) domain, which is involved in dimer contacts.
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair. Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein [ ]. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions []. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts []. This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA. 
This domain is found associated with Pfam:PF00488, Pfam:PF05188, Pfam:PF01624 and Pfam:PF05190. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterised in [ ].
Protein Domain
Name: HAUS augmin-like complex subunit 6, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of the subunit 6 of the HAUS augmin-like complex, involved in mitotic spindle assembly, maintenance of centrosome integrity and completion of cytokinesis [ , ]. Subunit 6 interacts with the NEDD1-gamma-tubulin complex and recruits this complex to the spindle, which in turn promotes microtubule polymerisation [].
Protein Domain
Name: Ribosomal protein L37, mitochondrial
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. This entry includes yeast MRPL37, a mitochondrial ribosomal protein [ ].
Protein Domain
Name: ATP11
Type: Family
Description: This family consists of several eukaryotic ATP11 proteins. The expression of functional F1-ATPase requires two proteins which are encoded by the ATP11 and ATP12 genes [ ]. Atp11p is a molecular chaperone of the mitochondrial matrix that participates in the biogenesis pathway to form F1, which is the catalytic unit of ATP synthase. It binds to the free beta subunits of F1, which prevents the beta subunit from associating with itself in a non-productive complex. It also allows for the formation of an (alpha beta)3 hexamer [].
Protein Domain
Name: Dihydroorotase, conserved site
Type: Conserved_site
Description: This group contains a number of protein families, for example:
  • Archaeal and bacterial dihydroorotase ( ) (DHOase)
  • Allantoinase ( )
Dihydroorotase belongs to MEROPS peptidase family M38 (clan MJ), where it is classified as a non-peptidase homologue. DHOase catalyses the third step in the de novo biosynthesis of pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity [ ]. In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene pyrC) [ ]. In higher eukaryotes, DHOase is part of a large multi-functional protein known as 'rudimentary' in Drosophila melanogaster and CAD in mammals, which catalyzes the first three steps of pyrimidine biosynthesis []. The DHOase domain is located in the central part of this polyprotein. In yeasts, DHOase is encoded by a monofunctional protein (gene URA4). However, a defective DHOase domain [] is found in a multifunctional protein (gene URA2) that catalyzes the first two steps of pyrimidine biosynthesis. The comparison of DHOase sequences from various sources shows [ ] that there are two highly conserved regions. The first, located in the N-terminal extremity, contains two histidine residues suggested [] to be involved in binding the zinc ion. The second is found in the C-terminal part. Members of this family of proteins are predicted to adopt a TIM barrel fold []. Allantoinase ( ) is the enzyme that hydrolyzes allantoin into allantoate. In yeast (gene DAL1) [ ], it is the first enzyme in the allantoin degradation pathway; in amphibians [] and fish it catalyzes the second step in the degradation of uric acid. The sequence of allantoinase is evolutionarily related to that of DHOases.
Protein Domain
Name: Dihydroorotase homodimeric type
Type: Family
Description: Dihydroorotase belongs to MEROPS peptidase family M38 (clan MJ), where it is classified as a non-peptidase homologue. DHOase catalyses the third step in the de novo biosynthesis of pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity [ ]. In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene pyrC). In the metazoa, DHOase is part of a large multi-functional protein known as 'rudimentary' in Drosophila melanogaster and CAD in mammals, which catalyzes the first three steps of pyrimidine biosynthesis [ ]. The DHOase domain is located in the central part of this polyprotein. In yeast, DHOase is encoded by a monofunctional protein (gene URA4). However, a defective DHOase domain [] is found in a multifunctional protein (gene URA2) that catalyzes the first two steps of pyrimidine biosynthesis. The comparison of DHOase sequences from various sources shows [ ] that there are two highly conserved regions. The first, located in the N-terminal extremity, contains two histidine residues suggested [] to be involved in binding the zinc ion. The second is found in the C-terminal part. Members of this family of proteins are predicted to adopt a TIM barrel fold []. This family represents the homodimeric form of dihydroorotase. It is found in bacteria, plants and fungi; URA4 of yeast is a member of this group of sequences.
Protein Domain
Name: DNA repair protein XRCC4-like, C-terminal
Type: Homologous_superfamily
Description: XRCC4 is essential for non-homologous DNA end joining (NHDJ) in eukaryotes, which is required for double-strand break repair, and V(D)J recombination in immunoglobulin and T-cell receptor genes. XRCC4 forms a complex with DNA ligase IV, and acts as a regulatory element required for the stability and activity of the ligase. XRCC4 forms an elongated dumb-bell-like tetramer consisting of a C-terminal stalk that interacts with DNA ligase IV and an N-terminal globular head. The C-terminal oligomerisation domain consists of oligomers of short identical helices that form parallel coiled-coils [ , ]. This superfamily also matches the C-terminal of the coiled-coil myosin heavy chain tail region. Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains [], 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail [].
Protein Domain
Name: Aminotransferase class V domain
Type: Domain
Description: Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped [ ] into subfamilies. This entry represents an aminotransferase class-V domain, which is found in aminotransferases and other related, though functionally distinct, enzymes, including cysteine desulfurase.
Protein Domain
Name: Protein EXORDIUM-like
Type: Family
Description: This entry includes the EXORDIUM protein and related proteins. The EXO (EXORDIUM) gene was identified as a potential mediator of brassinosteroid (BR)-promoted growth [ ]. It mediates cell expansion in Arabidopsis leaves []. This entry also includes PHI-1, a phosphate-induced protein of unknown function from Nicotiana tabacum [].
Protein Domain
Name: G patch domain-containing protein, N-terminal
Type: Domain
Description: This domain is found at the N terminus of several eukaryotic RNA processing proteins, including Arabidopsis TGH, which is involved in microRNA (miRNA) and small interfering RNA (siRNA) biogenesis [ ].
Protein Domain
Name: ICln
Type: Family
Description: ICln, known as methylosome subunit pICln or chloride conductance regulatory protein ICln, owes these different names to its function in multiple regulatory pathways [ ] as different as ion permeation, ribonucleoprotein biosynthesis and cytoskeletal organisation []. ICln can be identified both in the cytosol and in the cellular membrane, where it functions as a chloride current regulator and is important in regulating volume decrease after cellular swelling [, , , ]. pICln also functions as an Sm chaperone in the stepwise snRNP assembly process [ ]. snRNPs are RNA-protein complexes essential to the removal of introns from pre-mRNA [, ]. In humans, the core of snRNPs is composed of seven Sm proteins bound to snRNA. pICln tethers the hetero-oligomers SmD1/D2 and SmE/F/G into a ring-shaped 6S complex, which subsequently docks onto the SMN complex. The SMN complex then removes pICln and enables the transfer of pre-assembled Sm proteins onto snRNA []. Consistent with the role of human pICln, the orthologue from S. pombe is required for optimal production of the spliceosomal snRNPs and for efficient splicing [].
Protein Domain
Name: Tryptophan synthase, beta chain, conserved site
Type: Conserved_site
Description: Tryptophan synthase catalyses the last step in the biosynthesis of tryptophan [ , ]:
  L-serine + 1-(indol-3-yl)glycerol 3-phosphate = L-tryptophan + glyceraldehyde 3-phosphate + H2O
It has two functional domains, each found in bacteria and plants on a separate subunit: alpha chain () is for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate and beta chain is for the synthesis of tryptophan from indole and serine. In fungi the two domains are fused together on a single multifunctional protein [ ]. The beta chain of the enzyme, represented here, requires pyridoxal-phosphate as a cofactor. The pyridoxal-phosphate group is attached to a lysine residue. The region around this lysine residue also contains two histidine residues which are part of the pyridoxal-phosphate binding site.
Protein Domain
Name: Tryptophan synthase, beta chain
Type: Family
Description: Tryptophan synthase catalyses the last step in the biosynthesis of tryptophan [ , ]:
  L-serine + 1-(indol-3-yl)glycerol 3-phosphate = L-tryptophan + glyceraldehyde 3-phosphate + H2O
It has two functional domains, each found in bacteria and plants on a separate subunit: alpha chain () is for the aldol cleavage of indoleglycerol phosphate to indole and glyceraldehyde 3-phosphate and beta chain is for the synthesis of tryptophan from indole and serine. In fungi the two domains are fused together on a single multifunctional protein [ ]. The beta chain of the enzyme, represented here, requires pyridoxal-phosphate as a cofactor. The pyridoxal-phosphate group is attached to a lysine residue. The region around this lysine residue also contains two histidine residues which are part of the pyridoxal-phosphate binding site.
Protein Domain
Name: Domain of unknown function DUF4217
Type: Domain
Description: This short domain is found at the C terminus of many helicases, including some DEAD box helicases. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
Protein Domain
Name: Signal recognition particle, SRP14 subunit
Type: Family
Description: The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes [ , ]. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor []. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor []. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54 []. This entry represents the 14kDa SRP14 component. Both SRP9 and SRP14 have the same (β)-α-β(3)-α fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded β-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome-associated nascent polypeptides that have been engaged by the targeting domain of SRP [ ].
Protein Domain
Name: Signal recognition particle, SRP9/SRP14 subunit
Type: Homologous_superfamily
Description: The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes [ , ]. SRP recognises the signal sequence of the nascent polypeptide on the ribosome. In eukaryotes this retards its elongation until SRP docks the ribosome-polypeptide complex to the RER membrane via the SR receptor []. Eukaryotic SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor []. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane. In archaea, the SRP complex contains 7S RNA like its eukaryotic counterpart, yet only includes two of the six protein subunits found in the eukaryotic complex: SRP19 and SRP54 []. This superfamily represents both the 9kDa SRP9 and the 14kDa SRP14 components. Both SRP9 and SRP14 have the same (β)-α-β(3)-α fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded β-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome-associated nascent polypeptides that have been engaged by the targeting domain of SRP [ ].
Protein Domain
Name: Single-stranded DNA-binding protein
Type: Family
Description: Single-stranded DNA-binding protein (SSB) plays an important role in DNA replication, recombination and repair. It binds to ssDNA and to an array of partner proteins to recruit them to their sites of action during DNA metabolism [ , , , ].
Protein Domain
Name: Primosome PriB/single-strand DNA-binding
Type: Family
Description: The Escherichia coli single-strand binding protein [ ] (gene ssb), also known as the helix-destabilising protein, is a protein of 177 amino acids. It binds tightly, as a homotetramer, to single-stranded DNA (ss-DNA) and plays an important role in DNA replication, recombination and repair. Closely related variants of SSB are encoded in the genome of a variety of large self-transmissible plasmids. SSB has also been characterized in bacteria such as Proteus mirabilis or Serratia marcescens. Eukaryotic mitochondrial proteins that bind ss-DNA and are probably involved in mitochondrial DNA replication are structurally and evolutionarily related to prokaryotic SSB. Primosomal replication protein N (PriB) is a specialist protein from bacteria that binds single-stranded DNA at the primosome assembly site (PAS). The primosome is a mobile multiprotein replication priming complex which is believed to operate on the lagging-strand template at the E. coli DNA replication fork [ ]. The primosome consists of one monomer of PriC and DnaT, two monomers of PriA, two dimers of PriB and one hexamer of DnaB [].
Protein Domain
Name: Hemerythrin-like
Type: Domain
Description: This entry represents a hemerythrin cation-binding domain that occurs [ ] in hemerythrins, myohemerythrin and related proteins. This domain binds iron in hemerythrin, but can bind other metals in related proteins, such as cadmium in a Nereis diversicolor protein () [ ]. This domain is also found in Repair of iron centres or Ric proteins [].
Protein Domain
Name: Zinc finger, CHY-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. 
Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type (which is C-terminal to the CHY-type zinc finger) and a RING finger. The CHY-type zinc finger has no currently known function [ ]. As well as Pirh2, the CHY-type zinc finger is also found in the following proteins:
  • Yeast helper of Tim protein 13. Hot13 may have a role in the assembly and recycling of the small Tims, a complex of the mitochondrial intermembrane space that participates in the TIM22 import pathway for assembly of the inner membrane [ ]
  • Several plant hypothetical proteins that also contain haemerythrin cation binding domains
  • Several protozoan hypothetical proteins that also contain a Myb domain
The solution structure of this zinc finger has been solved; the finger binds three zinc atoms, as shown in the following schematic representation:

             ++---------+-----+
             ||         |     |
CXHYxxxxxxxxxCCxxxxxCxxCHxxxxxHxxxxxxxxxxxCxxCxxxxxxxxxCxxC
| |                 |  |                  |  |         |  |
+-+-----------------+--+                  +--+---------+--+

'C': conserved cysteine involved in the binding of one zinc atom.
'H': conserved histidine involved in the binding of one zinc atom.
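The schematic above can be read as a spacing pattern, and a rough way to experiment with it is to turn it into a regular expression. The sketch below is illustrative only. It assumes that the spacing between ligands is fixed exactly as drawn (real CHY-type fingers may vary, and signature databases use profiles rather than a rigid pattern), and that every position other than 'C' and 'H' is unconstrained, since the legend defines only those two letters. The names `schematic_to_regex`, `count_ligands` and `CHY_PATTERN` are hypothetical helpers, not part of any database API.

```python
import re

# Schematic string copied from the entry text above.
CHY_SCHEMATIC = "CXHYxxxxxxxxxCCxxxxxCxxCHxxxxxHxxxxxxxxxxxCxxCxxxxxxxxxCxxC"


def schematic_to_regex(schematic, ligands="CH"):
    """Keep zinc-ligand letters literal; collapse all other positions into '.{n}'.

    Assumption: spacing is taken literally from the schematic.
    """
    parts = []
    gap = 0
    for ch in schematic:
        if ch in ligands:
            if gap:
                parts.append(".{%d}" % gap)
                gap = 0
            parts.append(ch)
        else:
            gap += 1
    if gap:
        parts.append(".{%d}" % gap)
    return re.compile("".join(parts))


def count_ligands(schematic, ligands="CH"):
    """Count conserved Cys/His positions in the schematic."""
    return sum(ch in ligands for ch in schematic)


CHY_PATTERN = schematic_to_regex(CHY_SCHEMATIC)
```

Twelve ligand positions are consistent with three zinc atoms each coordinated by four residues, which is a quick sanity check on the transcription of the schematic.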
Protein Domain
Name: Zinc finger, CTCHY-type
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [ , , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. 
Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type (which is C-terminal to the CHY-type zinc finger) and a RING finger. The CHY-type zinc finger has no known function, but the region encompassing the CTCHY-type zinc finger is required for binding to p53 in mammals [ ]. The CTCHY-type zinc finger has so far only been found in Pirh2 proteins. It binds 3 zinc atoms as shown in the following schematic representation:

The CTCHY-type zinc finger:

               +--+------------+------+
               |  |            |      |
CxxCxxxxxxxxxxHCxxCxxCxxxxxxxxxHCxxCxxCxxxxxxxxHxC
|  |          |      |          |  |           | |
+--+----------+------+          +--+-----------+-+

'C': conserved cysteine involved in the binding of one zinc atom.
'H': conserved histidine involved in the binding of one zinc atom.
Protein Domain
Name: Prefoldin alpha-like
Type: Family
Description: This entry represents prefoldin subunit alpha-like proteins. Prefoldin (PFD) is a chaperone that interacts exclusively with type II chaperonins, hetero-oligomers lacking an obligate co-chaperonin that are found only in eukaryotes (chaperonin-containing T-complex polypeptide-1 (CCT)) and archaea. Eukaryotic PFD is a multi-subunit complex containing six polypeptides in the molecular mass range of 14-23kDa. In archaea, on the other hand, PFD is composed of two types of subunits, two alpha and four beta. The six subunits associate to form two back-to-back up-and-down eight-stranded barrels, from which hang six coiled coils. Each subunit contributes one (beta subunits) or two (alpha subunits) beta hairpin turns to the barrels. The coiled coils are formed by the N and C termini of an individual subunit. Overall, this unique arrangement resembles a jellyfish. The eukaryotic PFD hexamer is composed of six different subunits; however, these can be grouped into two alpha-like (PFD3 and -5) and four beta-like (PFD1, -2, -4, and -6) subunits based on amino acid sequence similarity with their archaeal counterparts. Eukaryotic PFD has a six-legged structure similar to that seen in the archaeal homologue [ , ]. This family contains the archaeal alpha subunit, eukaryotic prefoldin subunits 3 and 5 and the UXT (ubiquitously expressed transcript) family. Eukaryotic PFD has been shown to bind both actin and tubulin co-translationally. The chaperone then delivers the target protein to CCT, interacting with the chaperonin through the tips of the coiled coils. No authentic target proteins of any archaeal PFD have been identified, to date.
Protein Domain
Name: Mitotic spindle checkpoint protein Bub1/Mad3
Type: Family
Description: This represents the mitotic checkpoint serine/threonine-protein kinase Bub1. Saccharomyces cerevisiae Bub1 has a paralogue, Mad3, which is also included in this entry. Bub1 forms a complex with Mad1 and Bub3 that is crucial for preventing cell cycle progression into anaphase in the presence of spindle damage [ ], while Mad3 is a component of the spindle-assembly complex consisting of Mad2, Mad3, Bub3 and Cdc20 []. Mad3 contains a D-box and two KEN-boxes, which function together to mediate Cdc20-Mad3 interaction. Mad3 and an anaphase-promoting complex (APC) substrate, Hsl1, compete for Cdc20 binding in a D-box- and KEN-box-dependent manner []. Similar to its yeast homologues, human Bub1 is a critical component of the mitotic checkpoint that delays the onset of anaphase until all chromosomes have established bipolar attachment to the microtubules. In interphase cells it localises to centrosomes and suppresses centrosome amplification via regulating Plk1 activity [ ]. Mutations in the human Bub1 gene have been linked to cancers [, ].
Protein Domain
Name: Mad3/Bub1 homology region 1
Type: Domain
Description: Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of Bub1 and Mad3 to Cdc20 [ ].
Protein Domain
Name: ADP/ATP carrier protein, eukaryotic type
Type: Family
Description: A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane [ , , , , ]. Such proteins include: ADP,ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Sequence analysis of selected members of the carrier protein family has suggested the presence of six transmembrane (TM) domains, with varying degrees of sequence conservation and hydrophilicity []. The TM regions, and adjacent hydrophilic loops, are more highly conserved than other regions of the proteins []. All members of the family appear to consist of a tripartite structure, each of the repeated segments being ~100 residues in length []. Each repeat contains two TM domains, the first being more hydrophobic, with conserved glycyl and prolyl residues. Five of the six TM domains are followed by the conserved sequence (D/E)-Hy(K/R), where - denotes any residue and Hy is a hydrophobic position [ ]. Mitochondrial ADP/ATP translocase, an abundant component of the inner membrane, carries ATP from the matrix into the inter-membrane space and transports ADP back [ , ]. The protein is an integral membrane protein that functions as a homodimer. Mutations of the human ADP/ATP translocase 1 (also known as SLC25A4) gene cause mitochondrial diseases, such as PEOA2 and MTDPS12B [ ]. This family contains proteins found in eukaryotes.
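The conserved sequence (D/E)-Hy(K/R) described above is short enough to scan for with a regular expression. The following is a minimal sketch, not a validated detection method: the text does not say which residues count as hydrophobic, so the `HYDROPHOBIC` set below is an illustrative assumption, and `find_carrier_motifs` is a hypothetical helper name. A four-residue pattern like this will also match by chance in unrelated sequences.

```python
import re

# Assumption: the residues treated as "hydrophobic" (Hy) are not listed in the
# entry text; this set is an illustrative choice only.
HYDROPHOBIC = "AVLIMFWY"

# (D/E) - any residue - hydrophobic - (K/R); the lookahead allows overlapping hits.
CARRIER_MOTIF = re.compile(rf"(?=([DE].[{HYDROPHOBIC}][KR]))")


def find_carrier_motifs(sequence):
    """Return (0-based start, matched text) for every hit, including overlaps."""
    return [(m.start(), m.group(1)) for m in CARRIER_MOTIF.finditer(sequence.upper())]
```

For example, `find_carrier_motifs("GGDALKGG")` reports a single hit, `DALK`, starting at index 2.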
Protein Domain
Name: TraB/PrgY/GumN family
Type: Family
Description: This entry includes Tiki1/2 from humans, TraB/PrgY from the gut flora Enterococcus faecalis and gumN from the plant pathogen Xanthomonas. Tiki1 is homologous to TraB/PrgY. They have a pair of widely spaced GX2H motifs and a conserved glutamate. From a structural study, this group of proteins has been identified as an ancient metalloprotease clan with a common protein architecture, cobbled from the folds of the EreA/ChaN/PMT group, that mediates proteolytic activities [ ]. Tiki1 is a membrane-associated protease that inhibits Wnt via the cleavage of its amino terminus, diminishing Wnt's binding to receptors [ ]. TraB/PrgY is an inhibitor peptide that may act as a protease to inactivate the mating pheromone [ ].
Protein Domain
Name: DNA-directed RNA polymerase, subunit RPB6
Type: Family
Description: Subunit RPB6/RPABC2 is a common component of eukaryotic RNA polymerases I, II and III which synthesize ribosomal RNA precursors, mRNA precursors and many functional non-coding RNAs, and small RNAs, respectively. In RNA polymerase II (Pol II), RPB6/RPABC2 is part of the clamp element and together with parts of RPB1 and RPB2 forms a pocket to which the RPB4-RPB7 subcomplex binds [ ]. In addition to RNA polymerases I, II, and III, the essential RNA polymerases present in all eukaryotes, plants have two additional nuclear RNA polymerases, abbreviated as Pol IV and Pol V, that play nonredundant roles in siRNA-directed DNA methylation and gene silencing. Pol IV and Pol V are composed of subunits that are paralogous or identical to the 12 subunits of Pol II, including subunit RPB6 [ ].
Protein Domain
Name: Dihydrodipicolinate reductase, N-terminal
Type: Domain
Description: Dihydrodipicolinate reductase catalyzes the second step in the biosynthesis of diaminopimelic acid and lysine, the NAD or NADP-dependent reduction of 2,3-dihydrodipicolinate into 2,3,4,5-tetrahydrodipicolinate [ , , ]. In Escherichia coli and Mycobacterium tuberculosis, dihydrodipicolinate reductase has equal specificity for NADH and NADPH, whereas in Thermotoga maritima it has a greater affinity for NADPH [ ]. In addition, the enzyme is inhibited by high concentrations of its substrate, which consequently acts as a feedback control on the lysine biosynthesis pathway. In T. maritima, the enzyme also lacks N-terminal and C-terminal loops which are present in the enzymes of the former two organisms. This entry represents the N-terminal domain of dihydrodipicolinate reductase, which binds the dinucleotide NAD(P)H.
Protein Domain
Name: Dihydrodipicolinate reductase, C-terminal
Type: Domain
Description: This entry represents the C-terminal region of dihydrodipicolinate reductase. Dihydrodipicolinate reductase catalyzes the second step in the biosynthesis of diaminopimelic acid and lysine, the NAD or NADP-dependent reduction of 2,3-dihydrodipicolinate into 2,3,4,5-tetrahydrodipicolinate [ , , ]. In Escherichia coli and Mycobacterium tuberculosis, dihydrodipicolinate reductase has equal specificity for NADH and NADPH, whereas in Thermotoga maritima it has a greater affinity for NADPH [ ]. In addition, the enzyme is inhibited by high concentrations of its substrate, which consequently acts as a feedback control on the lysine biosynthesis pathway. In T. maritima, the enzyme also lacks N-terminal and C-terminal loops which are present in the enzymes of the former two organisms.
Protein Domain
Name: Cysteine oxygenase/2-aminoethanethiol dioxygenase
Type: Family
Description: This entry includes cysteine oxidases (PCOs) from plants and 2-aminoethanethiol dioxygenases (ADOs) from animals. PCOs oxidize N-terminal cysteine residues, thus preparing the protein for N-end rule pathway-mediated proteasomal degradation [ ]. ADO is responsible for endogenous cysteamine dioxygenase activity [].
Protein Domain
Name: Ribosomal protein L34e, conserved site
Type: Conserved_site
Description: A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31 [], plant L34 [], yeast putative ribosomal protein YIL052c and archaebacterial L34e. This entry represents the conserved site of ribosomal protein L34e.
Protein Domain
Name: Potentiating neddylation domain
Type: Domain
Description: This domain is found in the eukaryotic family defective in cullin neddylation, which includes DCN1 and DCN1-like proteins. Proteins of the DCN family may contribute to neddylation of cullin components of SCF-type E3 ubiquitin ligase complexes, which are multi-protein complexes required for polyubiquitination and subsequent degradation of target proteins by the 26S proteasome [ , ]. The structure of this domain is composed entirely of alpha helices [ , ]. It has been referred to as potentiating neddylation domain (PONY) and can be found in association with an N-terminal UBA domain. The PONY domain contains a cullin-binding surface within its C-terminal region and is sufficient to promote neddylation [, ].
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom