Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported: e.g. dros* for partial matches or fly AND NOT embryo to exclude a term
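
As a rough illustration of how the three query styles above narrow a result set, here is a minimal Python sketch. It is not this site's search implementation: the helper names (`term_matches`, `matches`) and the three toy documents are hypothetical, and wildcard matching is approximated with the standard-library `fnmatch` module.

```python
from fnmatch import fnmatchcase

def term_matches(term, words):
    """Return True if any word matches the term; '*' acts as a wildcard."""
    pattern = term.lower()
    return any(fnmatchcase(word, pattern) for word in words)

def matches(text, any_of=(), all_of=(), none_of=()):
    """Toy boolean filter: OR terms, AND terms and NOT terms over one text."""
    words = text.lower().split()
    if any_of and not any(term_matches(t, words) for t in any_of):
        return False
    if not all(term_matches(t, words) for t in all_of):
        return False
    return not any(term_matches(t, words) for t in none_of)

# Hypothetical documents, used only for illustration.
docs = ["fly embryo development", "drosophila wing", "adult fly eye"]

# fly OR drosophila
either = [d for d in docs if matches(d, any_of=["fly", "drosophila"])]
# dros* (partial match)
partial = [d for d in docs if matches(d, all_of=["dros*"])]
# fly AND NOT embryo
excluded = [d for d in docs if matches(d, all_of=["fly"], none_of=["embryo"])]
```

The real search engine may tokenize, stem or rank differently; the sketch only shows the set logic behind OR, wildcard and AND NOT queries.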

Search results 2001 to 2100 out of 38750 for *

Category restricted to ProteinDomain

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Ataxin-2, C-terminal
Type: Domain
Description: This entry represents a conserved region approximately 250 residues long located towards the C terminus of eukaryotic ataxin-2. Ataxin-2 is a protein of unknown function, within which expansion of a polyglutamine tract (due to expansion of unstable CAG repeats in the coding region of the SCA2 gene) causes spinocerebellar ataxia type 2 (SCA2), a late-onset neurodegenerative disorder. The expanded polyglutamine repeat in ataxin-2 disrupts the normal morphology of the Golgi complex and increases the incidence of cell death. Ataxin-2 is predicted to consist of mostly non-globular domains.
Protein Domain
Name: Ribonucleotide reductase-like
Type: Homologous_superfamily
Description: The R2 protein of ribonucleotide reductase catalyses the reduction of all four ribonucleotides to deoxyribonucleotides for use in DNA synthesis. This catalysis involves generating and storing a tyrosyl radical, which is essential for ribonucleotide reduction. The crystal structure consists of a core of four helices in a closed bundle with a left-handed twist and one crossover connection, and a bimetal-ion centre in the middle of the bundle. This entry represents proteins that are structurally related to the R2 protein of class I ribonucleotide reductase, including the alpha and beta subunits of methane monooxygenase, and delta 9-stearoyl-acyl carrier protein desaturase.
Protein Domain
Name: Ferritin-like superfamily
Type: Homologous_superfamily
Description: Ferritin is one of the major non-haem iron storage proteins in animals, plants, and microorganisms. It is a multisubunit protein with a hollow interior, which contains a mineral core of hydrated ferric oxide, thereby ensuring its solubility in an aqueous environment. Each subunit consists of a closed, four-helical bundle with a left-handed twist and one crossover connection. This entry represents the ferritin-like superfamily. Proteins with this structure include ferritin and other ferritin-like proteins such as bacterioferritin (cytochrome b1) that binds haem between two subunits, non-haem ferritin, dodecameric ferritin homologue (DPS) that binds to and protects DNA, and the N-terminal domain of rubrerythrin that is found in many air-sensitive bacteria and archaea. In addition, ribonucleotide reductase-like proteins show a similar structure to the ferritin-like fold; these di-iron carboxylate proteins constitute a diverse class of non-haem iron enzymes performing a multitude of redox reactions. The superfamily also includes the alpha and beta subunits of methane monooxygenase hydroxylase, delta 9-stearoyl-acyl carrier protein desaturase and manganese catalase.
Protein Domain
Name: Fatty acid desaturase type 2, conserved site
Type: Conserved_site
Description: Fatty acid desaturases are enzymes that catalyse the insertion of a double bond at the delta position of fatty acids. There seem to be two distinct families of fatty acid desaturases, which do not seem to be evolutionarily related. Family 1 is composed of stearoyl-CoA desaturase (SCD). Family 2 is composed of: bacterial fatty acid desaturases; plant stearoyl-acyl-carrier-protein desaturase, which catalyses the introduction of a double bond at the delta(9) position of stearoyl-ACP to produce oleoyl-ACP and is responsible for the conversion of saturated fatty acids to unsaturated fatty acids in the synthesis of vegetable oils; and cyanobacterial DesA, an enzyme that can introduce a second cis double bond at the delta(12) position of fatty acids bound to membrane glycerolipids. DesA is involved in chilling tolerance, because the phase transition temperature of the lipids of cellular membranes depends on the degree of unsaturation of the fatty acids of the membrane lipids. Members of this entry are endoplasmic reticulum (ER) integral membrane proteins that share the same mushroom-like fold, consisting of four transmembrane helices (TM1-TM4) that anchor them to the membrane, capped by a cytosolic domain containing a unique di-metal (di-iron) catalytic centre coordinated by 9-10 histidine residues. The structure of mouse stearoyl-CoA desaturase (SCD) revealed that TM2 and TM4 are longer than TM1 and TM3 and protrude into the cytosolic domain, providing three of the nine histidine residues that coordinate the two metal ions, while the other histidine residues are provided by the soluble domain in this enzyme. This conserved region is found at the C-terminal part of family 2 enzymes.
Protein Domain
Name: Fatty acid desaturase, type 2
Type: Family
Description: Fatty acid desaturases are enzymes that catalyze the insertion of a double bond at the delta position of fatty acids. There seem to be two distinct families of fatty acid desaturases, which do not seem to be evolutionarily related. Family 1 is composed of stearoyl-CoA desaturase (SCD). Family 2 is composed of: bacterial fatty acid desaturases; plant stearoyl-acyl-carrier-protein desaturase, which catalyzes the introduction of a double bond at the delta(9) position of stearoyl-ACP to produce oleoyl-ACP and is responsible for the conversion of saturated fatty acids to unsaturated fatty acids in the synthesis of vegetable oils; and cyanobacterial DesA, an enzyme that can introduce a second cis double bond at the delta(12) position of fatty acids bound to membrane glycerolipids. DesA is involved in chilling tolerance, because the phase transition temperature of the lipids of cellular membranes depends on the degree of unsaturation of the fatty acids of the membrane lipids. This entry contains fatty acid desaturases belonging to Family 2, also known as Acyl-ACP (Acyl Carrier Protein) desaturases.
Protein Domain
Name: Beta-hydroxydecanoyl thiol ester dehydrase, FabA/FabZ
Type: Family
Description: Fatty acid biosynthesis occurs by two distinct pathways: in fungi, mammals and mycobacteria, type I or associative fatty-acid biosynthesis (type I FAS) is accomplished by multifunctional proteins in which distinct domains catalyse specific reactions; in plants and most bacteria, type II or dissociative fatty-acid biosynthesis (type II FAS) is accomplished by distinct enzymes. Both FabZ and FabA catalyse the dehydration of beta-hydroxyacyl acyl carrier protein (ACP) to trans-2-enoyl ACP. However, FabZ and FabA display subtle differences in substrate specificity: FabA is most effective on acyl ACPs of 9-11 carbon atoms in length, while FabZ is less specific. Unlike FabA, FabZ does not function as an isomerase and cannot initiate unsaturated fatty acid biosynthesis. However, only FabZ can act during the elongation of unsaturated fatty acid chains.
Protein Domain
Name: Chlorophyll A-B binding protein, plant and chromista
Type: Family
Description: The light-harvesting complex (LHC) consists of chlorophylls A and B and the chlorophyll A-B binding protein. LHC functions as a light receptor that captures and delivers excitation energy to photosystems I and II, with which it is closely associated. Under changing light conditions, the reversible phosphorylation of light-harvesting chlorophyll a/b binding proteins (LHCII) represents a system for balancing the excitation energy between the two photosystems. The N terminus of the chlorophyll A-B binding protein extends into the stroma, where it is involved in the adhesion of granal membranes and is photo-regulated by reversible phosphorylation of its threonine residues. Both these processes are believed to mediate the distribution of excitation energy between photosystems I and II. This family also includes the photosystem II protein PsbS, which plays a role in energy-dependent quenching that increases thermal dissipation of excess absorbed light energy in the photosystem. This entry is limited to plant and chromista proteins.
Protein Domain
Name: Tyrosinase copper-binding domain
Type: Domain
Description: Tyrosinase is a copper monooxygenase that catalyzes the hydroxylation of monophenols and the oxidation of o-diphenols to o-quinones. This enzyme, found in prokaryotes as well as in eukaryotes, is involved in the formation of pigments such as melanins and other polyphenolic compounds. Tyrosinase binds two copper ions (CuA and CuB). Each of the two copper ions has been shown to be bound by three conserved histidine residues. The regions around these copper-binding ligands are well conserved and also shared by some hemocyanins, which are copper-containing oxygen carriers from the hemolymph of many molluscs and arthropods. At least two proteins related to tyrosinase are known to exist in mammals: TRP-1 (TYRP1), which is responsible for the conversion of 5,6-dihydroxyindole-2-carboxylic acid (DHICA) to indole-5,6-quinone-2-carboxylic acid; and TRP-2 (TYRP2), which is the melanogenic enzyme DOPAchrome tautomerase that catalyzes the conversion of DOPAchrome to DHICA. TRP-2 differs from tyrosinases and TRP-1 in that it binds two zinc ions instead of copper. Other proteins that belong to this family are plant polyphenol oxidases (PPO), which catalyze the oxidation of mono- and o-diphenols to o-diquinones; and Caenorhabditis elegans hypothetical protein C02C2.1.
Protein Domain
Name: Di-copper centre-containing domain superfamily
Type: Homologous_superfamily
Description: Copper active sites play a major role in biological dioxygen activation. Oxygen intermediates have been studied in detail for the proteins and enzymes involved in reversible O2 binding (hemocyanin), activation (tyrosinase), and four-electron reduction to water (multicopper oxidases). Tyrosinase binds two copper ions (CuA and CuB). Each of the two copper ions has been shown to be bound by three conserved histidine residues. The regions around these copper-binding ligands are well conserved and also shared by some hemocyanins, which are copper-containing oxygen carriers from the hemolymph of many molluscs and arthropods.
Protein Domain
Name: Polyphenol oxidase, C-terminal
Type: Domain
Description: This domain represents the C terminus of polyphenol oxidases. This region is primarily found in eukaryotes, although a few bacterial members also exist. It is typically between 138 and 152 amino acids in length and the family is found in association with and . Many members are plant or plastid polyphenol oxidases, and there is a highly conserved KFDV sequence motif.
Protein Domain
Name: Polyphenol oxidase, central domain
Type: Domain
Description: This domain is found in bacteria and eukaryotes and is approximately 50 amino acids in length. It is found in association with and . Most members are annotated as being polyphenol oxidases, and many are from plants or plastids. There is a conserved DWL sequence motif.
Protein Domain
Name: Polyphenol oxidase
Type: Family
Description: This group represents a polyphenol oxidase, plant type.
Protein Domain
Name: Membrane insertase YidC/Oxa/ALB, C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of YidC/Oxa1/ALB proteins from some species and the full-length protein from other species. Members of this group of proteins are found in bacteria and eukaryotes. YidC is a bacterial membrane protein which is required for the insertion and assembly of inner membrane proteins. The well-characterised YidC protein from Escherichia coli and its close homologues contain a large N-terminal periplasmic domain. COX18 is a mitochondrial membrane insertase required for the translocation of the C terminus of cytochrome c oxidase subunit II (MT-CO2/COX2) across the mitochondrial inner membrane. It plays a role in MT-CO2/COX2 maturation following the COX20-mediated stabilization of newly synthesized MT-CO2/COX2 protein and before the action of the metallochaperones SCO1/2. OXA1 is a mitochondrial inner membrane insertase that mediates the insertion of both mitochondrion-encoded precursors and nuclear-encoded proteins from the matrix into the inner membrane. It links mitoribosomes with the inner membrane. Plant ALBINO3-like proteins are required for the insertion of some light harvesting chlorophyll-binding proteins (LHCP) into the chloroplast thylakoid membrane.
Protein Domain
Name: PIGA, GPI anchor biosynthesis
Type: Domain
Description: This domain is found in phosphatidylinositol N-acetylglucosaminyltransferase proteins. These proteins are involved in GPI anchor biosynthesis and are associated with the disease paroxysmal nocturnal haemoglobinuria.
Protein Domain
Name: Plus-3 domain
Type: Domain
Description: The yeast Paf1 complex consists of Paf1, Rtf1, Cdc73, Ctr9, and Leo1. The complex regulates histone H2B ubiquitination, histone H3 methylation, RNA polymerase II carboxy-terminal domain (CTD) Ser2 phosphorylation, and RNA 3' end processing. The conservation of Paf1 complex function in higher eukaryotes has been confirmed in human cells, Drosophila and Arabidopsis. The Plus3 domain spans the most conserved regions of the Rtf1 protein and is surrounded by regions of low complexity and coiled-coil propensity. It contains only a limited number of highly conserved amino acids, among which are three positively charged residues that gave the Plus3 domain its name. The capacity to bind single-stranded DNA is at least one function of the Plus3 domain. The Plus3 domain is about 90 residues in length and is often found associated with the GYF domain. The Plus3 domain structure consists of six α-helices interspersed with six β-strands in a mixed α/β topology. β-strands 1, 2, 5, and 6 compose a four-stranded antiparallel β-sheet with a β-hairpin insertion formed by strands 3 and 4. The N-terminal helices α1-α3 and C-terminal helix α6 pack together to form an α-subdomain, while the β-strands and the small 3(10) helix α4 form a β-subdomain. The two subdomains pack together to form a compact, globular protein.
Protein Domain
Name: Domain of unknown function DUF382
Type: Domain
Description: This domain is specific to the human splicing factor 3b subunit 2 and its orthologs.
Protein Domain
Name: Peptidase C14, caspase domain
Type: Domain
Description: This domain can be found in caspases (MEROPS family C14A) and metacaspases (MEROPS family C14B). Metacaspases adopt a caspase fold, with active site loops arranged similarly to those of other caspases. Caspases (Cysteine-dependent ASPartyl-specific proteASE) are cysteine peptidases. They are tightly regulated proteins that require zymogen activation to become active, and once active can be regulated by caspase inhibitors. Caspases are mainly involved in mediating cell death (apoptosis). They have two main roles within the apoptosis cascade: as initiators that trigger the cell death process, and as effectors of the process itself. Caspases can have roles other than in apoptosis, such as caspase-1 (interleukin-1 beta convertase), which is involved in the inflammatory process. The activation of apoptosis can sometimes lead to caspase-1 activation, providing a link between apoptosis and inflammation, such as during the targeting of infected cells. Caspases may also be involved in cell differentiation. Metacaspases are arginine/lysine-specific, in contrast to caspases, which are aspartate-specific. They are found only in plants, fungi and lower eukaryotes, including the protozoa. While plant metacaspases have been shown to be involved in cell death pathways, in other organisms they have evolved alternative functions.
Protein Domain
Name: Tapt1 family
Type: Family
Description: This family of membrane proteins is conserved in eukaryotes. It includes Tapt1 (transmembrane anterior posterior transformation 1) and homologues. Analysis of mouse Tapt1 has shown it to be involved in patterning of the vertebrate axial skeleton. Its cellular function is not known, but defective Tapt1 disrupts Golgi morphology and trafficking, and normal primary cilium formation. The homologues in yeast (endoplasmic reticulum membrane protein 65) and Arabidopsis (POD1) seem to be involved in protein folding in the endoplasmic reticulum.
Protein Domain
Name: Phosphatidylserine decarboxylase-related
Type: Family
Description: Phosphatidylserine decarboxylase plays a pivotal role in the synthesis of phospholipid by the mitochondria. The substrate phosphatidylserine is synthesized extramitochondrially and must be translocated to the mitochondria prior to decarboxylation. Phosphatidylserine decarboxylase is responsible for the conversion of phosphatidylserine to phosphatidylethanolamine and plays a central role in the biosynthesis of aminophospholipids. This family also includes L-tryptophan decarboxylase from the mushroom Psilocybe cubensis, which is required for the biosynthesis of the psychotropic agent psilocybin. This enzyme catalyses the first step in the biosynthetic pathway, converting L-tryptophan to tryptamine.
Protein Domain
Name: Ribosomal protein L44e
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian, Trypanosoma brucei and fungal L44, Caenorhabditis elegans rpl-36.A, and Haloarcula marismortui LA.
Protein Domain
Name: Nicastrin
Type: Family
Description: Nicastrin and presenilin are two major components of the gamma-secretase complex, which executes the intramembrane proteolysis of type I integral membrane proteins such as the amyloid precursor protein (APP) and Notch. Nicastrin is synthesised in fibroblasts and neurons as an endoglycosidase-H-sensitive glycosylated precursor protein (immature nicastrin) and is then modified by complex glycosylation in the Golgi apparatus and by sialylation in the trans-Golgi network (mature nicastrin).
Protein Domain
Name: Domain of unknown function DUF3700
Type: Domain
Description: This entry represents a domain found in plant proteins that is approximately 120 amino acids in length. There are two conserved sequence motifs: YGL and LRDR.
Protein Domain
Name: Transcription factor TFIIH subunit p52/Tfb2
Type: Family
Description: This entry represents the p52/Tfb2 subunit in the TFIIH complex, which is not only required for transcription but also plays a central role in DNA repair. The TFIIH multiprotein complex consists of a 7-subunit core (XPB, XPD, p62, p52, p44, p34, and TTDA) that is associated with a 3-subunit CDK-activating kinase module (MAT1, cyclin H and Cdk7). The p52 subunit interacts with subunit XPB, which is an ATP-dependent 3'-5' DNA helicase, and stimulates its ATPase activity.
Protein Domain
Name: Fibronectin type III-like domain
Type: Domain
Description: This domain has a fibronectin type III-like structure [ ]. It is often found in association with and . Its function is unknown.
Protein Domain
Name: LL-diaminopimelate aminotransferase/aminotransferase ALD1
Type: Family
Description: Two lysine biosynthesis pathways evolved separately in organisms, the diaminopimelic acid (DAP) and aminoadipic acid (AAA) pathways. The DAP pathway synthesizes L-lysine from aspartate and pyruvate, and diaminopimelic acid is an intermediate. This pathway is utilised by most bacteria, some archaea, some fungi, some algae, and plants. The AAA pathway synthesizes L-lysine from alpha-ketoglutarate and acetyl coenzyme A (acetyl-CoA), and alpha-aminoadipic acid is an intermediate. This pathway is utilised by most fungi, some algae, the bacterium Thermus thermophilus, and probably some archaea, such as Sulfolobus, Thermoproteus, and Pyrococcus. No organism is known to possess both pathways. There are four known variations of the DAP pathway in bacteria: the succinylase, acetylase, aminotransferase, and dehydrogenase pathways. These pathways share the steps converting L-aspartate to L-2,3,4,5-tetrahydrodipicolinate (THDPA), but the subsequent steps leading to the production of meso-diaminopimelate, the immediate precursor of L-lysine, are different. The succinylase pathway acylates THDPA with succinyl-CoA to generate N-succinyl-LL-2-amino-6-ketopimelate and forms meso-DAP by subsequent transamination, desuccinylation, and epimerization. This pathway is utilised by proteobacteria and many firmicutes and actinobacteria. The acetylase pathway is analogous to the succinylase pathway but uses N-acetyl intermediates. This pathway is limited to certain Bacillus species, in which the corresponding genes have not been identified. The aminotransferase pathway converts THDPA directly to LL-DAP by diaminopimelate aminotransferase (DapL) without acylation. This pathway is shared by cyanobacteria, Chlamydia, the archaeon Methanothermobacter thermautotrophicus, and the plant Arabidopsis thaliana. The dehydrogenase pathway forms meso-DAP directly from THDPA, NADPH, and NH4+ by using diaminopimelate dehydrogenase (Ddh). This pathway is utilised by some Bacillus and Brevibacterium species and Corynebacterium glutamicum. Most bacteria use only one of the four variants, although certain bacteria, such as C. glutamicum and Bacillus macerans, possess both the succinylase and dehydrogenase pathways. This entry includes LL-diaminopimelate aminotransferase DapL from bacteria and aminotransferase ALD1 from plants. DapL is involved in the synthesis of meso-diaminopimelate (m-DAP or DL-DAP), required for both lysine and peptidoglycan biosynthesis. This enzyme catalyzes the direct conversion of tetrahydrodipicolinate to LL-diaminopimelate, a reaction that requires three enzymes in E. coli. It is also able to use meso-diaminopimelate, cystathionine, lysine or ornithine as substrates. ALD1 is involved in the biosynthesis of pipecolate (Pip), a metabolite that orchestrates defense amplification, positive regulation of salicylic acid (SA) biosynthesis, and priming to guarantee effective local resistance induction and the establishment of systemic acquired resistance (SAR).
Protein Domain
Name: Epoxide hydrolase-like
Type: Family
Description: The α/β hydrolase fold is common to a number of hydrolytic enzymes of widely differing phylogenetic origin and catalytic function. The core of each enzyme is an α/β-sheet (rather than a barrel), containing 8 strands connected by helices. The enzymes are believed to have diverged from a common ancestor, preserving the arrangement of the catalytic residues. All have a catalytic triad, the elements of which are borne on loops, which are the best conserved structural features of the fold. The epoxide hydrolases (EH) add water to epoxides, forming the corresponding diol. On the basis of sequence similarity, it has been proposed that the mammalian soluble EHs contain 2 evolutionarily distinct domains: the N-terminal domain is similar to bacterial haloacid dehalogenase, while the C-terminal domain is similar to soluble plant EH, microsomal EH, and bacterial haloalkane dehalogenase (HLD). The mechanism of HLD, established by X-ray crystallographic analysis of an HLD-substrate intermediate, involves nucleophilic attack of Asp-124 on the halogen-substituted terminal carbon of the substrate, forming a covalently bound ester intermediate. The Asp-260/His-289 pair activates a water molecule that hydrolyses the ester intermediate to release the product. The similarity of EH to HLD is important for deducing a catalytic mechanism for EH. Mutagenesis experiments on murine soluble EH confirmed the crucial role of nucleophile Asp-333 and His-523 in the catalytic mechanism and the importance of conserved His-263 and His-332.
Protein Domain
Name: Wings apart-like protein, C-terminal
Type: Domain
Description: This entry represents the conserved region of D. melanogaster wings apart-like protein, WAPL. It is involved in the regulation of heterochromatin structure. hWAPL, the human homologue, is found to play a role in cervical carcinogenesis, and is thought to have similar functions to Drosophila wapl protein. Malfunction of the hWAPL pathway is thought to activate an apoptotic pathway that consequently leads to cell death. The WAPL-like proteins can be found in metazoa, fungi and plants.
Protein Domain
Name: Clathrin light chain
Type: Family
Description: Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (which acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal β-propeller regions extending outwards, while their C-terminal α-α-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the β-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.
This entry represents clathrin light chains, which are more divergent in sequence than the heavy chains. In higher eukaryotes, two genes encode distinct but related light chains, each of which can yield two separate forms via alternative splicing. In yeast there is a single light chain whose sequence is only distantly related to that of higher eukaryotes. Clathrin light chains have a conserved acidic N-terminal domain, a central coiled-coil domain and a conserved C-terminal domain.
Protein Domain
Name: K+/H+ exchanger
Type: Domain
Description: Members of the monovalent Cation:Proton antiporter-2 (CPA2) family act as K+/H+ exchangers that facilitate potassium efflux, possibly by potassium-proton antiport. This family includes the KefB and KefC transporters, which are part of a glutathione-gated K(+) efflux system in Escherichia coli. The activity of the KefB and KefC potassium channels protects E. coli cells against methylglyoxal and limits the amount of DNA damage.
Protein Domain
Name: Regulator of K+ conductance, N-terminal
Type: Domain
Description: The regulator of K+ conductance (RCK) domain is found in many ligand-gated K+ channels, most often attached to the intracellular carboxy terminus. The domain is prevalent among prokaryotic K+ channels, and also found in eukaryotic, high-conductance Ca2+-activated K+ channels (BK channels). Largely involved in redox-linked regulation of potassium channels, the N-terminal part of the RCK domain is predicted to be an active dehydrogenase at least in some cases. Some have a conserved sequence motif (G-x-G-x-x-G-x(n)-[DE]) for NAD+ binding, but others do not, reflecting the diversity of ligands for RCK domains. The C-terminal part is less conserved, being absent in some channels, such as the kefC antiporter from Escherichia coli. It is predicted to bind unidentified ligands and to regulate sulphate, sodium and other transporters. The X-ray structures of several RCK domains have been solved. They reveal an α-β fold similar to dehydrogenase enzymes. The domain forms a homodimer, producing a cleft between two lobes. It has a composite structure, with an N-terminal (RCK-N) and a C-terminal (RCK-C) subdomain. The RCK-N subdomain forms a Rossmann fold with two alpha helices on one side of a six-stranded parallel beta sheet and three alpha helices on the other side. The RCK-C subdomain is an all-β-strand fold. It forms an extension of the dimer interface and further stabilises the RCK homodimer. Ca2+ is a ligand that opens the channel in a concentration-dependent manner. Two Ca2+ ions are located at the base of a cleft between two RCK domains, coordinated by the carboxylate groups of two glutamate residues, and by an aspartate residue. RCK domains occur in at least five different contexts: (1) as a single domain on the C terminus of some K+ channels (for example, many prokaryotic K+ channels); (2) as two tandem RCK domains on the C terminus of some transporters that form gating rings (for example, eukaryotic BK channels), where the gating ring has an arrangement of eight identical RCK domains, one from each of the four pore-forming subunits and four from the intracellular solution; (3) as two domains, one at the N terminus and another at the C terminus of a transporter (for example, the prokaryotic trk system potassium uptake protein A); (4) as a soluble protein (not part of a K+ channel) consisting of two tandem RCK domains; and (5) as a soluble protein consisting of a single RCK domain. This entry represents the N-terminal subdomain of RCK.
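
The NAD+-binding motif quoted in the description above, G-x-G-x-x-G-x(n)-[DE], can be turned into a regular expression for scanning sequences. The sketch below is illustrative only: the description does not state the value of n, so the 15-25 residue gap is an assumption, and the function names and the toy sequence are hypothetical.

```python
import re

def nad_motif_regex(min_gap=15, max_gap=25):
    """Compile G-x-G-x-x-G-x(n)-[DE]; the gap range for n is an assumption."""
    return re.compile(r"G.G..G.{%d,%d}[DE]" % (min_gap, max_gap))

def find_motif(sequence, min_gap=15, max_gap=25):
    """Return (start, matched_text) for the first motif hit, or None."""
    match = nad_motif_regex(min_gap, max_gap).search(sequence.upper())
    return (match.start(), match.group()) if match else None

# Toy sequence (not a real protein): a G-x-G-x-x-G stretch, 18 spacer
# residues, then an acidic residue.
toy = "MK" + "GAGAAG" + "A" * 18 + "D" + "LL"
hit = find_motif(toy)
```

Widening or narrowing `min_gap` and `max_gap` changes how permissive the scan is; a hit only flags a candidate site and does not by itself show that a domain binds NAD+.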
Protein Domain
Name: Tetrahydrofolate dehydrogenase/cyclohydrolase, conserved site
Type: Conserved_site
Description: Enzymes that participate in the transfer of one-carbon units require the coenzyme tetrahydrofolate (THF). Various reactions generate one-carbon derivatives of THF, which can be interconverted between different oxidation states by methylene-THF dehydrogenase, methenyl-THF cyclohydrolase and formyl-THF synthetase. The dehydrogenase and cyclohydrolase activities are expressed by a variety of multifunctional enzymes, including the tri-functional eukaryotic C1-tetrahydrofolate synthase; a bifunctional eukaryotic mitochondrial protein; and the bifunctional Escherichia coli folD protein. Methylene-tetrahydrofolate dehydrogenase and methenyltetrahydrofolate cyclohydrolase share an overlapping active site, and as such are usually located together in proteins, acting in tandem on the carbon-nitrogen bonds of substrates other than peptide bonds.
The sequence of the dehydrogenase/cyclohydrolase domain is highly conserved in all forms of the enzyme. This entry contains two conserved signature patterns: the first is located in the N-terminal part of these enzymes and contains three acidic residues; the second is a highly conserved sequence of 9 amino acids located in the C-terminal section.
Protein Domain
Name: Phosphomannomutase
Type: Family
Description: This enzyme is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions.
Protein Domain
Name: NADH:flavin oxidoreductase/NADH oxidase, N-terminal
Type: Domain
Description: The TIM-barrel fold is a closed barrel structure composed of an eight-fold repeat of β-α units, where the eight parallel β-strands on the inside are covered by the eight α-helices on the outside. It is a widely distributed fold which has been found in many enzyme families that catalyse completely unrelated reactions. The active site is always found at the C-terminal end of this domain.
Proteins in this entry are a variety of NADH:flavin oxidoreductase/NADH oxidase enzymes, found mostly in bacteria or fungi, that contain a TIM-barrel fold. They commonly use FMN/FAD as cofactor and include:
  • dimethylamine dehydrogenase
  • trimethylamine dehydrogenase
  • 12-oxophytodienoate reductase
  • NADPH dehydrogenase
  • NADH oxidase
Protein Domain
Name: Glycosyl transferase CAP10 domain
Type: Domain
Description: The CAP10 domain is found in glycosyltransferases from animals, plants and fungi. Rumi is a Drosophila protein with a CAP10 domain that functions as a protein O-glucosyltransferase. In human and mouse, three potential homologues exist: one with a high degree of identity to Drosophila Rumi (52%), and two others with lower degrees of identity but including a CAP10 domain (KDELC1 and KDELC2). Rumi catalyzes the transfer of glucose and/or xylose from UDP-glucose and UDP-xylose, respectively, to a serine within the consensus sequence (Cys-Xaa-Ser-Xaa-Pro-Cys) in epidermal growth factor repeats, such as those found in coagulation factors F7 and F9 and in NOTCH proteins. Notch signaling is regulated by Notch glucosylation, and glucosylation is required for the correct folding and cleavage of Notch.
CAP10 from Cryptococcus neoformans encodes a xylosyltransferase. This pathogenic fungus, which most commonly affects the central nervous system and causes fatal meningoencephalitis primarily in patients with AIDS, produces a thick extracellular polysaccharide capsule which is well recognised as a virulence factor. The CAP10 domain is required for capsule formation and virulence. The capsule is primarily made of two xylose-containing polysaccharides, glucuronoxylomannan and galactoxylomannan, and the glycosyltransferase transfers xylose to alpha-1,3-dimannoside in a beta-1,2-linkage.
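The consensus Cys-Xaa-Ser-Xaa-Pro-Cys mentioned above lends itself to a short scan for candidate serines. The sketch below is a minimal illustration, assuming a plain one-letter protein string as input; the function name and the toy sequence are hypothetical, and a match marks only a candidate site, not evidence of modification.

```python
import re

# Consensus from the text: Cys-Xaa-Ser-Xaa-Pro-Cys (Xaa = any residue).
# A lookahead is used so that overlapping matches would also be reported.
CONSENSUS = re.compile(r"(?=C.S.PC)")

def o_glucosylation_candidates(seq):
    """Return 1-based positions of the serine in each consensus match."""
    # m.start() is the 0-based index of the first Cys; the Ser is two
    # residues later, hence +2 for the offset and +1 for 1-based numbering.
    return [m.start() + 3 for m in CONSENSUS.finditer(seq.upper())]

# Toy fragment invented for illustration, not a real EGF repeat.
print(o_glucosylation_candidates("DIDECASNPCQNGG"))
```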
Protein Domain
Name: Squalene cyclase
Type: Family
Description: Several enzymes catalyse mechanistically related reactions which involve the highly complex cyclic rearrangement of squalene or its 2,3 oxide. Squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY) are integral membrane proteins that catalyse a cationic cyclisation cascade converting linear triterpenes to fused ring compounds. Lanosterol synthase (oxidosqualene-lanosterol cyclase) catalyses the cyclisation of (S)-2,3-epoxysqualene to lanosterol, the initial precursor of cholesterol, steroid hormones and vitamin D in vertebrates and of ergosterol in fungi (gene ERG7). Cycloartenol synthase (2,3-epoxysqualene-cycloartenol cyclase) is a plant enzyme that catalyses the cyclisation of (S)-2,3-epoxysqualene to cycloartenol. Hopene synthase (squalene-hopene cyclase) is a bacterial enzyme that catalyses the cyclisation of squalene into hopene or diplopterol, a key step in hopanoid (triterpenoid) metabolism; it is also found in Aspergillus fumigatus as part of the gene cluster that mediates the biosynthesis of fumihopaside A. These enzymes are evolutionarily related proteins of about 70 to 85 kDa and have an α6-α6 barrel fold. Deletion of a single glycine residue of Alicyclobacillus acidocaldarius SQCY alters its substrate specificity into that of eukaryotic OSQCY. Both enzymes have a second minor domain, which forms an α-α barrel that is inserted into the major domain.
Protein Domain
Name: PFTB repeat
Type: Repeat
Description: This repeat is found in a number of proteins, including prenyltransferase subunit beta and geranylgeranyl transferase subunit beta.
Protein Domain
Name: Serine/threonine-protein kinase Rio1
Type: Family
Description: This entry represents serine/threonine-protein kinases such as Rio1. RIO kinases are atypical members of the protein kinase family that are required for ribosome biogenesis and cell cycle progression. Rio1 is present in all organisms from archaea to mammals, and was shown to be absolutely essential in Saccharomyces cerevisiae (Baker's yeast) for the processing of 18S ribosomal RNA, as well as for proper cell cycle progression and chromosome maintenance.
Protein Domain
Name: RIO kinase
Type: Domain
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity:
  • Serine/threonine-protein kinases
  • Tyrosine-protein kinases
  • Dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins)
Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
This entry represents RIO kinases, which exhibit little sequence similarity with eukaryotic protein kinases (ePKs) and are classified as atypical protein kinases. The conformation of ATP when bound to the RIO kinases is unique when compared with ePKs, such as serine/threonine kinases or the insulin receptor tyrosine kinase, suggesting that the detailed mechanism by which the catalytic aspartate of RIO kinases participates in phosphoryl transfer may not be identical to that employed in known serine/threonine ePKs.
Representatives of the RIO family are present in organisms varying from Archaea to humans, although the RIO3 proteins have only been identified in multicellular eukaryotes to date. Yeast Rio1 and Rio2 proteins are required for proper cell cycle progression and chromosome maintenance, and are necessary for survival of the cells. These proteins are involved in the processing of 20S pre-rRNA via late 18S rRNA processing.
Protein Domain
Name: RIO kinase, conserved site
Type: Conserved_site
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity:
  • Serine/threonine-protein kinases
  • Tyrosine-protein kinases
  • Dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins)
Protein kinase function is evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.
This entry represents RIO kinases, which exhibit little sequence similarity with eukaryotic protein kinases (ePKs) and are classified as atypical protein kinases. The conformation of ATP when bound to the RIO kinases is unique when compared with ePKs, such as serine/threonine kinases or the insulin receptor tyrosine kinase, suggesting that the detailed mechanism by which the catalytic aspartate of RIO kinases participates in phosphoryl transfer may not be identical to that employed in known serine/threonine ePKs.
Representatives of the RIO family are present in organisms varying from Archaea to humans, although the RIO3 proteins have only been identified in multicellular eukaryotes to date. Yeast Rio1 and Rio2 proteins are required for proper cell cycle progression and chromosome maintenance, and are necessary for survival of the cells. These proteins are involved in the processing of 20S pre-rRNA via late 18S rRNA processing.
Protein Domain
Name: SCAMP
Type: Family
Description: In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences.
Protein Domain
Name: MIF4G-like, type 3
Type: Domain
Description: MIF4G stands for middle domain of eukaryotic initiation factor 4G (eIF4G). eIF4G is a component of the translation initiation factor eIF4F complex and the cytoplasmic cap-binding protein complex (CBC). In the cytoplasm, cap-binding complexes, distinct in their composition from nuclear cap-binding complexes, have important roles in the initiation of mRNA translation.
The MIF4G domain also occurs in other proteins, including CBP80, a component of the nuclear CBC, and NMD2, which is involved in cytoplasmic nonsense-mediated mRNA decay. The domain is rich in α-helices and may contain multiple α-helical repeats. In eIF4G, this domain binds to the two other components of the eIF4F complex, eIF4A and eIF3E, and to RNA and DNA.
Protein Domain
Name: Transposase, MuDR, plant
Type: Domain
Description: The plant MuDR transposase domain is present in plant (and some fungal) proteins that are presumed to be the transposases for Mutator transposable elements. The function of these proteins is unknown.
Protein Domain
Name: Acetamidase/Formamidase
Type: Family
Description: This family includes amidohydrolases of formamide and acetamide. The formamidase from Methylophilus methylotrophus (Bacterium W3A1) forms a homotrimer, suggesting that this may be a common property of other members of this family.
Protein Domain
Name: 5'-3' exoribonuclease
Type: Family
Description: This entry includes 5'-3' exoribonuclease type 1 and type 2. Putative viral exonucleases 059L and 012L, and plant Xrn3 and Xrn4, also belong to this family. 5'-3' exoribonucleases are enzymes that degrade RNA by removing terminal nucleotides from the 5' end. An exosome and a 5'-3' exoribonuclease are important in the degradation of very unstable transcripts.
5'-3' exoribonuclease type 1 (Xrn1, also known as kem1) occurs in animal and fungal lineages. In Saccharomyces cerevisiae, Xrn1 can be activated by Dcs1, a non-essential hydrolase that is involved in mRNA decapping. The activation of Xrn1 by Dcs1 is important for respiration.
5'-3' exoribonuclease type 2 (Xrn2, also known as Rat1) occurs in animal, plant and fungal lineages. In Saccharomyces cerevisiae, Rat1 serves to terminate RNA polymerase II (RNAPII) molecules engaged in the production of uncapped RNA. The concomitant loss of Xrn4 and ABH1/CBP80, a subunit of the mRNA cap binding complex, results in Arabidopsis plants manifesting myriad developmental defects, suggesting that this enzyme is not only important for RNA processing.
Protein Domain
Name: Ribosomal protein L15, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
L15 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L15 is known to bind the 23S rRNA. Ribosomal protein L15 from bacteria and plant chloroplasts (nuclear-encoded) belongs to this family. Vertebrate L27a, Tetrahymena thermophila L29 and fungal L27a (L29, CRP-1, CYH2) are also members of this group.
Ribosomal L18E protein from a number of archaebacteria shows homology to both the eukaryotic L18 and eubacterial ribosomal protein L15, an observation which has been seen to substantiate the belief that archaea represent an evolutionary stage between bacteria and eukaryotes.
This signature covers a conserved region in the C-terminal section of these proteins.
Protein Domain
Name: Ribosomal protein L18e/L15P
Type: Domain
Description: This entry represents both L15 and L18e ribosomal proteins, which share a common structure consisting mainly of parallel β-sheets (β-α-β units) with a core of three turns of irregular (β-β-α)n superhelix. This family includes higher eukaryotic ribosomal 60S L27A, prokaryotic 50S L15, fungal mitochondrial L10, plant L27A, mitochondrial L15, chloroplast L18-3 proteins, 60S L18 from eukaryotes and 50S L18e from Archaea.
Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Protein Domain
Name: Ribosomal protein L15, bacterial-type
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.
Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
This entry represents ribosomal protein L15 and homologues found in bacteria, chloroplasts and mitochondria.
Protein Domain
Name: Malic oxidoreductase
Type: Family
Description: Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate, a reaction important in a number of metabolic pathways; for example, carbon dioxide released from the reaction may be used in sugar production during the Calvin cycle of photosynthesis. There are 3 forms of the enzyme: an NAD-dependent form that decarboxylates oxaloacetate; an NAD-dependent form that does not decarboxylate oxaloacetate; and an NADPH-dependent form. The malic enzyme from Bacillus stearothermophilus is NADPH-dependent, and also catalyses the decarboxylation of oxaloacetate. Other proteins known to be similar to malic enzymes are the Escherichia coli scfA protein; an enzyme from Zea mays (Maize), formerly thought to be cinnamyl-alcohol dehydrogenase; and the hypothetical Saccharomyces cerevisiae protein YKL029c.
Studies on the liver malic enzyme reveal that it can be alkylated by bromopyruvate, resulting in the loss of oxidative decarboxylation and the subsequent enhancement of pyruvate reductase activity. The alkylated form is able to bind NADPH but not L-malate, indicating impaired substrate- or divalent metal ion-binding in the active site. Sequence analysis has highlighted a cysteine residue as the point of alkylation, suggesting that it may play an important role in the activity of the enzyme, although it is absent in the sequences from some species.
There are three well conserved regions in the enzyme sequences. Two of them seem to be involved in the binding of NAD or NADP. The significance of the third one, located in the central part of the enzymes, is not yet known.
Protein Domain
Name: Malic enzyme, NAD-binding
Type: Domain
Description: This entry represents the NAD-binding domain of malic enzymes. Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate, a reaction important in a number of metabolic pathways; for example, carbon dioxide released from the reaction may be used in sugar production during the Calvin cycle of photosynthesis. There are 3 forms of the enzyme: an NAD-dependent form that decarboxylates oxaloacetate; an NAD-dependent form that does not decarboxylate oxaloacetate; and an NADPH-dependent form. Other proteins known to be similar to malic enzymes are the Escherichia coli scfA protein; an enzyme from Zea mays (Maize), formerly thought to be cinnamyl-alcohol dehydrogenase; and the hypothetical Saccharomyces cerevisiae protein YKL029c.
Studies on the duck liver malic enzyme reveal that it can be alkylated by bromopyruvate, resulting in the loss of oxidative decarboxylation and the subsequent enhancement of pyruvate reductase activity. The alkylated form is able to bind NADPH but not L-malate, indicating impaired substrate- or divalent metal ion-binding in the active site. Sequence analysis has highlighted a cysteine residue as the point of alkylation, suggesting that it may play an important role in the activity of the enzyme, although it is absent in the sequences from some species.
Malic enzyme is a tetramer composed of subunits with four domains each.
Protein Domain
Name: Organic solute transporter subunit alpha/Transmembrane protein 184
Type: Family
Description: This entry includes Organic solute transporter subunit alpha (OSTalpha, also known as SLC51A) and Transmembrane protein 184 (TMEM184). OSTalpha is a 7-transmembrane (TM) domain-containing protein that forms a transporter complex with OSTbeta, which is a single-TM domain polypeptide. This heterodimerisation is required for the delivery of the complex to the plasma membrane. The OSTalpha-OSTbeta complex serves as a multispecific transporter that may participate in cellular uptake of bile acids, some endogenous and exogenous steroids, and eicosanoids. It functions via a facilitated diffusion mechanism. Interestingly, this transporter also transports dehydroepiandrosterone sulfate (DHEAS) and pregnenolone sulfate (PREGS), which are major excitatory neurosteroids. This suggests a possible function for the OSTalpha-OSTbeta complex in the brain. In plants this complex may transport brassinosteroid-like compounds and act as a regulator of cell death.
Human TMEM184C is a possible tumour suppressor which may play a role in cell growth. This entry also includes Arabidopsis protein LAZ1, which is required for programmed cell death (PCD) associated with the hypersensitive response (HR).
Protein Domain
Name: Calcium-transporting P-type ATPase, N-terminal autoinhibitory domain
Type: Domain
Description: This entry represents the N-terminal autoinhibitory calmodulin-binding domain characteristic of certain calcium-transporting ATPases. This domain binds calmodulin in a calcium-dependent fashion and has a conserved RRFR sequence motif. There are two completely conserved residues (F and W) that may be functionally important.
Protein Domain
Name: Acyl transferase/acyl hydrolase/lysophospholipase
Type: Homologous_superfamily
Description: This superfamily represents a structural domain with a 3-layer α/β/α topology. This domain can be found in acyl transferases such as bacterial malonyl-CoA ACP transacylase (FabD) and the homologous domain from eukaryotic fatty acid synthase. This domain is also found in lysophospholipases such as cytosolic phospholipase A2 (which has additional structural features), and in patatin proteins, which are plant glycoproteins that act as non-specific lipid acyl hydrolases.
Protein Domain
Name: Patatin-like phospholipase domain
Type: Domain
Description: The patatin glycoprotein is a nonspecific lipid acyl hydrolase that is found in high concentrations in mature potato tubers. Patatin is reported to play a role in plant signaling, to cleave fatty acids from membrane lipids, and to act as a defense against plant parasites. Proteins containing a patatin-like phospholipase (PNPLA) domain are ubiquitously distributed across all life forms, including eukaryotes and prokaryotes, and are observed to participate in a miscellany of biological roles, including sepsis induction, host colonization, triglyceride metabolism, and membrane trafficking. PNPLA domain-containing proteins display lipase and transacylase properties and appear to have major roles in lipid and energy homeostasis.
The ~180-amino acid PNPLA domain harbors the evolutionarily conserved consensus serine lipase motif Gly-X-Ser-X-Gly. It displays an α/β class protein fold with approximately three layers, basically α/β/α in content, in which a central six-stranded β-sheet is sandwiched essentially between α-helices front and back. The central β-sheet contains five parallel strands and an antiparallel strand at the edge of the sheet. The PNPLA domain has a Ser-Asp catalytic dyad. The catalytic Ser resides in a sharp nucleophile elbow turn loop which follows a β-strand (β5) of the central β-sheet and precedes a helix (helix C).
Protein Domain
Name: rRNA adenine dimethylase-like, C-terminal
Type: Homologous_superfamily
Description: KsgA is a universally conserved rRNA adenine dimethyltransferase that catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaebacteria, and in eukaryotic organelles. The KsgA enzymes are homologous to another family of RNA methyltransferases, the Erm enzymes, which methylate a single adenosine base in 23S rRNA.
This superfamily represents an rRNA adenine dimethylase-like domain present in these enzymes, which consists of four α-helices and one 3(10)-helix, forms a cleft with the N-terminal domain, and is thought to stabilise it. Other studies have suggested that, owing to its positively charged surface patch, the C-terminal domain may be involved in recognition and binding of the rRNA substrate, and may mediate substrate specificity.
Protein Domain
Name: XPG-I domain
Type: Domain
Description: This entry represents a domain found in the Xeroderma Pigmentosum Complementation Group G (XPG) protein. XPG is a DNA endonuclease involved in DNA excision repair. The internal XPG (XPG-I) domain contains many cysteine and glutamate amino acid residues that are frequently found in various enzyme active sites of DNA nucleases. The I domain, together with the N-terminal domain, forms the catalytic domain that contains the active site.
Protein Domain
Name: XPG, N-terminal
Type: Domain
Description: Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. Skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are at least seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region. XPG (ERCC-5) belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1; and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]).
It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. This entry represents the N-terminal region of XPG.
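The conserved pentapeptide E-A-[DE]-A-[QS] mentioned above can be searched for with a regular expression. The following is only a minimal sketch: the function name `find_pentapeptide` and the toy sequence are assumptions made for illustration, and the sequence is not a real XPG-family protein.

```python
import re

# The pentapeptide E-A-[DE]-A-[QS] written as a regular expression:
# fixed E, fixed A, then D or E, fixed A, then Q or S.
PENTAPEPTIDE = re.compile(r"EA[DE]A[QS]")

def find_pentapeptide(sequence):
    """Return (1-based start, matched residues) for each non-overlapping hit."""
    return [(m.start() + 1, m.group())
            for m in PENTAPEPTIDE.finditer(sequence.upper())]

# Hypothetical toy sequence, invented for this example only.
toy_sequence = "MKLEAEAQGTTPEADASLL"
hits = find_pentapeptide(toy_sequence)
for start, residues in hits:
    print(start, residues)
```

A hit only shows that the five-residue pattern is present; on its own it does not establish membership of the family, which is defined by the wider N- and I-regions.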
Protein Domain
Name: XPG/Rad2 endonuclease
Type: Family
Description: Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. Skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are at least seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region. XPG (ERCC-5) belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1; and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]).
It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved. Proteins in this family also include yeast Mkt1, which is a post-transcriptional regulator. It contains two domains, XPG-N and XPG-I, which are conserved among a family of nucleases. However, it contains only two of the seven Asp residues involved in Mg2+ binding, suggesting that it has no nuclease activity.
Protein Domain
Name: Cysteine-rich transmembrane CYSTM domain
Type: Domain
Description: Proteins containing the CYSTM domain are short cysteine-rich membrane proteins that most probably dimerise to form a transmembrane sulfhydryl-lined pore. The CYSTM domain is always present at the extreme C-terminus of the protein in which it is found. Furthermore, like the yeast prototypes, the majority of the proteins also possess a proline/glutamine-rich segment upstream of the CYSTM domain that is likely to form a polar, disordered head in the cytoplasm. The presence of an atypical well-conserved acidic residue at the C-terminal end of the TM helix suggests that this might interact with a positively charged moiety in the lipid head group. Consistently across the eukaryotes, the different versions of the CYSTM domain appear to have roles in stress-response or stress-tolerance, and, more specifically, in resistance to deleterious substances, implying that these might be general functions of the whole family. This entry also includes Protein CADMIUM TOLERANCE 1-5 from rice, which confers resistance to heavy metal ions such as cadmium and copper.
Protein Domain
Name: Amino-acid N-acetyltransferase
Type: Family
Description: This entry represents amino-acid N-acetyltransferase or N-acetylglutamate synthase, which is the product of the argA gene and the first enzyme in arginine biosynthesis. This enzyme displays more diversity between bacteria, fungi and mammals than other enzymes in arginine metabolism, and N-acetylglutamate itself can have different roles in different taxonomic groups: in prokaryotes, lower eukaryotes and plants it is the first intermediate in arginine biosynthesis, while in vertebrates it is an allosteric cofactor for the first enzyme in the urea cycle. In bacteria, arginine can regulate ornithine biosynthesis via a feedback inhibition mechanism. This enzyme may also act on aspartate.
Protein Domain
Name: DAHP synthetase, class II
Type: Family
Description: Members of the 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthetase family catalyse the first step in aromatic amino acid biosynthesis via chorismate. Class I includes bacterial and yeast enzymes; class II includes higher plants and various microorganisms. The two classes have minimal sequence identity and different regulatory and oligomeric properties; however, they have structural similarities that indicate a common ancestry for the class I and class II DAHPSs. The first step in the common pathway leading to the biosynthesis of aromatic compounds is the stereospecific condensation of phosphoenolpyruvate (PEP) and D-erythrose-4-phosphate (E4P) giving rise to 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP). This reaction is catalysed by DAHP synthase, a metal-activated enzyme, which in microorganisms is the target for negative-feedback regulation by pathway intermediates or by end products.
Protein Domain
Name: MCM domain
Type: Domain
Description: Proteins shown to be required for the initiation of eukaryotic DNA replication share a highly conserved domain of about 210 amino-acid residues. This domain shows some similarities to those of various other families of DNA-dependent ATPases. Eukaryotes seem to possess a family of eight proteins that contain this domain. They were first identified in yeast, where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called 'minichromosome maintenance proteins', with gene symbols prefixed by MCM. These proteins are: MCM2, also known as cdc19 (in S. pombe). MCM3, also known as DNA polymerase alpha holoenzyme-associated protein P1, RLF beta subunit or ROA. MCM4, also known as CDC54, cdc21 (in S. pombe) or dpa (in Drosophila). MCM5, also known as CDC46 or nda4 (in S. pombe). MCM6, also known as mis5 (in S. pombe). MCM7, also known as CDC47 or Prolifera (in A. thaliana). MCM8, also known as REC (in Drosophila). MCM9. These proteins are evolutionarily related and belong to the AAA+ superfamily. They contain the Mcm family domain, which includes motifs that are required for ATP hydrolysis (such as the Walker A and B, and R-finger motifs). Mcm2-7 forms a hexameric complex, which is the replicative helicase involved in replication initiation and elongation, whereas Mcm8 and Mcm9 form a separate complex, conserved among many eukaryotes except yeast and C. elegans. The Mcm8/9 complex plays a role during replication elongation or recombination, being involved in the repair of double-stranded DNA breaks and DNA interstrand cross-links by homologous recombination. Drosophila is the only organism that has MCM8 without MCM9; there, MCM8 is involved in meiotic recombination.
Protein Domain
Name: Importin-beta, N-terminal domain
Type: Domain
Description: This entry represents the N-terminal domain of importin-beta (also known as karyopherin-beta) that is important for the binding of the Ran GTPase protein. Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport. Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. As a result, the N-terminal auto-inhibitory region on importin-alpha is free to loop back and bind to the major NLS-binding site, causing the cargo to be released. There are additional release factors as well.
Protein Domain
Name: Methyltransferase TRM13
Type: Domain
Description: This entry consists of eukaryotic and bacterial proteins that specifically methylate guanosine-4 in various tRNAs with Gly(CCG), His or Pro signatures. The alignment contains some conserved cysteines and histidines that might form a zinc binding site.
Protein Domain
Name: TRM13/UPF0224 family, U11-48K-like CHHC zinc finger domain
Type: Domain
Description: This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern and is known as the CHHC U11-48K-type zinc finger. The CHHC U11-48K-type zinc finger is found only in eukaryotes. It has been identified in spliceosomal U11-48K proteins, tRNA methyltransferases TRM13 and gametocyte specific factors (GTSF). The CHHC U11-48K-type zinc finger (~30 residues) is present in a single copy in U11-48K and TRM13, whereas GTSF contains two repeats separated by a short linker. The CHHC U11-48K-type zinc finger may function as an RNA recognition and binding module. The CHHC U11-48K-type zinc finger contains four invariant Cys and His residues with a consensus sequence of C-x(5,7)-H-x(7,9)-H-x(3,4)-C. It stoichiometrically binds zinc ions in a one-to-one ratio. The structure of the CHHC U11-48K-type zinc finger consists of a beta hairpin followed by a helix. The zinc ion is coordinated by Cys and His residues from the beta hairpin and by His and Cys residues from the helix, respectively. The helical region has an α-helical conformation that is interrupted in the middle by a single pi-helical turn.
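The consensus C-x(5,7)-H-x(7,9)-H-x(3,4)-C is written in PROSITE-style notation, where x(m,n) stands for between m and n residues of any kind. As a rough sketch, the pattern can be translated into a regular expression and used to scan a sequence. The helper `prosite_to_regex` is hypothetical and handles only the few pattern elements needed here, and the test sequence is synthetic, not a real U11-48K, TRM13 or GTSF protein.

```python
import re

def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern into a Python regex string.

    Supports only literal residues, x (any residue), [..] residue sets,
    and (n) or (m,n) repeat counts. Other PROSITE syntax is rejected.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = m.groups()
        token = "." if residue == "x" else residue
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)

CHHC_CONSENSUS = "C-x(5,7)-H-x(7,9)-H-x(3,4)-C"
chhc_regex = prosite_to_regex(CHHC_CONSENSUS)

# Synthetic sequence with spacers of 6, 8 and 3 residues (invented for this example).
synthetic = "AA" + "C" + "G" * 6 + "H" + "S" * 8 + "H" + "T" * 3 + "C" + "KK"
match = re.search(chhc_regex, synthetic)
print(chhc_regex)
print(match.start() + 1, match.group())
```

A regex match only indicates that the spacing of the four Cys and His residues fits the consensus; it says nothing about zinc binding or fold.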
Protein Domain
Name: Zinc finger, CCCH-type, TRM13
Type: Domain
Description: This domain is found at the N terminus of TRM13 methyltransferase proteins. It is presumed to be a zinc binding domain.
Protein Domain
Name: Uncharacterised conserved protein UCP031279
Type: Family
Description: There is currently no experimental data for members of this group or their homologues, nor do they exhibit features indicative of any function. Members of this entry are mainly found in proteobacteria.
Protein Domain
Name: Peptidase S9, serine active site
Type: Active_site
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This signature defines the active site of the serine peptidases belonging to MEROPS peptidase family S9 (prolyl oligopeptidase family, clan SC). The protein fold of the peptidase domain for members of this family resembles that of serine carboxypeptidase D, the type example of clan SC. Examples of protein families containing this active site are: Prolyl endopeptidase (PE) (also called post-proline cleaving enzyme). PE is an enzyme that cleaves peptide bonds on the C-terminal side of prolyl residues.
The sequence of PE has been obtained from Sus scrofa (Pig) and from bacteria (Flavobacterium meningosepticum and Aeromonas hydrophila); there is a high degree of sequence conservation between these sequences. Escherichia coli protease II (oligopeptidase B) (gene prtB), which cleaves peptide bonds on the C-terminal side of lysyl and argininyl residues. Dipeptidyl peptidase IV (DPP IV). DPP IV is an enzyme that removes N-terminal dipeptides sequentially from polypeptides having unsubstituted N-termini, provided that the penultimate residue is proline. Saccharomyces cerevisiae (Baker's yeast) vacuolar dipeptidyl aminopeptidase A (DPAP A) (gene: STE13), which is responsible for the proteolytic maturation of the alpha-factor precursor. Yeast vacuolar dipeptidyl aminopeptidase B (DPAP B) (gene: DAP2). Acylamino-acid-releasing enzyme (acyl-peptide hydrolase). This enzyme catalyses the hydrolysis of the amino-terminal peptide bond of an N-acetylated protein to generate an N-acetylated amino acid and a protein with a free amino-terminus. This signature contains the conserved serine residue that has been experimentally shown (in E. coli protease II as well as in pig and bacterial PE) to be necessary for the catalytic mechanism. This serine, which is part of the catalytic triad (Ser, His, Asp), is generally located about 150 residues away from the C-terminal extremity of these enzymes (which are all proteins that contain about 700 to 800 amino acids).
Protein Domain
Name: Uncharacterised conserved protein UCP031088, alpha/beta hydrolase, At1g15070
Type: Family
Description: This group represents a predicted alpha/beta hydrolase, At1g15070 type, from plants.
Protein Domain
Name: Protein of unknown function DUF1499
Type: Family
Description: This family consists of several hypothetical bacterial and plant proteins of around 125 residues in length. The function of this family is unknown.
Protein Domain
Name: 4-hydroxy-tetrahydrodipicolinate synthase, DapA
Type: Family
Description: 4-hydroxy-tetrahydrodipicolinate synthase dapA is a homotetrameric enzyme of lysine biosynthesis. It catalyses the condensation of (S)-aspartate-beta-semialdehyde [(S)-ASA] and pyruvate to 4-hydroxy-tetrahydrodipicolinate (HTPA). E. coli has several paralogs closely related to dihydrodipicolinate synthase, as well as the more distant N-acetylneuraminate lyase. It is worth noting that despite the real product of this enzyme being 4-hydroxy-2,3,4,5-tetrahydro-L,L-dipicolinic acid, it is still known in most publications as dihydrodipicolinate synthase (DHDPS). The sequences of dapA from different sources are well-conserved. The structure takes the form of a homotetramer, in which 2 monomers are related by an approximate 2-fold symmetry. Each monomer comprises 2 domains: an 8-fold α/β-barrel, and a C-terminal α-helical domain. The fold resembles that of N-acetylneuraminate lyase. The active site lysine is located in the barrel domain, and has access via 2 channels on the C-terminal side of the barrel.
Protein Domain
Name: DapA-like
Type: Family
Description: Dihydrodipicolinate synthase (DHDPS, DapA) catalyses, in higher plants, some fungi and bacteria (gene dapA), the first reaction specific to the biosynthesis of lysine and of diaminopimelate. DHDPS is responsible for the condensation of aspartate semialdehyde and pyruvate by a ping-pong mechanism in which pyruvate first binds to the enzyme by forming a Schiff-base with a lysine residue. Other proteins are structurally related to DHDPS and probably also act via a similar catalytic mechanism: Escherichia coli N-acetylneuraminate lyase (EC 4.1.3.3) (gene nanA), which catalyses the condensation of N-acetyl-D-mannosamine and pyruvate to form N-acetylneuraminate. Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase. D-4-deoxy-5-oxoglucarate dehydratase. Rhizobium meliloti protein mosA, which is involved in the biosynthesis of the rhizopine 3-o-methyl-scyllo-inosamine. Thermoproteus tenax 2-dehydro-3-deoxy-D-gluconate/2-dehydro-3-deoxy-phosphogluconate aldolase (KdgA).
Protein Domain
Name: Schiff base-forming aldolase, active site
Type: Active_site
Description: This entry represents an active site found in members of a structural superfamily of Schiff base-forming aldolases that catalyse reactions in different biological pathways. This superfamily includes members such as dihydrodipicolinate synthase (DHDPS), N-acetylneuraminate lyase (NAL) and 2-keto-3-deoxygluconate aldolase (KDG aldolase). One of the Escherichia coli proteins containing this site, dapA, was first identified as dihydrodipicolinate synthase (DHDPS). It was later identified as a 4-hydroxy-tetrahydrodipicolinate synthase. Among the putative DHDPS genes annotated in the A. tumefaciens genome, the dapA7 gene product has been shown to possess DHDPS enzyme activity and is allosterically inhibited by lysine.
Protein Domain
Name: Schiff base-forming aldolase, conserved site
Type: Conserved_site
Description: This entry represents a conserved site found in members of a structural superfamily of Schiff base-forming aldolases that catalyse reactions in different biological pathways. This superfamily includes members such as dihydrodipicolinate synthase (DHDPS), N-acetylneuraminate lyase (NAL) and 2-keto-3-deoxygluconate aldolase (KDG aldolase). One of the Escherichia coli proteins containing this site, dapA, was first identified as dihydrodipicolinate synthase (DHDPS). It was later identified as a 4-hydroxy-tetrahydrodipicolinate synthase. Among the putative DHDPS genes annotated in the A. tumefaciens genome, the dapA7 gene product has been shown to possess DHDPS enzyme activity and is allosterically inhibited by lysine.
Protein Domain
Name: Rubber elongation factor
Type: Family
Description: This family consists of the highly related rubber elongation factor (REF), small rubber particle protein (SRPP) and stress-related protein (SRP) sequences. REF and SRPP are released from the rubber particle membrane into the cytosol during osmotic lysis of the sedimentable organelles (lutoids). The exact function of this family is unknown.
Protein Domain
Name: F-box associated domain, type 3
Type: Domain
Description: This domain occurs in a diverse superfamily of genes in plants. Most examples are found C-terminal to an F-box, a 60 amino acid motif involved in ubiquitination of target proteins to mark them for degradation. Two-hybrid experiments support the idea that most members are interchangeable F-box subunits of SCF E3 complexes. Some members have two copies of this domain.
Protein Domain
Name: Sec7 domain
Type: Domain
Description: Proteins containing this domain are highly divergent in their overall sequence; however, they share a common region of roughly 200 amino acids known as the SEC7 domain [cite27373159]. The 3D structure of the domain displays several α-helices. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian guanine-nucleotide-exchange factors. SEC7 domain containing proteins are guanine nucleotide exchange factors (GEFs) specific for the ADP-ribosylation factors (ARFs), Ras-like GTPases that are important for vesicular protein trafficking. These proteins can be divided into five families, based on domain organisation and conservation of primary amino acid sequence: GBF/BIG, cytohesins, EFA6, BRAGs, and F-box. They are found in all eukaryotes, and are involved in membrane remodeling processes throughout the cell.
Protein Domain
Name: Mon2/Sec7/BIG1-like, HDS
Type: Domain
Description: This entry represents an HDS (homology downstream of Sec7) domain found towards the C terminus of guanine nucleotide exchange factors involved in Golgi transport, such as budding yeast protein Sec7, protein Mon2 and BIG1-like proteins. Sec7 is involved in the secretory pathway as a protein binding scaffold for the COPII-COPI protein switch for maturation of the VTC intermediate compartments for Golgi compartment biogenesis. Sec7 has four conserved domains, HDS1-4, which act to integrate the signals from several small GTPases, including Arf1 itself, to switch Sec7 from a strongly autoinhibited to a strongly autoactivated form.
Protein Domain
Name: Sec7, C-terminal domain superfamily
Type: Homologous_superfamily
Description: The Sec7 domain was named after the first protein found to contain such a region. It has been shown to be linked with guanine nucleotide exchange function. The 3D structure of the domain displays several α-helices. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian factors. This superfamily represents the alpha orthogonal structural domain which is found at the C terminus of the Sec7 domain.
Protein Domain
Name: CDC48, domain 2
Type: Domain
Description: The CDC48 N-terminal domain is a protein domain found in AAA ATPases including cell division protein 48 (CDC48), VCP-like ATPase (VAT) and N-ethylmaleimide sensitive fusion protein. It is a substrate recognition domain which binds polypeptides, prevents protein aggregation, and catalyses refolding of permissive substrates. It is composed of two equally sized subdomains. The amino-terminal subdomain forms a double-psi β-barrel whose pseudo-twofold symmetry is mirrored by an internal sequence repeat of 42 residues. The carboxy-terminal subdomain forms a novel six-stranded β-clamp fold. Together these subdomains form a kidney-shaped structure. This entry represents the carboxy-terminal subdomain.
Protein Domain
Name: TSC-22 / Dip / Bun
Type: Family
Description: Several eukaryotic proteins are evolutionarily related and are thought to be involved in transcriptional regulation. These proteins are highly similar in a region of about 50 residues that includes a conserved leucine-zipper domain most probably involved in homo- or hetero-dimerisation. Proteins containing this signature include: Vertebrate protein TSC-22, a transcriptional regulator which seems to act on the C-type natriuretic peptide (CNP) promoter. Mammalian protein DIP (DSIP-immunoreactive peptide), a protein whose function is not yet known. Drosophila protein bunched (gene bun) (also known as shortsighted), a probable transcription factor required for peripheral nervous system morphogenesis, eye development and oogenesis. Caenorhabditis elegans hypothetical protein T18D3.7.
Protein Domain
Name: Casparian strip membrane protein
Type: Family
Description: This family consists of CASP and CASP-like proteins. In vascular plants, Casparian strips span the cell wall of adjacent endodermal cells to form a tight junction that blocks extracellular diffusion. Casparian Strip Membrane Domain Proteins (CASPs) are four-membrane-span proteins that recruit the lignin polymerisation machinery necessary for the deposition of Casparian strips in the endodermis. CASP-like proteins (CASPLs) are found in all major divisions of land plants as well as in green algae. In Arabidopsis, CASPLs show specific expression in a variety of cell types.
Protein Domain
Name: Phosphatidic acid phosphatase type 2/haloperoxidase
Type: Domain
Description: This entry represents type 2 phosphatidic acid phosphatase (PAP2) enzymes, such as phosphatidylglycerophosphatase B from Escherichia coli. PAP2 enzymes have a core structure consisting of a 5-helical bundle, where the beginning of the third helix binds the cofactor. PAP2 enzymes catalyse the dephosphorylation of phosphatidate, yielding diacylglycerol and inorganic phosphate. In eukaryotic cells, PAP activity has a central role in the synthesis of phospholipids and triacylglycerol through its product diacylglycerol, and it also generates and/or degrades lipid-signalling molecules that are related to phosphatidate. Other related enzymes have a similar core structure, including haloperoxidases such as bromoperoxidase (contains one core bundle, but forms a dimer), chloroperoxidases (contains two core bundles arranged as in other family dimers), bacitracin transport permease from Bacillus licheniformis, and glucose-6-phosphatase from rat. The vanadium-dependent haloperoxidases exclusively catalyse the oxidation of halides, and act as histidine phosphatases, using histidine for the nucleophilic attack in the first step of the reaction. Amino acid residues involved in binding phosphate/vanadate are conserved between the two families, supporting a proposal that vanadium passes through a tetrahedral intermediate during the reaction mechanism.
Protein Domain
Name: Hydantoinase/dihydropyrimidinase
Type: Family
Description: Dihydropyrimidinase (DHPase) catalyses the second step of the reductive pyrimidine degradation, the reversible hydrolytic ring opening of dihydropyrimidines. It primarily converts 5,6-dihydrouracil to N-carbamyl-beta-alanine (also called 3-ureidopropanoate) but also acts on dihydrothymine and hydantoin. The enzyme is a metalloenzyme. This entry represents the hydantoinase/dihydropyrimidinase family, which also includes D-phenylhydantoinases. This enzyme catalyses the stereospecific hydrolysis of the cyclic amide bond of D-hydantoin derivatives with an aromatic side chain at the 5'-position, and has no activity on dihydropyrimidines. Dihydropyrimidinase-related proteins (collapsin response mediator proteins, CRMPs) share sequence similarity with liver DHPase. Although purified CRMP does not hydrolyse DHPase substrates, it is likely that a related activity accounts for its participation in neuronal growth cone signaling. CRMP3 has histone H4 deacetylase activity.
Protein Domain
Name: CyclinH/Ccl1
Type: Family
Description: Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). This entry includes cyclin-H from vertebrates, mcs2 from fission yeast and Ccl1 from budding yeasts. They are cyclins that play a role in the cell cycle. They are also subunits forming the core-TFIIH basal transcription factor. Human cyclin-H regulates CDK7, the catalytic subunit of the CDK-activating kinase (CAK) enzymatic complex. mcs2 possesses kinase activity that can be detected when myelin basic protein (MBP) is provided as an exogenous substrate. Ccl1 is a regulatory component of the TFIIK complex, which is the protein kinase component of transcription factor IIH (TFIIH). TFIIH is essential for both basal and activated transcription and is involved in nucleotide excision repair (NER) of damaged DNA and in polymerase II transcription.
Protein Domain
Name: Solute carrier family 13
Type: Family
Description: Integral membrane proteins that mediate the intake of a wide variety of molecules with the concomitant uptake of sodium ions (sodium symporters) can be grouped, on the basis of sequence and functional similarities, into a number of distinct families. The SLC13 family consists mainly of dicarboxylate and sulfate transporters, and includes the following proteins: Mammalian sodium/sulphate cotransporter. Mammalian renal sodium/dicarboxylate cotransporter, which transports succinate and citrate. Mammalian sodium/citrate cotransporter, which mediates the entry into cells of citrate, a critical participant in biochemical pathways. Mammalian intestinal sodium/dicarboxylate cotransporter. Chlamydomonas reinhardtii putative sulphur deprivation response regulator SAC1. These transporters are proteins of 430 to 620 amino acids which are highly hydrophobic and which probably contain about 12 transmembrane regions.
Protein Domain
Name: Enolase
Type: Family
Description: Enolase (2-phospho-D-glycerate hydrolase) is an essential, homodimeric enzyme that catalyses the reversible dehydration of 2-phospho-D-glycerate to phosphoenolpyruvate as part of the glycolytic and gluconeogenesis pathways. The reaction is facilitated by the presence of metal ions. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase. Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis, and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
Protein Domain
Name: Enolase, C-terminal TIM barrel domain
Type: Domain
Description: Enolase (2-phospho-D-glycerate hydrolase) is an essential, homodimeric enzyme that catalyses the reversible dehydration of 2-phospho-D-glycerate to phosphoenolpyruvate as part of the glycolytic and gluconeogenesis pathways. The reaction is facilitated by the presence of metal ions. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase. Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis, and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.
Protein Domain
Name: Amino acid permease, conserved site
Type: Conserved_site
Description: Amino acid permeases are integral membrane proteins involved in the transport of amino acids into the cell. A number of such proteins have been found to be evolutionarily related. These proteins seem to contain up to 12 transmembrane segments. The best conserved region in this family is located in the second transmembrane segment.
Protein Domain
Name: Auxin efflux carrier, plant type
Type: Family
Description: This entry is mostly composed of known or predicted PIN proteins from plants, though some homologous prokaryotic proteins are also included. The PIN proteins are components of auxin efflux systems from plants. These carriers are saturable, auxin-specific, and localized to the basal ends of auxin transport-competent cells. Plants typically possess several of these proteins, each displaying a unique tissue-specific expression pattern. They are expressed in almost all plant tissues including vascular tissues and roots, and influence many processes including the establishment of embryonic polarity, plant growth, apical hook formation in seedlings and the photo- and gravitropic responses. These plant proteins are typically 600-700 amino acyl residues long and exhibit 8-12 transmembrane segments.
Protein Domain
Name: Vps4 oligomerisation, C-terminal
Type: Domain
Description: This domain is found at the C terminus of ATPase proteins involved in vacuolar sorting. It forms an alpha-helical structure and is required for oligomerisation.
Protein Domain
Name: Glutamyl/glutaminyl-tRNA synthetase
Type: Family
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine, and some lysine synthetases (non-eukaryotic group), belong to class I. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine and threonine, and some lysine synthetases (non-archaeal group), belong to class II. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. Glutamate-tRNA ligase (also known as glutamyl-tRNA synthetase) is a class Ic ligase and shows several similarities with glutamine-tRNA ligase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamate-tRNA ligase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical for class I ligases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.
Protein Domain
Name: Glutamine-tRNA ligase, alpha-bundle domain superfamily
Type: Homologous_superfamily
Description: The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine, and some lysine synthetases (non-eukaryotic group), belong to class I. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine and threonine, and some lysine synthetases (non-archaeal group), belong to class II. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. Glutamate-tRNA ligase (also known as glutamyl-tRNA synthetase) is a class Ic ligase and shows several similarities with glutamine-tRNA ligase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamate-tRNA ligase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical for class I ligases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure. This superfamily represents the C-terminal end of the glutamine-tRNA ligase catalytic domain. It forms an α-bundle domain.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom