Search results 1701 to 1800 out of 38750 for *

Category restricted to ProteinDomain (x)

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Sec1-like, domain 2
Type: Homologous_superfamily
Description: Sec1-like molecules have been implicated in a variety of eukaryotic vesicle transport processes, including neurotransmitter release by exocytosis. They regulate vesicle transport by binding to a t-SNARE from the syntaxin family. This binding is thought to prevent formation of the SNARE complex, a protein complex required for membrane fusion. Whereas Sec1 molecules are essential for neurotransmitter release and other secretory events, their interaction with syntaxin molecules seems to represent a negative regulatory step in secretion. The nSec1 polypeptide chain can be divided into three domains. The first domain consists of a five-stranded parallel β-sheet flanked by five α-helices. The second domain, like the first, has an α-β-α fold; however, the β-sheet of domain 2 features five parallel strands with an additional antiparallel strand on one edge. The third domain is a large insertion between the third and fourth parallel strands of domain 2, and can be subdivided in two. This entry represents domain 2 from the Sec1 family, which includes Sec1, Sly1, Slp1/Vps33, yeast Vps45/Stt10, Unc-18 from nematodes, Munc-18b/muSec1, Munc-18c from mouse, Rop from Drosophila, Munc-18/n-Sec1/rbSec1A and rbSec1B from rat.
Protein Domain
Name: Sec1-like protein
Type: Family
Description: Sec1-like molecules have been implicated in a variety of eukaryotic vesicle transport processes, including neurotransmitter release by exocytosis. They regulate vesicle transport by binding to a t-SNARE from the syntaxin family. This binding is thought to prevent formation of the SNARE complex, a protein complex required for membrane fusion. Whereas Sec1 molecules are essential for neurotransmitter release and other secretory events, their interaction with syntaxin molecules seems to represent a negative regulatory step in secretion.
Protein Domain
Name: Gibberellin regulated protein
Type: Family
Description: This is the GASA family of gibberellin-regulated cysteine-rich proteins (GRPs). Expression of these proteins is up-regulated by the plant hormone gibberellin; most of them have a role in plant development, and some members have antimicrobial activity. Twelve cysteine residues are conserved within the alignment, giving these proteins the potential to form six disulphide bonds. The family includes some GRPs found in fruits and pollens that have been identified as allergens, including peach Pru p 7, Japanese apricot Pru m 7, orange Cit s 7, pomegranate Pun g 7, and cypress pollen GRP.
Protein Domain
Name: Cupin 1
Type: Domain
Description: This entry represents the conserved β-barrel fold of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. This domain can also be found as a central component of many microbial proteins, including certain types of phosphomannose isomerase, polyketide synthase, epimerase, and dioxygenase.
Protein Domain
Name: 11-S seed storage protein, plant
Type: Family
Description: Plant seed storage proteins, whose principal function appears to be to provide the major nitrogen source for the developing plant, can be classified, on the basis of their structure, into different families. 11S-type globulins are non-glycosylated proteins which form hexameric structures. Each of the subunits in the hexamer is itself composed of an acidic and a basic chain derived from a single precursor and linked by a disulphide bond. This structure is shown in the following representation.

           +-------------------------+
           |                         |
xxxxxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxNGxCxxxxxxxxxxxxxxxxxxxxxxx
|------Acidic-subunit-------------||-----Basic-subunit------|
|-----------------About-480-to-500-residues-----------------|

'C': conserved cysteine involved in a disulphide bond.

Members of the 11-S family include pea and broad bean legumins, oil seed rape cruciferin, rice glutelins, cotton beta-globulins, soybean glycinins, pumpkin 11-S globulin, oat globulin, sunflower helianthinin G3, etc. This family represents the precursor protein, which is cleaved into the two chains. These proteins contain two β-barrel domains. This family is a member of the 'cupin' superfamily on the basis of its conserved barrel domain ('cupa' is the Latin term for a small barrel).
Protein Domain
Name: CALMODULIN-BINDING PROTEIN60
Type: Family
Description: Members of the CALMODULIN-BINDING PROTEIN60 (CBP60) family from plants are known to be involved in both biotic and abiotic stress responses. Some members are known to be involved in the induction of plant defence responses. In Arabidopsis, the CBP60 family has eight members, including CBP60g and SARD1, which encode positive regulators of plant immunity that promote production of salicylic acid (SA) and affect expression of SA-dependent and SA-independent defence genes.
Protein Domain
Name: Ribosomal protein L29/L35
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to the aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues which, on the basis of sequence similarities, groups red algal L29, bacterial L29, mammalian L35, Caenorhabditis elegans L35 (ZK652.4) and yeast L35. L29 is located on the surface of the large ribosomal subunit, where it participates in forming a protein ring that surrounds the polypeptide exit channel, providing structural support for the ribosome. L29 is involved in forming the translocon binding site, along with L19, L22, L23, L24, and L31e. In addition, L29 and L23 form the interaction site for trigger factor (TF) on the ribosomal surface, adjacent to the exit tunnel. L29 forms numerous interactions with L23 and with the 23S rRNA. This family includes eubacterial and archaeal L29 and eukaryotic L35 ribosomal proteins, which constitute the uL29 family.
Protein Domain
Name: Helix-hairpin-helix, base-excision DNA repair, C-terminal
Type: Homologous_superfamily
Description: This entry represents the extreme C terminus of the helix-hairpin-helix base-excision DNA repair enzyme family, including the DNA glycosylases and lyases.
Protein Domain
Name: rRNA small subunit methyltransferase G
Type: Family
Description: This entry represents rRNA small subunit methyltransferase G, previously identified as glucose-inhibited division protein B, which appears to be present in a single copy in all complete eubacterial genomes sequenced so far. It specifically methylates the N7 position of a guanosine in 16S rRNA.
Protein Domain
Name: Magnesium chelatase ChlI-like, catalytic domain
Type: Domain
Description: This domain, found in the magnesium chelatase ChlI subunit, is the catalytic domain that binds and hydrolyses ATP. It contains the nucleotide-binding Walker motif and can also be found in the competence protein ComM from Haemophilus influenzae and in Lon proteases from archaea. Magnesium chelatase is a three-component enzyme that catalyses the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. As a result, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weights between 38 and 42 kDa.
Protein Domain
Name: Magnesium chelatase, ATPase subunit I
Type: Family
Description: This entry represents one of two ATPase subunits of the trimeric magnesium chelatase responsible for insertion of magnesium ion into protoporphyrin IX. This is an essential step in the biosynthesis of both chlorophyll and bacteriochlorophyll. This subunit is found in green plants, photosynthetic algae, cyanobacteria and other photosynthetic bacteria. It is also found in certain archaea not known to make chlorins.
Protein Domain
Name: Conserved hypothetical protein CHP02058
Type: Family
Description: This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a perfectly conserved motif GxGxDxHG near the N terminus.
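The conserved motif quoted above, GxGxDxHG, can be searched for with a regular expression in which each x becomes a single-character wildcard. The following is a minimal sketch, not the method used to build the entry; the function name and the example sequence are hypothetical and chosen only for illustration.

```python
import re

# GxGxDxHG from the entry text, with x = any residue.
# The lookahead lets overlapping occurrences be reported too.
CHP02058_MOTIF = re.compile(r"(?=G.G.D.HG)")


def find_motif(sequence):
    """Return the 0-based start position of every motif occurrence."""
    return [m.start() for m in CHP02058_MOTIF.finditer(sequence.upper())]


# Hypothetical, made-up sequence fragment (not a real family member).
example = "MKTGAGLDSHGLLA"
print(find_motif(example))
```

A hit only shows that the eight-residue motif is present; family membership would still need a full model-based comparison.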
Protein Domain
Name: Ribosomal RNA-processing protein 7, C-terminal domain
Type: Domain
Description: Ribosomal RNA-processing protein 7 (RRP7) is an essential protein in yeast that is involved in pre-rRNA processing and ribosome assembly. It is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. This entry represents the C-terminal domain of RRP7.
Protein Domain
Name: Chalcone/stilbene synthase, N-terminal
Type: Domain
Description: Chalcone synthases (CHS) and stilbene synthases (STS) (formerly known as resveratrol synthases) are related plant enzymes, members of the plant polyketide synthase superfamily. CHS is an important enzyme in flavonoid biosynthesis and STS is a key enzyme in stilbene-type phytoalexin biosynthesis. Both enzymes catalyse the addition of three molecules of malonyl-CoA to a starter CoA ester (a typical example is 4-coumaroyl-CoA), producing either a chalcone (with CHS) or stilbene (with STS). These enzymes have a conserved cysteine residue, located in the central section of the protein sequence, which is essential for the catalytic activity of both enzymes and probably represents the binding site for the 4-coumaroyl-CoA group. This entry represents the N-terminal domain of chalcone and stilbene synthases and related proteins.
Protein Domain
Name: Chalcone/stilbene synthase, C-terminal
Type: Domain
Description: Chalcone synthases (CHS) and stilbene synthases (STS) (formerly known as resveratrol synthases) are related plant enzymes. CHS is an important enzyme in flavonoid biosynthesis and STS is a key enzyme in stilbene-type phytoalexin biosynthesis. Both enzymes catalyse the addition of three molecules of malonyl-CoA to a starter CoA ester (a typical example is 4-coumaroyl-CoA), producing either a chalcone (with CHS) or stilbene (with STS). These enzymes have a conserved cysteine residue, located in the central section of the protein sequence, which is essential for the catalytic activity of both enzymes and probably represents the binding site for the 4-coumaroyl-CoA group. This domain of chalcone synthase is reported to be structurally similar to domains in thiolase and beta-ketoacyl synthase. The differences in activity are accounted for by differences in the N-terminal domain.
Protein Domain
Name: Polyketide synthase, type III
Type: Family
Description: Type III polyketide synthases include plant naringenin-chalcone synthases (CHSs) and stilbene synthases (STSs) (resveratrol synthases). This group also includes CHS-related enzymes such as bibenzyl synthase (BBS) and acridone synthase (ACS) that share a common chemical mechanism but differ from CHS in their substrate specificity and/or in the stereochemistry of the polyketide cyclisation reaction. It also includes prokaryotic type III polyketide synthases (PKSs). Type III polyketide synthases catalyse formation of structurally diverse polyketides. They are homodimeric iterative PKSs and contain two independent active sites, each of which catalyses single or multiple condensation reactions to generate polyketides of different lengths. CHS and STS are plant-specific polyketide synthases. With a starter CoA-ester they perform three sequential condensation steps with acetate units from malonyl-CoA to form a tetraketide intermediate that is folded to the ring systems specific to the different products. Each monomer subunit is capable of performing all three condensation steps, and malonyl-CoA is the direct donor of the acetate units. The structure of the Medicago sativa (Alfalfa) CHS2 has the active site architecture that defines the sequence and chemistry of multiple decarboxylation and condensation reactions. CHSs are key enzymes in flavonoid assembly. They synthesize naringenin chalcone, the precursor for a large number of flavonoids. CHS is essential for formation of 4,2',4',6'-tetrahydroxychalcone, a plant secondary metabolite that sits at a critical metabolic branch point leading to the biosynthesis of anthocyanin pigments, anti-microbial phytoalexins, and flavonoid inducers of Rhizobium nodulation genes. STSs occur in a limited number of unrelated plants and synthesize the backbone of the stilbene phytoalexins that have antifungal properties and contribute to pathogen defence.
This group also contains prokaryotic type III PKSs. They have been shown to be responsible for the biosynthesis of natural products such as 1,3,6,8-tetrahydroxynaphthalene (THN) (produced by RppA protein) and for the formation of key components of more complex molecules such as the antimicrobial agent vancomycin (produced by DpgA protein). Streptomyces coelicolor THN synthase (THNS) is believed to be involved in the biosynthesis of the prenylated naphthoquinone cytotoxin marinone in a marine sediment-derived actinomycete. PKS11 from Mycobacterium tuberculosis is involved in the biosynthesis of methyl-branched alkylpyrones and catalyses the extension of medium- and long-chain aliphatic acyl-CoA substrates by using malonyl-CoA and methylmalonyl-CoA as extender molecules to synthesize polyketide products.
Protein Domain
Name: Chalcone/stilbene synthase, active site
Type: Active_site
Description: Chalcone synthases (CHS) and stilbene synthases (STS) (formerly known as resveratrol synthases) are related plant enzymes, members of the plant polyketide synthase superfamily. CHS is an important enzyme in flavonoid biosynthesis and STS is a key enzyme in stilbene-type phytoalexin biosynthesis. Both enzymes catalyse the addition of three molecules of malonyl-CoA to a starter CoA ester (a typical example is 4-coumaroyl-CoA), producing either a chalcone (with CHS) or stilbene (with STS). These enzymes have a conserved cysteine residue, located in the central section of the protein sequence, which is essential for the catalytic activity of both enzymes and probably represents the binding site for the 4-coumaroyl-CoA group.
Protein Domain
Name: Trichome birefringence-like, N-terminal domain
Type: Domain
Description: This entry represents the N-terminal cysteine-rich predicted sugar-binding domain found in Trichome birefringence-like (TBL) proteins from streptophytes. This domain is followed by the PC-Esterase (acyl esterase) domain in PMR5 and ESK1.
Protein Domain
Name: PC-Esterase
Type: Family
Description: The PC-Esterase family comprises Cas1p, the Homo sapiens C7orf58, Arabidopsis thaliana PMR5 and a group of plant freezing resistance/cold acclimatization proteins typified by Arabidopsis thaliana ESKIMO1 (also known as XOAT1), animal FAM55D proteins, and animal FAM113 proteins. The PC-Esterase family has features that are both similar to and different from those of the canonical GDSL/SGNH superfamily. The members of this family are predicted to have acyl esterase activity and to modify cell-surface biopolymers such as glycans and glycoproteins. The Cas1p protein additionally has a Cas1_AcylT domain with the opposing acyltransferase activity. The C7orf58 family has an ATP-grasp domain fused to the PC-Esterase and is the first identified secreted tubulin-tyrosine ligase-like enzyme in eukaryotes. The plant family that includes PMR5, XOAT1, TBL3, etc. has an N-terminal cysteine-rich potential sugar-binding domain followed by the PC-Esterase domain. In Arabidopsis, XOAT1 catalyzes the 2-O-acetylation of xylan, followed by nonenzymatic acetyl migration to the O-3 position, resulting in products that are monoacetylated at both O-2 and O-3 positions. Its role is essential for the production of functional xylem vessels. It functions as a negative regulator of cold acclimation, and mutations in the ESK1 gene provide strong freezing tolerance.
Protein Domain
Name: PGG domain
Type: Domain
Description: The PGG domain is named for the highly conserved sequence motif found at the start of the domain. Its function is not known.
Protein Domain
Name: NOT2/NOT3/NOT5, C-terminal
Type: Domain
Description: The Ccr4-Not complex controls mRNA metabolism at multiple levels in eukaryotic cells. This complex is a major cytoplasmic deadenylase consisting of a combination of at least nine subunits, four of which have deadenylase activity. The conserved core of the CCR4-NOT complex consists of two major modules: a catalytic module comprising two deadenylases (CAF1 or its paralogue POP2, and CCR4a or its paralogue CCR4b) and the NOT module, which minimally consists of NOT1, NOT2 and NOT3. This entry represents the C-terminal domain of NOT2, NOT3 and NOT5 (subunits 2, 3 and 5 of the complex). The C-terminal regions of NOT2 and NOT3 contain a conserved SH3-like NOT-box domain which mediates NOT-module assembly.
Protein Domain
Name: Chromosomal replication initiator, DnaA C-terminal
Type: Domain
Description: This entry represents the C-terminal domain of bacterial DnaA proteins, which play an important role in initiating and regulating chromosomal replication. DnaA is an ATP- and DNA-binding protein. It binds specifically to 9 bp nucleotide repeats known as dnaA boxes, which are found in the chromosome origin of replication (oriC). DnaA is a protein of about 50 kDa that contains two conserved regions: the first is located in the N-terminal half and corresponds to the ATP-binding domain; the second is located in the C-terminal half and could be involved in DNA binding. The protein may also bind the RNA polymerase beta subunit, the dnaB and dnaZ proteins, and the groE gene products (chaperonins).
Protein Domain
Name: Trp repressor/replication initiator
Type: Homologous_superfamily
Description: The Trp repressor (TrpR) binds to at least five operators in the Escherichia coli genome, repressing gene expression. The operators at which it binds vary considerably in DNA sequence and location within the promoter; when bound to the Trp operon it recognises the sequence 5'-ACTAGT-3' and acts to prevent the initiation of transcription. TrpR controls the trpEDCBA (trpO) operon and the genes for trpR, aroH, mtr and aroL, which are involved in the biosynthesis and uptake of the amino acid tryptophan. The repressor binds to the operators only in the presence of L-tryptophan, thereby controlling the intracellular level of its effector; the complex also regulates Trp repressor biosynthesis by binding to its own regulatory region. TrpR acts as a dimer that is composed of identical 6-helical subunits, where four of the helices form the core of the protein and intertwine with the corresponding helices from the other subunit. The bacterial chromosomal replication initiation factor DnaA is a monomeric protein that shows structural similarity to TrpR, except that it contains additional N-terminal helices. DnaA is a member of the AAA+ family of ATPases, and forms a large, oligomeric assembly at the replication origin site (oriC); the oligomeric complex of DnaA recognises and processes specific origin sequences in order to initiate replication in bacteria.
Protein Domain
Name: Protein of unknown function DUF1635
Type: Family
Description: The members of this family include sequences that are parts of hypothetical proteins expressed by plant species. The region in question is about 170 amino acids long.
Protein Domain
Name: Ribosomal protein L34Ae
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to the aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31, plant L34, yeast putative ribosomal protein YIL052c and archaebacterial L34e.
Protein Domain
Name: Zinc finger, LIM-type
Type: Domain
Description: This entry represents LIM-type zinc finger (Znf) domains. LIM domains coordinate one or more zinc atoms, and are named after the three proteins (LIN-11, Isl1 and MEC-3) in which they were first found. They consist of two zinc-binding motifs that resemble GATA-like Znfs; however, the residues holding the zinc atom(s) are variable, involving Cys, His, Asp or Glu residues. LIM domains are found in proteins with differing functions, including gene expression, and cytoskeleton organisation and development. Proteins containing LIM Znf domains include:
  • Caenorhabditis elegans mec-3, a protein required for the differentiation of the set of six touch receptor neurons in this nematode.
  • C. elegans lin-11, a protein required for the asymmetric division of vulval blast cells.
  • Vertebrate insulin gene enhancer binding protein isl-1. Isl-1 binds to one of the two cis-acting protein-binding domains of the insulin gene.
  • Vertebrate homeobox proteins lim-1, lim-2 (lim-5) and lim3.
  • Vertebrate lmx-1, which acts as a transcriptional activator by binding to the FLAT element, a beta-cell-specific transcriptional enhancer found in the insulin gene.
  • Mammalian LH-2, a transcriptional regulatory protein involved in the control of cell differentiation in developing lymphoid and neural cell types.
  • Drosophila melanogaster (Fruit fly) protein apterous, required for the normal development of the wing and haltere imaginal discs.
  • Vertebrate protein kinases LIMK-1 and LIMK-2.
  • Mammalian rhombotins. Rhombotin 1 (RBTN1 or TTG-1) and rhombotin-2 (RBTN2 or TTG-2) are proteins of about 160 amino acids whose genes are disrupted by chromosomal translocations in T-cell leukemia.
  • Mammalian and avian cysteine-rich protein (CRP), a 192 amino-acid protein of unknown function. It seems to interact with zyxin.
  • Mammalian cysteine-rich intestinal protein (CRIP), a small protein which seems to have a role in zinc absorption and may function as an intracellular zinc transport protein.
  • Vertebrate paxillin, a cytoskeletal focal adhesion protein.
  • Mus musculus (Mouse) testin, which should not be confused with rat testin, a thiol protease homologue.
  • Helianthus annuus (Common sunflower) pollen-specific protein SF3.
  • Chicken zyxin. Zyxin is a low-abundance adhesion plaque protein which has been shown to interact with CRP.
  • Yeast protein LRG1, which is involved in sporulation.
  • Saccharomyces cerevisiae (Baker's yeast) rho-type GTPase activating protein RGA1/DBM1.
  • C. elegans homeobox protein ceh-14.
  • C. elegans homeobox protein unc-97.
  • S. cerevisiae hypothetical protein YKR090w.
  • C. elegans hypothetical protein C28H8.6.
These proteins generally contain two tandem copies of the LIM domain in their N-terminal section. Zyxin and paxillin are exceptions in that they contain respectively three and four LIM domains at their C-terminal extremity. In apterous, isl-1, LH-2, lin-11, lim-1 to lim-3, lmx-1, ceh-14 and mec-3 there is a homeobox domain some 50 to 95 amino acids after the LIM domains. LIM domains contain seven conserved cysteine residues and a histidine. The arrangement followed by these conserved residues is:
C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]
LIM domains bind two zinc ions. LIM does not bind DNA; rather, it seems to act as an interface for protein-protein interaction.
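The residue arrangement given above uses PROSITE-style notation, which maps directly onto a regular expression: x becomes a wildcard, x(n) and x(n,m) become repeat counts, and bracketed sets stay as character classes. The following is a minimal sketch of that translation, assuming only the subset of the notation that appears in this pattern (plus the braces form for excluded residues); the function name and the synthetic test sequence are hypothetical.

```python
import re

# Pattern as quoted in the LIM-type zinc finger description.
LIM_PATTERN = "C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]"

# One pattern element: a residue spec, optionally followed by (n) or (n,m).
_ELEMENT = re.compile(r"^(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError("unsupported pattern element: %r" % element)
        residue, low, high = match.groups()
        if residue == "x":
            core = "."                           # any residue
        elif residue.startswith("{"):
            core = "[^" + residue[1:-1] + "]"    # any residue except these
        else:
            core = residue                       # literal residue or [set]
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{%s}" % low
        else:
            repeat = "{%s,%s}" % (low, high)
        parts.append(core + repeat)
    return "".join(parts)


lim_regex = re.compile(prosite_to_regex(LIM_PATTERN))

# Synthetic sequence built to satisfy the pattern (not a real protein).
synthetic = ("C" + "A" * 2 + "C" + "A" * 16 + "H" + "A" * 2 + "C" + "A" * 2
             + "C" + "A" * 2 + "C" + "A" * 16 + "C" + "A" * 2 + "D")
print(lim_regex.pattern)
print(bool(lim_regex.search(synthetic)))
```

A regex match only reports that the spacing of the conserved residues is compatible with the pattern; it says nothing about whether the region actually binds zinc.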
Protein Domain
Name: Starch synthase, catalytic domain
Type: Domain
Description: This region represents the catalytic domain of glycogen (or starch) synthases that use ADP-glucose, rather than UDP-glucose as in animals, as the glucose donor. This enzyme is found in bacteria and plants. Whether the name given is glycogen synthase or starch synthase depends on context, and therefore on substrate.
Protein Domain
Name: Bacterial/plant glycogen synthase
Type: Family
Description: This entry represents glycogen synthases (GS) and starch synthases (SS) from bacteria and plants. GS and SS are involved in the elongation of the linear chains of glycogen and starch, respectively, by catalysing the transfer of the glucosyl moiety of the activated glucosyl donor (UDP-glucose or ADP-glucose, depending on the organism in question) to the nonreducing ends of a preexisting alpha-1,4-glucan primer.
Protein Domain
Name: Structural maintenance of chromosomes protein
Type: Family
Description: Proteins of the SMC (structural maintenance of chromosomes) family exist in virtually all organisms, including bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. They function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal domains separated by a long (circa 100 nm or 900 residues) coiled-coil segment, in the centre of which is a globular 'hinge' domain characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding motif (GxxGxGKS/T), which has been shown by mutational studies to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and an LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases. All SMC proteins appear to form dimers, either homodimers, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers form core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis. SMC dimers are arranged in an antiparallel alignment.
This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP-binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. Proteins in this entry include SMC1/2/3/4 from Saccharomyces cerevisiae. The SMC1-SMC3 heterodimer is part of the cohesin complex, which is required for sister chromatid cohesion in mitosis and meiosis. The SMC2-SMC4 heterodimer is part of the condensin complex, which is required for chromosome condensation during both mitosis and meiosis.
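The Walker A consensus quoted in this entry, GxxGxGKS/T, can be written as a regular expression in which x is a wildcard and S/T is a two-residue character class. The following is a minimal sketch under that reading; the function name and the example fragment are hypothetical and are not taken from any SMC protein.

```python
import re

# GxxGxGKS/T from the entry text: x = any residue, last position is S or T.
WALKER_A = re.compile(r"G..G.GK[ST]")


def find_walker_a(sequence):
    """Return (start, matched_text) for each non-overlapping Walker A hit."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]


# Hypothetical, made-up fragment used only to exercise the pattern.
fragment = "AAGPNGSGKSNLL"
print(find_walker_a(fragment))
```

The Walker B description (XXXXD with hydrophobic X) is left out of the sketch because the entry does not say which residues count as hydrophobic.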
Protein Domain
Name: ZF-HD homeobox protein, Cys/His-rich dimerisation domain
Type: Domain
Description: The homeodomain (HD) is a 60-amino acid DNA-binding domain found in many transcription factors. HD-containing proteins are found in diverse organisms such as humans, Drosophila, nematode worms, and plants, where they play important roles in development. Zinc-finger-homeodomain (ZF-HD) subfamily proteins have only been identified in plants, and likely play plant-specific roles. ZF-HD proteins are expressed predominantly or exclusively in floral tissue, indicating a likely regulatory role during floral development. The ZF-HD class of homeodomain proteins may also be involved in the photosynthesis-related mesophyll-specific gene expression of phosphoenolpyruvate carboxylase in C4 species and in pathogen signalling and plant defence mechanisms. These proteins share three domains of high sequence similarity: the homeodomain (II) located at the carboxy-terminus, and two other segments (Ia and Ib) located in the amino-terminal part. These N-terminal domains contain five conserved cysteine residues and at least three conserved histidine residues whose spacing resembles zinc-binding domains involved in dimerization of transcription factors. Although the two domains contain at least eight potential zinc-binding amino acids, the unique spacing of the conserved cysteine and histidine residues within domain Ib suggests that both domains form one rather than two zinc finger structures. The two conserved motifs Ia and Ib constitute a dimerization domain which is sufficient for the formation of homo- and heterodimers. This entry represents the N-terminal cysteine/histidine-rich dimerization domain. The companion ZF-HD homeobox domain is described in a separate entry.
Protein Domain
Name: Ribosomal protein L23/L25, N-terminal
Type: Domain
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a.
Protein Domain
Name: Ribosomal protein L23/L25, conserved site
Type: Conserved_site
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. Ribosomal protein L23 is one of the proteins from the large ribosomal subunit that binds to a specific region on either the 23S or 26S rRNA. This entry includes eukaryotic L25 and bacterial and eukaryotic L23.
Protein Domain
Name: Ribosomal protein L23
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. This entry represents the archaeal ribosomal L23 protein and some, although not all, bacterial L23 proteins.
Protein Domain
Name: Ribosomal protein L25/L23
Type: Family
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. This entry represents both eukaryotic (yeast) L25 and prokaryotic and eukaryotic L23 proteins, which constitute the uL23 family [ ].
Protein Domain
Name: Ribosomal protein L23/L15e core domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. 
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. Both the L23 and L15e ribosomal proteins have a core domain consisting of a β-(α)-β-α-β(2) structure folded into three layers, α/β/α, where the β-sheets are antiparallel.
Protein Domain
Name: GTP cyclohydrolase II, RibA
Type: Family
Description: GTP cyclohydrolase II (also known as ribA) catalyses the first committed step in the biosynthesis of riboflavin (vitamin B2). The enzyme converts GTP to 2,5-diamino-6-beta-ribosyl-4(3H)-pyrimidinone 5'-phosphate (APy), formate and pyrophosphate, and requires magnesium as a cofactor. In numerous bacteria and in plants, GTP cyclohydrolase II occurs in the C-terminal region of a bifunctional enzyme known as ribBA, which also comprises an N-terminal 3,4-dihydroxy-2-butanone 4-phosphate synthase (DHBP_synthase, also known as RibB) ( ). RibB catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate [ ]. A paralogous protein is encoded in the genome of Streptomyces coelicolor, which converts GTP to 2-amino-5-formylamino-6-ribosylamino-4(3H)-pyrimidinone 5'-phosphate (FAPy), an activity that has otherwise been reported for unrelated GTP cyclohydrolases III [ ].
Protein Domain
Name: 3,4-dihydroxy-2-butanone 4-phosphate synthase, RibB
Type: Family
Description: 3,4-dihydroxy-2-butanone 4-phosphate synthase ( ) (DHBP synthase) (RibB) catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin [ ]. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon []. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II, which catalyses the first committed step in the biosynthesis of riboflavin (). No sequences with significant homology to DHBP synthase are found in the metazoa.
Protein Domain
Name: DHBP synthase RibB-like alpha/beta domain superfamily
Type: Homologous_superfamily
Description: This superfamily represents a structural domain consisting of segregated alpha and beta regions in three layers. Homologous domains with this structure are found in: (i) 3,4-dihydroxy-2-butanone 4-phosphate synthase ( ) (DHBP synthase) (RibB), and (ii) a family of eukaryotic and prokaryotic hypothetical proteins that includes YrdC and YciO from Escherichia coli and MTH1692 from the archaeon Methanothermobacter thermautotrophicus (Methanobacterium thermoformicicum). DHBP synthase RibB catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin [ ]. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon []. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II, which catalyses the first committed step in the biosynthesis of riboflavin (). No sequences with significant homology to DHBP synthase are found in the metazoa. Members of the YrdC family of hypothetical proteins are widely distributed in eukaryotes and prokaryotes and occur as: (i) independent proteins, (ii) proteins with C-terminal extensions, and (iii) domains in larger proteins, some of which are implicated in regulation [ ]. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes []. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Threonylcarbamoyl-AMP synthase (Sua5) is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. 
Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for formation of a threonylcarbamoyl group on adenosine at position 37 in tRNAs [, ]. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases [ ].
Protein Domain
Name: Riboflavin biosynthesis protein RibBA
Type: Family
Description: This group represents the riboflavin biosynthesis protein RibBA which has both GTP cyclohydrolase II and 3,4-dihydroxy-2-butanone 4-phosphate synthase activities [ ].
Protein Domain
Name: UAA transporter
Type: Family
Description: This family includes transporters with a specificity for UDP-N-acetylglucosamine [ ].
Protein Domain
Name: Protein of unknown function DUF726
Type: Family
Description: This family consists of several uncharacterised eukaryotic proteins.
Protein Domain
Name: DNA recombination and repair protein Rad51-like, C-terminal
Type: Domain
Description: This domain is found at the C terminus of DNA repair and recombination protein Rad51, and eukaryotic and archaeal Rad51-like proteins. It is critical for DNA binding [ ]. Rad51 is a homologue of the bacterial RecA protein. Rad51 and RecA share a core ATPase domain. Unlike eubacteria, several archaeal species have two recA/RAD51-like genes, called RadA and RadB. Among eukaryotes, yeast contain four RAD51-like genes (RAD51, DMC1, RAD55/rhp55, and RAD57/rhp57). In vertebrate animals and plants, there are different RAD51-like genes: RAD51, RAD51B, RAD51C, RAD51D, DMC1, XRCC2, and XRCC3 [ ].
Protein Domain
Name: DNA recombination and repair protein, RecA-like
Type: Family
Description: The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response [ ]. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs []. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage []. RecA is a protein of about 350 amino acid residues. Its sequence is very well conserved [ , , ] among eubacterial species. It is also found in the chloroplast of plants []. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament visualised by X-ray crystallography, β-strand 3, the loop C-terminal to β-strand 2, and α-helix D of the core domain form one surface that packs against α-helix A and β-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation []. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between β-strand 1 and α-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another nucleotide binding motif, the Walker B box, is found at β-strand 4 in the RecA structure. The Walker B box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate). 
Nucleotide specificity and additional ATP-binding interactions are contributed by the amino acid residues at β-strand 2 and the loop C-terminal to that strand, all of which are greater than 90% conserved among bacterial RecA proteins. This entry represents a subset of the RecA-like proteins.
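The Walker A consensus quoted above, (G/A)XXXXGK(T/S), can be written as a regular expression. The following is a minimal sketch in Python rather than a substitute for curated domain signatures; the function name find_walker_a is hypothetical, and X is assumed to mean any single residue.

```python
import re

# Walker A (P-loop) consensus as quoted in the description: (G/A)XXXXGK(T/S).
# Assumption: X stands for any single residue, so it is written as "." here.
WALKER_A = re.compile(r"[GA].{4}GK[TS]")


def find_walker_a(sequence):
    """Return (start, matched_text) for each non-overlapping Walker A match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]


# The Escherichia coli RecA octapeptide cited in the description.
print(find_walker_a("GPESSGKT"))  # [(0, 'GPESSGKT')]
```

A pattern this short will also match unrelated sequences by chance, so a hit is only a hint that a P-loop may be present.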
Protein Domain
Name: DNA recombination and repair protein RecA-like, ATP-binding domain
Type: Domain
Description: The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response [ ]. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs []. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage []. RecA is a protein of about 350 amino acid residues. Its sequence is very well conserved [ , , ] among eubacterial species. It is also found in the chloroplast of plants []. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament visualised by X-ray crystallography, β-strand 3, the loop C-terminal to β-strand 2, and α-helix D of the core domain form one surface that packs against α-helix A and β-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation []. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between β-strand 1 and α-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another nucleotide binding motif, the Walker B box, is found at β-strand 4 in the RecA structure. The Walker B box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate). 
Nucleotide specificity and additional ATP-binding interactions are contributed by the amino acid residues at β-strand 2 and the loop C-terminal to that strand, all of which are greater than 90% conserved among bacterial RecA proteins. This entry represents the ATP-binding domain found in the N-terminal part of RecA proteins.
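The Walker B description above (four hydrophobic amino acids followed by an acidic residue) can be sketched in the same way. The description does not say which residues count as hydrophobic, so the set below is an assumption, as are the function name first_walker_b_like and the test sequence, which is invented for illustration and is not a real RecA fragment.

```python
import re

# Assumption: the description does not list the hydrophobic residues,
# so this set is an illustrative choice and not an official definition.
HYDROPHOBIC = "AVILMFW"

# Four hydrophobic residues followed by an acidic one (Asp or Glu).
WALKER_B_LIKE = re.compile(rf"[{HYDROPHOBIC}]{{4}}[DE]")


def first_walker_b_like(sequence):
    """Return (start, matched_text) for the first match, or None if absent."""
    match = WALKER_B_LIKE.search(sequence.upper())
    return (match.start(), match.group()) if match else None


# Invented toy sequence, used only to show the call.
print(first_walker_b_like("MKVIVIDSA"))  # (2, 'VIVID')
```

Because the pattern is so permissive, it is best read as a filter for candidate sites that still need confirmation against a structure or a curated signature.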
Protein Domain
Name: Protein kinase, C-terminal
Type: Domain
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity []: serine/threonine-protein kinases, tyrosine-protein kinases, and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human [ ]. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation []. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [ ], leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases []. This domain is found in a large variety of protein kinases with different functions and dependencies. Protein kinase C, for example, is a calcium-activated, phospholipid-dependent serine- and threonine-specific enzyme. It is activated by diacylglycerol and, in turn, phosphorylates a range of cellular proteins. This domain is most often found associated with .
Protein Domain
Name: AGC-kinase, C-terminal
Type: Domain
Description: The AGC (cAMP-dependent, cGMP-dependent and protein kinase C) protein kinase family embraces a collection of protein kinases that display a high degree of sequence similarity within their respective kinase domains. AGC kinase proteins are characterised by three conserved phosphorylation sites that critically regulate their function. The first one is located in an activation loop in the centre of the kinase domain. The two other phosphorylation sites are located outside the kinase domain in a conserved region on its C-terminal side, the AGC-kinase C-terminal domain. These sites serve as phosphorylation-regulated switches to control both intra- and inter-molecular interactions. Without these priming phosphorylations, the kinases are catalytically inactive [ , , ]. Several structures of the AGC-kinase C-terminal domain have been solved. The first phosphorylation site is located in a turn motif, the second one at the end of the domain in a hydrophobic pocket. In PKB the phosphorylated hydrophobic motif engages a hydrophobic groove within the N-lobe of the kinase domain which orders alpha helices close to the active site [ ]. Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity []: serine/threonine-protein kinases, tyrosine-protein kinases, and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human [ ]. 
Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation []. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [], leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases [].
Protein Domain
Name: Ubiquitin conjugation factor E4, core
Type: Domain
Description: This entry represents the most conserved part of the core region of ubiquitin conjugation factor E4 (or Ub elongating factor, or Ufd2P), running from helix α-11 to α-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C terminus, ( ), which has ligase activity. Ubiquitin conjugation factor E4 is involved in the N-terminal ubiquitin fusion degradation proteolytic pathway (UFD pathway). E4 binds to the ubiquitin moieties of preformed conjugates and catalyses ubiquitin chain assembly in conjunction with E1, E2, and E3. E4 appears to influence the formation and topology of the multi-Ub chain as it enhances ubiquitination at 'Lys-48' but not at 'Lys-29' of the N-terminal Ub moiety.
Protein Domain
Name: Saccharopine dehydrogenase, NADP binding domain
Type: Domain
Description: This entry represents the NADP binding domain of saccharopine dehydrogenase. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase. The saccharopine dehydrogenase can also function as a saccharopine reductase [ , ]. Saccharopine dehydrogenase ( ) catalyses the condensation of l-alpha-aminoadipate-delta-semialdehyde (AASA) with l-glutamate to give an imine, which is reduced by NADPH to give saccharopine [ ]. In some organisms this enzyme is found as a bifunctional polypeptide with lysine ketoglutarate reductase (PF). Saccharopine dehydrogenase can also function as a saccharopine reductase. Saccharopine is an intermediate in lysine metabolism. Homospermidine synthase (HSS) ( ) catalyses the synthesis of the polyamine homospermidine from 2 putrescine molecules in an NAD+-dependent reaction [ ]. HSS evolved from the alternative spermidine biosynthetic pathway enzyme carboxyspermidine dehydrogenase [ , ] and the structure of HSS is related to lysine metabolic enzymes [].
Protein Domain
Name: Chorismate mutase, AroQ class, eukaryotic type
Type: Family
Description: Chorismate mutase (CM) is a regulatory enzyme ( ) required for biosynthesis of the aromatic amino acids phenylalanine and tyrosine. CM catalyzes the Claisen rearrangement of chorismate to prephenate, which can subsequently be converted to precursors of either L-Phe or L-Tyr. In bifunctional enzymes the CM domain can be fused to a prephenate dehydratase (P-protein, for Phe biosynthesis), to a prephenate dehydrogenase (T-protein, for Tyr biosynthesis), or to 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase ( ). Besides these prokaryotic bifunctional enzymes, monofunctional CMs occur in prokaryotes as well as in fungi, plants and nematode worms [ ]. The type I or AroH class of CM is represented by Bacillus subtilis aroH, a monofunctional, nonallosteric, homotrimeric enzyme characterized by its pseudo-α/β-barrel 3D structure. Each monomer folds into a 5-stranded mixed β-sheet packed against an α-helix and a 3-10 helix. The core is formed by a closed barrel of mixed β-sheets surrounded by helices. The interfaces between adjacent subunits form three equivalent clefts that harbor the active sites [ ]. The type II or AroQ class of CM has a completely different all-helical 3D structure, represented by the CM domain of the bifunctional Escherichia coli P-protein. This type is named after the Enterobacter agglomerans monofunctional CM encoded by the aroQ gene [ ]. All CM domains from bifunctional enzymes as well as most monofunctional CMs belong to this class, including archaeal CM. Eukaryotic CM from plants and fungi form a separate subclass of AroQ, represented by the Baker's yeast allosteric CM [ ]. These enzymes show only partial sequence similarity to the prokaryotic CMs due to insertions of regulatory domains, but the helix-bundle topology and catalytic residues are conserved and the 3D structure of the E. coli CM dimer resembles a yeast CM monomer [, , ]. The E. coli P-protein CM domain consists of 3 helices and lacks allosteric regulation. 
The yeast CM has evolved by gene duplication and dimerization and each monomer has 12 helices. Yeast CM is allosterically activated by Trp and inhibited by Tyr [ ]. This entry represents chorismate mutase from eukaryotes.
Protein Domain
Name: Checkpoint protein Hus1/Mec3
Type: Family
Description: This entry consists of the human Hus1 protein and budding yeast Mec3. They are components of the checkpoint clamp complex involved in the surveillance mechanism that allows the DNA repair pathways to act to restore the integrity of the DNA prior to DNA synthesis or separation of the replicated chromosomes [ , ]. Hus1, Rad1, and Rad9 (which share homology with Mec3, Rad17, and Ddc1 in budding yeast) are three evolutionarily conserved proteins required for checkpoint control. These proteins are known to form a stable complex. Structurally, the Ddc1-Mec3-Rad17 complex is similar to the PCNA complex, which forms trimeric ring-shaped clamps. Ddc1-Mec3-Rad17 plays a role in checkpoint activation that permits DNA-repair pathways to prevent cell cycle progression in response to DNA damage and replication stress [, ].
Protein Domain
Name: Calycin
Type: Homologous_superfamily
Description: Calycins form a large protein superfamily that share similar β-barrel structures. Calycins can be divided into families that include lipocalins, fatty acid binding proteins, triabin, and thrombin inhibitor [ ]. Of these families, the lipocalin family () is the largest and functionally the most diverse. Lipocalins are extracellular proteins that share several common recognition properties such as ligand binding, receptor binding and the formation of complexes with other macromolecules. Lipocalins include the retinol binding protein, lipocalin allergen, aphrodisin (a sex hormone), alpha-2U-globulin, prostaglandin D synthase, beta-lactoglobulin, bilin-binding protein, and the nitrophorins [ , , , ]. Bacterial hypothetical proteins YodA from Escherichia coli and YwiB from Bacillus subtilis share a similar calycin β-barrel structure. Part of the YodA hypothetical protein has a calycin-like structure [].
Protein Domain
Name: Chitinase-like
Type: Family
Description: This entry includes Chitinase 1/2 from Tulipa saxatilis subsp. bakeri. They are class IIIb chitinases.
Protein Domain
Name: Nuclear factor related to kappa-B-binding protein
Type: Family
Description: Nuclear factor related to kappa-B-binding protein, also known as INO80 complex subunit G, is a component of the metazoan INO80 complex involved in chromatin remodelling, transcription regulation, DNA replication and DNA repair.
Protein Domain
Name: CSC1/OSCA1-like, 7TM region
Type: Domain
Description: This entry represents the seven transmembrane domain region of plant OSCA1, yeast PHM7 and RSN1 and CSC1-like protein 1 (also known as TRANSMEMBRANE PROTEIN 63A). Members of this entry are mechanosensitive calcium-permeable ion channels consisting of an N-terminal transmembrane domain (RSN1_TM), a cytosolic domain and a 7TM region at the C terminus. This domain is found in eukaryotic transmembrane proteins that are involved in diverse functions, including phosphate metabolism and spore wall assembly.
Protein Domain
Name: Vesicle transport v-SNARE, N-terminal
Type: Domain
Description: v-SNARE proteins are required for protein traffic between eukaryotic organelles. The v-SNAREs on transport vesicles interact with t-SNAREs on target membranes in order to facilitate this. This domain is the N-terminal half of the v-SNARE proteins.
Protein Domain
Name: Protein of unknown function DUF1997
Type: Family
Description: The proteins in this family are functionally uncharacterised.
Protein Domain
Name: Mediator of RNA polymerase II transcription subunit 25, von Willebrand factor type A domain
Type: Domain
Description: The overall function of the full-length Med25 is to coordinate efficiently the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties: the N-terminal VWA domain, which is this one; an SD1 domain from residues 229-381; a PTOV(B) or ACID domain from 395-545; an SD2 domain from residues 564-645; and a C-terminal NR box-containing domain (646-650) from 646-747. This VWA or von Willebrand factor type A domain, when bound to RAR and the histone acetyltransferase CBP, is responsible for recruiting Med1 to the rest of the Mediator complex.
Protein Domain
Name: PAS domain
Type: Domain
Description: PAS domains are found in many signalling proteins, where they are used as a signal sensor domain. PAS domains appear in archaea, bacteria and eukaryotes. Several PAS-domain proteins are known to detect their signal by way of an associated cofactor: heme, flavin, and a 4-hydroxycinnamyl chromophore are used in different proteins. The PAS domain was named after three proteins that it occurs in: Per (period circadian protein), Arnt (Ah receptor nuclear translocator protein) and Sim (single-minded protein). PAS domains are often associated with PAC domains. It appears that these domains are directly linked, and that together they form the conserved 3D PAS fold. The division between the PAS and PAC domains is caused by major differences in sequences in the region connecting these two motifs. In human PAS kinase, this region has been shown to be very flexible, and adopts different conformations depending on the bound ligand. Probably the most surprising identification of a PAS domain was that in EAG-like K-channels.
Protein Domain
Name: PAS-associated, C-terminal
Type: Domain
Description: The PAS (Per, Arnt, Sim) domain is an approximately 300 amino-acid segment of sequence similarity which is conserved between the Drosophila protein period clock (PER), the Ah receptor nuclear translocator (ARNT) and the Drosophila single-minded (SIM). It is composed of two or more imperfect repeats (PAS-1, PAS-2). In addition, some proteins have another similar region of 40-45 amino acids situated carboxy-terminal to any PAS repeat and which contributes to the PAS structural domain: the PAC motif. The PAS family can be divided in two groups: the proteins that have the PAS motif followed by a PAC motif, and those that do not. It appears that these domains are directly linked, and that together they form the conserved 3D PAS fold. The division between the PAS and PAC domains is caused by major differences in sequences in the region connecting these two motifs. Within the bHLH/PAS proteins, the PAS domain is involved in protein dimerization with another protein of the family. It has also been associated with light reception, light regulation and circadian rhythm regulators (clock). In bacteria, the PAS domain is usually associated with the input domain of a histidine kinase, or a sensor protein that regulates a histidine kinase.
Protein Domain
Name: PAC motif
Type: Repeat
Description: PAC motifs occur C-terminal to a subset of all known PAS motifs. They are proposed to contribute to the PAS domain fold.
Protein Domain
Name: Glucose/Sorbosone dehydrogenase
Type: Domain
Description: Proteins containing this domain are thought to be glucose/sorbosone dehydrogenases. The best characterised of these proteins is soluble glucose dehydrogenase from Acinetobacter calcoaceticus, which oxidises glucose to gluconolactone. The enzyme is a calcium-dependent homodimer which uses PQQ as a cofactor.
Protein Domain
Name: Soluble quinoprotein glucose/sorbosone dehydrogenase
Type: Homologous_superfamily
Description: Quinoproteins form a class of dehydrogenases distinct from the NAD(P)- and flavin-dependent enzymes, using one of four different quinone cofactors for the oxidation of a variety of compounds. Soluble glucose dehydrogenase (s-GDH) from the bacterium Acinetobacter calcoaceticus is a quinoprotein that requires the cofactor pyrroloquinoline quinone (PQQ) to catalyse the oxidation of glucose to gluconolactone. s-GDH has six 4-stranded β-sheets in a β-propeller fold.
Protein Domain
Name: Membrane-anchored ubiquitin-fold protein
Type: Family
Description: Ubiquitin-fold proteins are an important class of eukaryotic post-translational modifiers. They are generally short proteins (less than 200 amino acids) which contain the core β-grasp fold (also known as the ubiquitin fold) and a C-terminal extension which enables their attachment to other proteins through the terminal carboxyl group. Protein-conjugated ubiquitins have been implicated in a wide variety of cellular processes including proteolysis, DNA repair, transcription and autophagy. Some ubiquitin-like proteins are not conjugated to proteins, but are instead anchored to the cell membrane by attachment to phospholipids or isoprenes. The functions of these membrane-associated proteins are not generally well understood. In the case of isoprene attachment, the prenyl group may also play a role in enhancing protein-protein interactions. This entry represents a group of membrane-associated ubiquitin-fold proteins found in plants and animals. In Arabidopsis, membrane-anchored ubiquitin-fold (MUB) proteins recruit and dock specific E2s to the plasma membrane. They appear to interact noncovalently with an E2 surface opposite the active site that forms a covalent linkage with Ub. The animal homologues of MUBs, also known as UBL3, have also been identified as ubiquitin-like proteins.
Protein Domain
Name: Clp, repeat (R) domain
Type: Domain
Description: Molecular chaperones recognize unfolded or misfolded proteins by binding to hydrophobic surface patches not normally exposed in the native proteins. Members of the Clp/Hsp100 family of chaperones are present in eubacteria and within organelles of all eukaryotes, promoting disaggregation and disassembly of protein complexes and participating in energy-dependent protein degradation. The ClpA, ClpB, and ClpC subfamilies of the Clp/Hsp100 ATPases contain a conserved N-terminal domain of ~150 amino acids, which in turn consists of two repeats of ~75 residues. Although the Clp repeat (R) domain contains two approximate sequence repeats, it behaves as a single cooperatively folded unit. The Clp R domain is thought to provide a means of regulating the specificity of, and enlarging the substrate pool available to, Clp/Hsp100 chaperone or protease complexes. These roles can be assisted through the binding of an adaptor protein. Adaptor proteins bind to the Clp R domain, modulate the target specificity of the Clp/Hsp100 complex to a particular substrate of interest, and may also regulate the activity of the complex. The Clp R domain is monomeric and partially alpha helical. It is a single folding unit with pseudo 2-fold symmetry. The Clp R domain structure consists of two four-helix bundles connected by a flexible loop. This entry represents the Clp repeat (R) domain.
Protein Domain
Name: ClpA/B, conserved site 1
Type: Conserved_site
Description: The ClpA/B family of ATP-binding proteins includes the regulatory subunit of the ATP-dependent protease Clp, ClpA; heat shock proteins ClpB, 104 and 78; and chloroplast proteins CD4a (ClpC) and CD4b. The proteins are thought to protect cells from stress by controlling the aggregation and denaturation of vital cellular structures. They vary in size, but share two conserved regions of about 200 amino acids that each contain an ATP-binding site. This entry represents a conserved site found in the first conserved region.
Protein Domain
Name: HSP40/DnaJ peptide-binding
Type: Homologous_superfamily
Description: The Escherichia coli Hsp40 DnaJ and Hsp70 DnaK cooperate in the binding of proteins at intermediate stages of folding, assembly, and translocation across membranes. Binding of protein substrates to the DnaK C-terminal domain is controlled by ATP binding and hydrolysis in the N-terminal ATPase domain. The interaction of DnaJ with DnaK is mediated at least in part by the highly conserved N-terminal J-domain of DnaJ. The J-domain interaction is localized to the ATPase domain of DnaK and is likely to be dominated by electrostatic interactions. The J-domain may tether DnaK to DnaJ-bound substrates, which DnaK then binds with its C-terminal peptide-binding domain. The peptide-binding domain of DnaJ comprises a beta sandwich made up of 6 β-strands divided into 2 sheets.
Protein Domain
Name: Chaperone DnaJ, C-terminal
Type: Domain
Description: Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle. Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate. This domain consists of the C-terminal region of the DnaJ protein. The function of this domain is unknown. It is found associated with and .
DnaJ is a chaperone associated with the Hsp70 heat-shock system involved in protein folding and renaturation after stress. The two C-terminal domains, CTDI and this one (CTDII), are necessary for maintaining the J-domains in their specific relative positions.
Protein Domain
Name: Domain of unknown function DUF627, N-terminal
Type: Domain
Description: This domain represents the N-terminal region of several plant proteins of unknown function.
Protein Domain
Name: Domain of unknown function DUF629
Type: Domain
Description: This domain represents a region of several plant proteins of unknown function. A C2H2 zinc finger is predicted in this region in some family members, but the spacing between the cysteine residues is not conserved throughout the family.
Protein Domain
Name: WW domain binding protein 11
Type: Family
Description: Synonym(s): Rsp5 or WWP domain. The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple stranded β-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes. A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organisation; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein, amongst others. This entry represents WW domain-binding protein 11, which may play a role in the regulation of pre-mRNA processing, and also EARLY FLOWERING 5, which acts as a repressor of flowering in Arabidopsis thaliana.
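As an illustration of the proline motif quoted in the description above, the PROSITE-style pattern [AP]-P-P-[AP]-Y can be converted to a regular expression and scanned against a protein sequence. The following is a minimal Python sketch and not part of the database: the function names and the example sequence are hypothetical, and the converter assumes patterns built only from single residues and bracketed alternatives.

```python
import re

# Pattern quoted in the WW domain description (PROSITE-style notation).
PROLINE_MOTIF = "[AP]-P-P-[AP]-Y"

# Accepts only single residues or [..] classes joined by hyphens (an assumption
# of this sketch; full PROSITE syntax has more features that are not handled).
_SIMPLE_PATTERN = re.compile(r"(?:[A-Z]|\[[A-Z]+\])(?:-(?:[A-Z]|\[[A-Z]+\]))*")


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a regular expression string."""
    if not _SIMPLE_PATTERN.fullmatch(pattern):
        raise ValueError(f"unsupported pattern: {pattern!r}")
    return pattern.replace("-", "")


def find_motifs(sequence: str, pattern: str = PROLINE_MOTIF):
    """Return (1-based start position, matched residues) for every match.

    A lookahead is used so that overlapping matches are also reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


# Hypothetical sequence made up for this example; it is not a real protein.
EXAMPLE_SEQUENCE = "MGSPPPPYTAAPPAYQ"
print(find_motifs(EXAMPLE_SEQUENCE))  # [(4, 'PPPPY'), (11, 'APPAY')]
```

A match only shows that the sequence contains the motif; whether a WW domain actually binds that site would need experimental support.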
Protein Domain
Name: ClpA/B family
Type: Family
Description: This family belongs to the AAA+ (ATPase associated with diverse cellular activities) superfamily. Most proteins of this family (ClpA, ClpC, ClpD, ClpE, ClpX, HslU) are involved in proteolysis and associate with a separate proteolytic subunit (ClpP, HslV) to form the active protease. ClpB is not involved in proteolysis but rather acts in collaboration with the DnaK (Hsp70) chaperone system to disassemble and refold large protein aggregates. A group of ATP-binding proteins that includes the regulatory subunit of the ATP-dependent protease clpA; heat shock proteins clpB, 104 and 78; and chloroplast proteins CD4a (ClpC) and CD4b belong to this family. The proteins are thought to protect cells from stress by controlling the aggregation and denaturation of vital cellular structures. They vary in size, but share a domain which contains an ATP-binding site. These signatures, which span the ATP binding region, also identify the bacterial DNA polymerase III subunit tau, ATP-dependent protease La and the mitochondrial lon protease homologue; the latter two belong to MEROPS peptidase family S16.
Protein Domain
Name: Clp ATPase, C-terminal
Type: Domain
Description: Most Clp ATPases form complexes with peptidase subunits and are involved in protein degradation, though some, such as ClpB, do not associate with peptidases and are involved in protein disaggregation. This entry represents the C-terminal domain of Clp ATPases, often referred to as the D2-small domain, which forms a mixed α-β structure. Compared with the adjacent AAA D1-small domain it lacks the long coiled-coil insertion, and instead of helix C4 contains a β-strand (e3) that is part of a three-stranded β-pleated sheet. In Thermus thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit, thereby providing enough binding energy to stabilise the functional assembly.
Protein Domain
Name: ClpA/B, conserved site 2
Type: Conserved_site
Description: The ClpA/B family of ATP-binding proteins includes the regulatory subunit of the ATP-dependent protease Clp, ClpA; heat shock proteins ClpB, 104 and 78; and chloroplast proteins CD4a (ClpC) and CD4b. The proteins are thought to protect cells from stress by controlling the aggregation and denaturation of vital cellular structures. They vary in size, but share two conserved regions of about 200 amino acids that each contain an ATP-binding site. This entry represents a conserved site found in the second conserved region. Proteins containing this site include: Escherichia coli clpA, which acts as the regulatory subunit of the ATP-dependent protease clp; Rhodopseudomonas blastica clpA homolog; Escherichia coli heat shock protein clpB and homologues in other bacteria; Bacillus subtilis protein mecB; yeast heat shock protein 104 (gene HSP104), which is vital for tolerance to heat, ethanol and other stresses; Neurospora heat shock protein hsp98; yeast mitochondrial heat shock protein 78 (gene HSP78); CD4a and CD4b, two highly related tomato proteins that seem to be located in the chloroplast; Trypanosoma brucei protein clp; and Porphyra purpurea chloroplast encoded clpC.
Protein Domain
Name: L-lactate/malate dehydrogenase
Type: Family
Description: This family contains both lactate and malate dehydrogenases. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle. L-lactate dehydrogenase (LDH) catalyses the reversible NAD-dependent interconversion of pyruvate and L-lactate. In vertebrate muscles and in lactic acid bacteria it represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In bird and crocodilian eye lenses, LDH-B serves as a structural protein and is known as epsilon-crystallin. L-2-hydroxyisocaproate dehydrogenase (L-hicDH) catalyses the reversible and stereospecific interconversion between 2-ketocarboxylic acids and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs.
Protein Domain
Name: Lactate/malate dehydrogenase, C-terminal
Type: Domain
Description: L-lactate dehydrogenases are metabolic enzymes which catalyse the interconversion of L-lactate and pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. L-2-hydroxyisocaproate dehydrogenases are also members of the family. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle. This entry represents the C-terminal domain, which is thought to adopt an unusual alpha+beta fold.
Protein Domain
Name: Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal
Type: Homologous_superfamily
Description: This entry represents a structural motif found at the C terminus of lactate dehydrogenases and malate dehydrogenases, as well as at the C terminus of family 4 glycoside hydrolases. These domains have an unusual fold consisting of segregated α-helical and β-sheet regions, although they contain predominantly anti-parallel β-sheets. L-lactate dehydrogenases are metabolic enzymes that catalyse the interconversion of L-lactate and pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle. O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Glycoside hydrolase family 4 comprises enzymes with several known activities: 6-phospho-beta-glucosidase, 6-phospho-alpha-glucosidase and alpha-galactosidase.
Protein Domain
Name: Malate dehydrogenase, type 1
Type: Family
Description: This entry represents the NAD-dependent malate dehydrogenase found in eukaryotes and certain gamma proteobacteria. The enzyme is involved in the citric acid cycle as well as the glyoxylate cycle. Several isoforms exist in eukaryotes. In Saccharomyces cerevisiae, for example, there are cytoplasmic, mitochondrial and peroxisomal forms. Although malate dehydrogenases have in some cases been mistaken for lactate dehydrogenases due to the similarity of these two substrates and the apparent ease with which evolution can toggle these activities, critical residues have been identified which can discriminate between the two activities.
Protein Domain
Name: Membralin
Type: Family
Description: Membralin is evolutionarily highly conserved, though it appears to represent a unique protein family. The protein appears to contain several transmembrane regions. In humans it is expressed in certain cancers, particularly ovarian cancers. Membralin-like gene homologues have been identified in plants including grape, cotton and tomato.
Protein Domain
Name: P-type ATPase, subfamily IIIA
Type: Family
Description: Transmembrane ATPases are membrane-bound enzyme complexes/ion transporters that use ATP hydrolysis to drive the transport of protons across a membrane. Some transmembrane ATPases also work in reverse, harnessing the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. There are several different types of transmembrane ATPases, which can differ in function (ATP hydrolysis and/or synthesis), structure (e.g., F-, V- and A-ATPases, which contain rotary motors) and in the type of ions they transport. The different types include: F-ATPases (ATP synthases, F1F0-ATPases), which are found in mitochondria, chloroplasts and bacterial plasma membranes where they are the prime producers of ATP, using the proton gradient generated by oxidative phosphorylation (mitochondria) or photosynthesis (chloroplasts); V-ATPases (V1V0-ATPases), which are primarily found in eukaryotes, where they function as proton pumps that acidify intracellular compartments and, in some cases, transport protons across the plasma membrane, and which are also found in bacteria; A-ATPases (A1A0-ATPases), which are found in Archaea and function like F-ATPases, though with respect to their structure and some inhibitor responses, A-ATPases are more closely related to the V-ATPases; P-ATPases (E1E2-ATPases), which are found in bacteria and in eukaryotic plasma membranes and organelles, and function to transport a variety of different ions across membranes; and E-ATPases, which are cell-surface enzymes that hydrolyse a range of NTPs, including extracellular ATP. P-ATPases (also known as E1-E2 ATPases) (EC 3.6.3.-) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy.
There are many different classes of P-ATPases, which transport specific types of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2. This family represents the plasma membrane proton efflux P-type ATPase found in plants, fungi, protozoa, slime molds, and related bacterial and archaeal putative H(+)-ATPases. The best studied representative is from yeast.
Protein Domain
Name: UDP-glucose/GDP-mannose dehydrogenase, dimerisation
Type: Domain
Description: The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate. The enzymes have a wide range of functions. In plants, UDP-glucose dehydrogenase is an important enzyme in the synthesis of hemicellulose and pectin, which are the components of newly formed cell walls, while in zebrafish UDP-glucose dehydrogenase is required for cardiac valve formation. In Xanthomonas campestris, a plant pathogen, UDP-glucose dehydrogenase is required for virulence. GDP-mannose dehydrogenase catalyses the formation of GDP-mannuronic acid, which is the monomeric unit from which the exopolysaccharide alginate is formed. Alginate is secreted by a number of bacteria, which include Pseudomonas aeruginosa and Azotobacter vinelandii. In P. aeruginosa, alginate is believed to play an important role in the bacteria's resistance to antibiotics and the host immune response, while in A. vinelandii it is essential for the encystment process. This entry represents an alpha helical region that serves as the dimerisation interface for these enzymes.
Protein Domain
Name: Coiled-coil domain-containing protein 90-like
Type: Family
Description: This entry includes coiled-coil domain-containing proteins 90 (CCDC90) and related proteins. CCDC90A is a key regulator of the mitochondrial calcium uniporter (MCU) and hence was renamed MCUR1. A study in mammals and of the yeast homologue fmp32 reported that MCUR1 is a cytochrome c oxidase assembly factor and that it has only an indirect role as a regulator of MCU; however, subsequent publications confirmed the function of MCUR1 as a regulator of MCU. The role of CCDC90B proteins is still not known.
Protein Domain
Name: Glycosyl transferase, family 8
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 8 comprises enzymes with a number of known activities: lipopolysaccharide galactosyltransferase, lipopolysaccharide glucosyltransferase 1, glycogenin glucosyltransferase, inositol 1-α-galactosyltransferase, α-1,3-xylosyltransferase and β-1,3-glucuronyltransferase. These enzymes have a distant similarity to family GT_24.
Protein Domain
Name: ASXL, HARE-HTH domain
Type: Domain
Description: This domain, known as the HARE-HTH domain, adopts the winged helix-turn-helix fold and is predicted to bind DNA. It can be found at the N terminus of the ASXL protein. It can also be found in several other eukaryotic chromatin proteins (such as HB1 in plants), diverse restriction endonucleases and DNA glycosylases, the RNA polymerase delta subunit of Gram-positive bacteria and certain bacterial proteins that combine features of the RNA polymerase alpha-subunit and sigma factors. The genetic interaction of the HARE-HTH containing ASXL with the methyl cytosine hydroxylating Tet2 protein is suggestive of a role for the domain in discriminating sequences with DNA modifications such as hmC. Bacterial versions include fusions to diverse restriction endonucleases, and a DNA glycosylase where it may play a similar role in detecting modified DNA. Certain bacterial versions of the HARE-HTH domain show fusions to the helix-hairpin-helix domain of the RNA polymerase alpha subunit and the HTH domains found in regions 3 and 4 of the sigma factors. These versions are predicted to function as a novel inhibitor of the binding of RNA polymerase to transcription start sites, similar to the Bacillus delta protein. This domain consists of four α-helices (helices I-II-III-IV) and an antiparallel β-sheet composed of three short β-strands at the top of a "twisted tripod" formed by helices II, III, and IV. The Asx-like (Asxl) proteins include Asxl1-3. They are putative Polycomb group (PcG) proteins, which act by forming multiprotein complexes that are required to maintain the transcriptionally repressive state of homeotic genes throughout development.
Asxl1 is involved in transcriptional regulation mediated by ligand-bound retinoic acid receptors (RARs) and peroxisome proliferator-activated receptor gamma (PPARG). The delta protein is a dispensable subunit of Bacillus subtilis RNA polymerase (RNAP) that has major effects on the biochemical properties of the purified enzyme. In the presence of delta, RNAP displays an increased specificity of transcription, a decreased affinity for nucleic acids, and an increased efficiency of RNA synthesis because of enhanced recycling. The delta protein contains two distinct regions, an N-terminal domain and a glutamate and aspartate residue-rich C-terminal region. It participates in both the initiation and recycling phases of transcription.
Protein Domain
Name: Vacuolar protein sorting-associated protein 13, VPS13 adaptor binding domain
Type: Domain
Description: This entry represents the VPS13 adaptor binding (VAB) domain, previously known as SHR-BD, found in VPS13 [ , ]. These proteins interact with membrane-specific adaptor proteins such as Ypt35, Spo71 and the mitochondrial membrane protein Mcp1, to be recruited to different membranes. This domain interacts with Ypt35 which recruits VPS13 to endosomal and vacuolar membranes, and with Mcp1 to target VPS13 at mitochondria []. In plants, this domain is found to be the region which interacts with SHR or the SHORT-ROOT transcription factor, a regulator of root-growth and asymmetric cell division that separates ground tissue into endodermis and cortex. The plant protein containing the SHR-BD is named SHRUBBY or SHBY () [ ]. This domain interacts with Proline-X-Proline (Pro-X-Pro) motif present in receptor proteins at contact sites [].This domain comprises six repeated modules, each of them containing nine β-strands connected by loops and arranged into a β-sandwich [ ].VPS13 proteins have been implicated in processes including vesicle fusion, autophagy, and actin regulation. They bind phospholipids and act as channels that mediate the transfer of lipids between membranes at organelle contact sites [ , , ]. It has been proposed that members of this entry have the capacity to bind and likely transfer tens of glycerolipids at once. Yeast VPS13 acts at multiple cellular sites, namely the interface between mitochondria and the vacuole, on endosomes, on the nuclear-vacuole junction and the vacuole, depending on the carbon source and metabolic state. Most evidence showed that mammalian VPS13A, VPS13C and VPS13D localize at contacts between the ER and other organelles, i.e. VPS13A and VPS13D bridge the ER to mitochondria, VPS13C bridges the ER to late endosomes and lysosomes and VPS13B may localize to endosome-endosome contacts [, , ]. 
Mutations in the human VPS13 proteins (VPS13A-D) cause different diseases (chorea-acanthocytosis, Cohen syndrome, Parkinson's disease and spastic ataxia, respectively), which suggests they have different functions. Members of this entry belong to the repeating β-groove (RBG) superfamily. These proteins share a structure made of multiple repeating modules, each consisting of five β-strands followed by a loop.
Protein Domain
Name: Vacuolar protein sorting-associated protein 13
Type: Family
Description: Vacuolar protein sorting-associated protein 13 (VPS13) is involved in the delivery of proteins to the vacuole in vegetatively growing yeast and also regulates membrane morphogenesis during sporulation. It mediates the transfer of lipids between membranes at organelle contact sites and is involved in mitochondrial lipid homeostasis. In humans, the hereditary disorders chorea-acanthocytosis and Cohen syndrome are caused by mutations in members of this family that are orthologues of yeast VPS13 (VPS13A and VPS13B, respectively). Human VPS13A binds phospholipids and is required for the formation or stabilization of ER-mitochondria contact sites, which enable transfer of lipids between the ER and mitochondria. It is also required for efficient lysosomal protein degradation. These proteins belong to the repeating β-groove (RBG) superfamily.
Protein Domain
Name: Nodulin
Type: Family
Description: Nodulin is a plant protein of unknown function. It is induced during nodulation in legume roots after rhizobium infection.
Protein Domain
Name: Exportin-1/Importin-beta-like
Type: Domain
Description: The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments. This domain is found close to the N terminus of yeast exportin 1 (Xpo1, Crm1), as well as adjacent to the N-terminal domain of importin-beta. Exportin 1 is a nuclear export receptor that translocates proteins out of the nucleus; it interacts with leucine-rich nuclear export signal (NES) sequences in proteins to be transported, as well as with RanGTP. Importin-beta is a nuclear import receptor that translocates proteins into the nucleus; it interacts with RanGTP and importin-alpha, the latter binding the nuclear localisation signal (NLS) sequences in proteins to be transported.
Protein Domain
Name: Protein of unknown function DUF4336
Type: Family
Description: The function of these proteins is not known.
Protein Domain
Name: Pre-ATP-grasp domain superfamily
Type: Homologous_superfamily
Description: The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyse the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, glutathione synthetase and glutathionylspermidine synthase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP hydrolysis to activate their substrates. This superfamily represents the pre-ATP-grasp structural domain, which precedes the ATP-grasp domain in all superfamily members, and which usually occurs at the N terminus of the protein. The structure of the pre-ATP-grasp domain consists of three α/β/α layers, and is possibly a rudimentary form of the Rossmann fold.
Protein Domain
Name: Carbamoyl-phosphate synthetase large subunit-like, ATP-binding domain
Type: Domain
Description: Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and either glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase. Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites.
The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain. Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains. This entry represents the ATP-binding domain found in the large subunit of carbamoyl phosphate synthase, as well as in other proteins, including acetyl-CoA carboxylases and pyruvate carboxylases.
Protein Domain
Name: Biotin carboxylase-like, N-terminal domain
Type: Domain
Description: This domain is structurally related to the pre-ATP-grasp domain. It is found at the N terminus of biotin carboxylase enzymes and of the propionyl-CoA carboxylase A chain.
Protein Domain
Name: Carbamoyl-phosphate synthetase, large subunit oligomerisation domain
Type: Domain
Description: This entry represents the oligomerisation domain found in the large subunit of carbamoyl phosphate synthases as well as in certain other carboxy phosphate domain-containing enzymes. This domain forms a primarily α-helical fold.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom