Search results 101 to 200 out of 30763 for seed protein

Category: ProteinDomain
Type Details Score
Protein Domain
Name: RmlC-like cupin domain superfamily
Type: Homologous_superfamily
Description: RmlC (dTDP (deoxythymidine diphosphate)-4-dehydrorhamnose 3,5-epimerase) is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. RmlC is a dimer, each monomer being formed from two β-sheets arranged in a β-sandwich, where the substrate-binding site is located between the two sheets of both monomers. Other protein families contain domains that share this fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S.
Protein Domain
Name: Factor of DNA methylation 1-5/IDN2
Type: Family
Description: RNA-directed DNA methylation (RdDM) is a biological process in which non-coding RNA molecules direct the addition of DNA methylation to specific DNA sequences. This entry represents a sub-group of SGS3-LIKE plant proteins that are components of RNA-directed DNA methylation pathway (RdDM), including FDM1-5 and IDN2 from Arabidopsis [, ]. RdDM has been implicated in a number of regulatory processes in plants, such as maintaining transposable element silencing and genome stability, affecting gamete formation and seed viability and protecting the plant from other biotic stresses [].
Protein Domain
Name: Formin-like family, plant
Type: Family
Description: Formins (formin homology proteins) play a crucial role in the reorganisation of the actin cytoskeleton and associate with the fast-growing end (barbed end) of actin filaments. This entry represents the formin homologues from plants. Seed plants have two formin clades with numerous paralogues. They can be classified as class I and class II formins. Class I formins include an N-terminal membrane insertion signal, a predicted extracytoplasmic Pro-rich stretch, a transmembrane region, and C-terminal FH1 and FH2 domains. Though class II formins usually contain an N-terminal PTEN domain related to the human PTEN protein (implicated in the pathogenesis of Parkinson's disease), the N-termini of type-II plant formins do not contain any recognisable domain that can provide a clue to their biological function.
Protein Domain
Name: Necrogenic protein Nec1
Type: Family
Description: The Nec1 protein has necrogenic activity on excised potato tuber tissue, and the encoding gene is highly conserved in plant-pathogenic Streptomyces spp. The G+C content of nec1 indicates lateral transfer from an unrelated taxon, but its origins are unclear. Deletion analysis of nec1 demonstrated that the 151-amino-acid C-terminal region of the Nec1 protein is sufficient to confer necrogenic activity. Streptomyces turgidiscabies containing a nec1 deletion was greatly compromised in virulence on Arabidopsis thaliana (Mouse-ear cress), Nicotiana tabacum (Common tobacco), and Raphanus sativus (Radish) seedlings. The wild-type strain, S. turgidiscabies Car8, aggressively colonized and infected the root meristem of radish, whereas the delta-nec1 mutant Car811 did not. Taken together, the data suggest that Nec1 is a secreted virulence protein with a conserved plant cell target that acts early in plant infection [ ].
Protein Domain
Name: Proteinase inhibitor I7, squash
Type: Family
Description: The squash inhibitors form one of a number of serine proteinase inhibitor families. They belong to MEROPS inhibitor family I7, clan IE. They are generally annotated as either trypsin or elastase inhibitors (MEROPS peptidase family S1). The proteins, found exclusively in the seeds of the Cucurbitaceae, e.g. Citrullus lanatus (watermelon), Cucumis sativus (cucumber) and Momordica charantia (balsam pear), are approximately 30 residues in length and contain 6 Cys residues, which form 3 disulphide bonds. The inhibitors function by being taken up by a serine protease (such as trypsin), which cleaves the peptide bond between Arg/Lys and Ile residues in the N-terminal portion of the protein. Structural studies have shown that the inhibitor has an ellipsoidal shape, and is largely composed of β-turns. The fold and Cys connectivity of the proteins resemble those of potato carboxypeptidase A inhibitor.
Protein Domain
Name: RNA polymerase-associated protein Ctr9
Type: Family
Description: This entry includes budding yeast RNA polymerase-associated protein Ctr9 and its homologues from other yeasts, animals and plants. The homologue in fission yeast is known as tetratricopeptide repeat protein 1 (Tpr1). Budding yeast Ctr9 is part of the Paf1 complex, an RNA polymerase II-associated protein complex containing Paf1, Cdc73, Ctr9, Rtf1 and Leo1. The Paf1 complex is involved in histone modifications, transcription elongation and other gene expression processes that include transcript site selection. The human Paf1 complex (Paf1C) consists of Paf1, Cdc73, Ctr9, Rtf1, Leo1 and Wdr61 (Ski8). As in yeast, the human Paf1C has a central role in co-transcriptional histone modifications. Arabidopsis Paf1C-related proteins such as VIP4 (Leo1), VIP5 (Rtf1), ELF7 (Paf1), ELF8 (Ctr9) and ATXR7 (Set1) are required for the induction of seed dormancy. They control both germination and flowering time.
Protein Domain
Name: Metallothionein-like protein 3
Type: Family
Description: Metallothioneins (MTs) are small proteins with a high content of cysteine residues that bind various heavy metals. Plant MTs are classified into four types based on the arrangement of cysteine residues, and all are involved in the homeostasis of copper and other metals. This entry represents MT3, which in Arabidopsis is predominantly expressed in leaf mesophyll cells. It functions as a copper (Cu) and zinc (Zn) chelator and plays a role in Cu homeostasis, specifically in the remobilization of Cu from senescing leaves. The mobilization of Cu is important for seed development.
Protein Domain
Name: Kunitz inhibitor STI-like superfamily
Type: Homologous_superfamily
Description: The Kunitz-type soybean trypsin inhibitor (STI) family consists mainly of proteinase inhibitors from Leguminosae seeds. They belong to MEROPS inhibitor family I3, clan IC. They exhibit proteinase inhibitory activity against serine proteinases such as trypsin (MEROPS peptidase family S1) and subtilisin (MEROPS peptidase family S8), thiol proteinases (MEROPS peptidase family C1) and aspartic proteinases (MEROPS peptidase family A1). STI has a β-trefoil-type fold consisting of a closed barrel and a hairpin triplet, with internal pseudo threefold symmetry. The C-terminal domain of Clostridium spp. neurotoxins has an overall fold that is very similar to that of the Kunitz STI family. The tetanus toxin binds to the ganglioside receptor, GT1b, and is composed of light and heavy chains, where the light chain is responsible for toxicity; the Kunitz-like domain is found at the C terminus of the heavy chain, and is responsible for binding to sensitive cells.
Protein Domain
Name: Peroxisomal targeting signal 2 receptor
Type: Family
Description: Peroxisomal proteins catalyse metabolic reactions. The import of proteins from the cytosol into the peroxisomal matrix depends on more than a dozen peroxin (PEX) proteins, among which PEX5 and PEX7 serve as receptors that shuttle proteins bearing one of two peroxisome-targeting signals (PTSs) into the organelle. PEX5 is the PTS1 receptor, while PEX7 is the PTS2 receptor. In plants, PEX7 depends on PEX5 binding to deliver PTS2 cargo into the peroxisome, and PEX7 also facilitates PEX5 accumulation and import of PTS1 cargo into peroxisomes. This entry represents PEX7 from animals, fungi and plants. In plants it plays important roles in embryonic development, seedling establishment, and vegetative growth. The budding yeast PEX7 homologue, Pas7, is necessary for import of thiolase (a PTS2-containing protein). In humans PEX7 has been linked to several diseases, such as Rhizomelic chondrodysplasia punctata 1 (RCDP1) and Peroxisome biogenesis disorder 9B (PBD9B).
Protein Domain
Name: Parallel beta-helix repeat-2
Type: Repeat
Description: This model represents a tandem pair of an approximately 22-amino acid (each) repeat homologous to the β-strand repeats that stack in a right-handed parallel β-helix in the periplasmic C-5 mannuronan epimerase, AlgA, of Pseudomonas aeruginosa. A homology domain consisting of a longer tandem array of these repeats is described in the SMART database as CASH (SM00722), and is found in many carbohydrate-binding proteins and sugar hydrolases. A single repeat is represented by SM00710. This TIGRFAMs model represents a flavor of the parallel β-helix-forming repeat based on prokaryotic sequences only in its seed alignment, although it also finds many eukaryotic sequences.
Protein Domain
Name: Potassium channel KAT/AKT
Type: Family
Description: Potassium (K+) is an essential nutrient for plant growth and development. This entry represents a group of potassium channel proteins from plants, including KAT1-3, SKOR, GORK and AKT1/2/5/6 from Arabidopsis [ ]. Together with AtHAK5, AKT1 mediates high-affinity K+ uptake into roots during seedling establishment and post-germination growth []. KAT1 is required for guard cell K+ uptake during light-induced stomatal opening []. AKT6, also known as SPIK, plays a role in K+ uptake in the growing pollen tube []. AKT1 has been shown to be regulated by the CBL1-CIPK23 complex [, ]. SKOR and GORK are both outward-rectifying potassium channels [, ].
Protein Domain
Name: SUA-like, OCRE domain
Type: Domain
Description: SUA is an RNA-binding protein located in the nucleus and expressed in all plant tissues. It functions as a splicing factor that influences seed maturation by controlling alternative splicing of ABI3. The suppression of the cryptic ABI3 intron indicates a role of SUA in mRNA processing. SUA also interacts with the prespliceosomal component U2AF65, the larger subunit of the conserved pre-mRNA splicing factor U2AF. SUA contains two RNA recognition motifs surrounding a zinc finger domain, an OCtamer REpeat (OCRE) domain, and a Gly-rich domain close to the C terminus. The OCRE domain contains five repeats of an 8-residue motif, which were shown to form β-strands. Based on the architectures of proteins containing OCRE domains, a role in RNA metabolism and/or signalling has been proposed.
Protein Domain
Name: Avirulence Effector AvrLm4-7
Type: Domain
Description: AvrLm4-7 is found in Leptosphaeria maculans, an ascomycete fungus in the dothideomycete group which is responsible for stem canker (blackleg) of Brassica napus (oilseed rape, OSR) and other crucifers. AvrLm4-7 is one of six avirulence genes which encodes a small secreted protein strongly over-expressed at the onset of plant infection. This gene confers a dual recognition specificity by two distinct resistance genes of OSR, Rlm4 and Rlm7 and loss of AvrLm4 avirulence was demonstrated to be associated with a strong fitness cost. Structure and functional analysis of AvrLm4-7 protein show that it contains the motifs RAWG and RYRE, part of a well-structured protein region held together by disulfide bridges. Mutations in the RAWG motif or in the RYRE motif (especially mutations in both motifs) almost abolished the translocation of AvrLm4-7 into cells. Furthermore, loss of recognition of AvrLm4-7 by Rlm4 is caused by the mutation of a single glycine to an arginine residue located in a loop of the protein [ ].
Protein Domain
Name: Leo1-like protein
Type: Family
Description: In budding yeast, Leo1 is part of the Paf1 complex, an RNA polymerase II-associated protein complex containing Paf1, Cdc73, Ctr9, Rtf1 and Leo1. The Paf1 complex is involved in histone modifications, transcription elongation and other gene expression processes that include transcript site selection. This entry also includes Leo1 homologues from animals and plants. Human Leo1, also known as RDL, is a component of the human Paf1 complex (Paf1C), which consists of Paf1, Cdc73, Ctr9, Rtf1, Leo1 and Wdr61 (Ski8). As in yeast, the human Paf1C has a central role in co-transcriptional histone modifications. Human Leo1 promotes senescence of 2BS fibroblasts. Arabidopsis Paf1C-related proteins such as VIP4 (Leo1), VIP5 (Rtf1), ELF7 (Paf1), ELF8 (Ctr9) and ATXR7 (Set1) are required for the induction of seed dormancy. They control both germination and flowering time.
Protein Domain
Name: Plant galacturonosyltransferase GAUT
Type: Family
Description: Galacturonosyltransferase 1 (GAUT1) is an α-1,4-D-galacturonosyltransferase that transfers galacturonic acid from uridine 5'-diphosphogalacturonic acid onto the pectic polysaccharide homogalacturonan. The GAUT1-related gene family from Arabidopsis thaliana encodes 15 GAUT and 10 GAUT-like (GATL) proteins. This entry represents the GAUT proteins. Mutants for GAUT genes indicate that GAUTs are involved in pectin and xylan biosynthesis. Mutants of GAUT6, 8, 9, 10, 11, 12, 13 and 14 have aberrant wall composition. They show distinct patterns, suggesting that these GAUTs have at least six unique functions in pectin and/or xylan biosynthesis. GAUT12 (IRX8) is involved in the synthesis of cell wall polysaccharides; mutants in this gene are deficient in homogalacturonan and glucuronoxylan. Similarly, GAUT8, also known as QUASIMODO1, affects homogalacturonan and xylan biosynthesis. GAUT11 is involved in the production of seed testa cell wall and mucilage.
Protein Domain
Name: Mannose-6-phosphate isomerase, type II, C-terminal
Type: Domain
Description: Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined. The type II phosphomannose isomerases are bifunctional enzymes. This entry covers the isomerase region of the protein. The guanosine diphospho-D-mannose pyrophosphorylase region is described in another InterPro entry.
Protein Domain
Name: Acyl carrier protein, chloroplastic
Type: Family
Description: Acyl carrier protein (ACP) is a highly conserved cofactor protein required by Type II fatty acid synthases (FASs). Plant fatty acid biosynthesis occurs in both plastids and mitochondria. In Arabidopsis, eight gene loci can be recognized by sequence homology to encode ACP isoforms, five of which are plastidic ACPs (ptACPs), and three of which appear to be mitochondrial ACPs (mtACPs). This entry represents the ptACPs, which include AtACP1-5 from Arabidopsis. From phylogenetic studies, they can be classified in two groups, one comprising AtACP1, AtACP2, AtACP3 and AtACP5, the other comprising AtACP4 and ACP homologues from related species. These proteins respond to different abiotic stresses such as salt, drought and deficiencies in nitrogen, phosphorus, potassium and iron. AtACP1 (At3g05020), AtACP2 (At1g54580), and AtACP3 (At1g54630) are nearly constitutively expressed in leaves, roots, and seeds, whereas AtACP4 (At4g25050) is predominantly expressed in leaves and is induced upon illumination of plants, and AtACP5 (At5g27200) is preferentially expressed in roots and down-regulated by salt stress.
Protein Domain
Name: Mannose-6-phosphate isomerase
Type: Family
Description: Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined. This group represents a mannose-6-phosphate isomerase.
Protein Domain
Name: Mannose-6-phosphate isomerase, Firmicutes type, short form
Type: Family
Description: Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors, and in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined. This group represents a mannose-6-phosphate isomerase, Firmicutes type, short form.
Protein Domain
Name: Translation elongation factor, selenocysteine-specific
Type: Family
Description: In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3' or 5' non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation. This family describes the elongation factor SelB, a close homologue of EF-Tu. It may function by replacing EF-Tu. A C-terminal domain not found in EF-Tu is in all SelB sequences in the seed alignment except that from Methanocaldococcus jannaschii (Methanococcus jannaschii). This family should not include an equivalent protein for eukaryotes.
Protein Domain
Name: RNA polymerase II associated factor Paf1
Type: Family
Description: In budding yeast, Paf1 is part of the Paf1 complex, an RNA polymerase II-associated protein complex containing Paf1, Cdc73, Ctr9, Rtf1 and Leo1. The Paf1 complex is involved in histone modifications, transcription elongation and other gene expression processes that include transcript site selection. This entry also includes Paf1 homologues from animals and plants. Human Paf1, also known as PD2 (pancreatic differentiation 2), is associated with tumorigenesis. The human Paf1 complex (Paf1C) consists of Paf1, Cdc73, Ctr9, Rtf1, Leo1 and Wdr61 (Ski8). As in yeast, the human Paf1C has a central role in co-transcriptional histone modifications. The human Paf1 complex also has a crucial role in the antiviral response. Arabidopsis Paf1C-related proteins such as VIP4 (Leo1), VIP5 (Rtf1), ELF7 (Paf1), ELF8 (Ctr9) and ATXR7 (Set1) are required for the induction of seed dormancy. They control both germination and flowering time.
Protein Domain
Name: PIP2/PIPL1
Type: Family
Description: This entry includes PAMP-induced secreted peptide 2 (PIP2) and PAMP-INDUCED PEPTIDE-LIKE 1 (PIPL1, also known as CEP16 or PREPIPL1) from Arabidopsis, which are part of the PIP/PIPL family. These secreted proteins contain two conserved core SGPS motifs at the C terminus and the GxGH motif at the extreme C terminus. The double peptide motif might be processed into two different peptides or, alternatively, may act as a functional unit able to interact with distinct binding sites, resulting in the activation of different pathways. PIP2 is involved in innate immune and stress responses. It also acts as a negative regulator of root growth. PIPL1 is involved in seed development and is also induced by stress.
Protein Domain
Name: RmlC-like jelly roll fold
Type: Homologous_superfamily
Description: RmlC (deoxythymidine diphosphate-4-dehydrorhamnose 3,5-epimerase) is a mainly beta class protein with a jelly roll-like topology. It is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. This entry represents the domain with the jelly roll-like fold. Other protein families containing this domain include glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S. The cAMP-binding domains found in the cAMP receptor protein (CRP) family display a similar β-roll architecture consisting of eight antiparallel β-strands and three helical segments. These proteins include CooA, a CO-sensing haem protein that functions as a transcription activator, and the CnbD (cyclic nucleotide binding domain) of the HCN cation channel, in which cAMP binding modulates gating of the channel.
Protein Domain
Name: WD-repeat protein SPA1/2/3/4
Type: Family
Description: In Arabidopsis, SPA1/2/3/4 play a central role in suppression of photomorphogenesis. SPA1 and SPA2 predominate in dark-grown seedlings, whereas SPA3 and SPA4 prevalently regulate the elongation growth in adult plants [ ]. SPAs contain a kinase-like domain, a coiled-coil domain and the WD-repeats. SPAs and COP1 (a ring finger E3 ubiquitin ligase) can form homo- and heterodimers via their respective coiled-coil domains, and the COP1/SPA complex forms a tetramer of two COP1 and two SPA proteins []. The SPA proteins can self-associate or interact with each other, forming a heterogeneous group of SPA-COP1 complexes []. Besides recognizing substrates, both COP1 and SPA bind DDB1 in the CUL4 complex through their C-terminal WD-repeat domains. They serve as DDB1-CUL4-associated factors (DCAFs) similar to other substrate adaptors in CUL4-based E3 ligases. SPA1 interacts with photoreceptor cry2 via its kinase-like domain, with cry1 via its WD-repeat domain and with phytochromes possibly via both []. SPAs have also been shown to regulate the phyB-PIF4 module at high ambient temperature [].
Protein Domain
Name: Proteinase inhibitor I20
Type: Family
Description: Members of the potato peptidase inhibitor II family are proteinase inhibitors that belong to MEROPS inhibitor family I20, clan IA and are restricted to plants. They inhibit serine peptidases belonging to MEROPS peptidase family S1. They have a multidomain structure, which permits circular permutation of the sequences. It has been shown that some naturally occurring Pin2 proteins have an 'ancestral' circularly permuted structure. Circular permutation or rearrangement of sequences has also been observed between species, such as favin from Vicia faba and the lectin concanavalin A from Canavalia ensiformis, or amongst members of the plant aspartyl proteinases and human lung surfactant proteins. This family of proteinase inhibitors is present in seeds, leaves and other organs. Perhaps the best known representatives are the wound-induced proteinase inhibitors, which contain up to eight sequence repeats (the 'IP repeats'). The sequence of the IP repeats is quite variable; only the cysteines constituting the four disulphide bridges and a single proline residue are conserved throughout all the known repeat sequences. The structure of the proteinase inhibitor complex is known.
Protein Domain
Name: E3 ubiquitin ligase UBR4, C-terminal
Type: Domain
Description: Proteins containing this domain include E3 ubiquitin ligase UBR4 from animals, auxin transport protein BIG from plants and protein purity of essence from fruit flies. UBR4, also known as p600, is a member of the N-recognin family, which contains proven and predicted E3 ligases that recognize and degrade proteins containing destabilizing N termini. It is involved in anoikis, viral transformation and protein degradation. It also has roles in neurogenesis, neuronal migration, neuronal signaling and survival in mammalian brains. BIG is required for auxin efflux and polar auxin transport (PAT), influencing auxin-mediated developmental responses (e.g. cell elongation, apical dominance, lateral root production, inflorescence architecture, general growth and development). Auxin transport protein BIG is also involved in the elongation of the pedicels and stem internodes through auxin action, the expression/modulation of light-regulated genes, repression of CAB1 and CAB3 genes in etiolated seedlings, etc. Protein purity of essence, found in Drosophila melanogaster, has a role in growth of the perineurial glial layer of the larval peripheral nerve, and it may have a role in male fertility and eye development or function.
Protein Domain
Name: NET domain
Type: Domain
Description: The bromodomain and extraterminal (BET) proteins are a class of transcriptional regulators whose members can be found in animals, plants and fungi. BET proteins are involved in diverse cellular phenomena such as meiosis, cell-cycle control, and homeosis and have been suggested to modulate chromatin structure and affect transcription via a sequence-independent mechanism. BET proteins are defined as having one (plants) or two (animals/yeast) bromodomains and an Extra Terminal (ET) domain. The ET domain consists of three separate regions, only one of which, the N-terminal ET (NET) domain, is conserved in all BET proteins. The function of the NET domain is assumed to be protein binding. The structure of the NET domain comprises three α-helices and a characteristic loop region of an irregular but well-defined structure. The NET structure has an acidic patch that forms a continuous ridge with a hydrophobic cleft, which may interact with other proteins and/or DNA. Some proteins known to contain a NET domain include: human RING3 (now designated Brd2); murine MCAP (now designated Brd4); Drosophila Fsh; yeast Bdf1 and Bdf2; and Arabidopsis imbibition-inducible (IMB1), which plays a role in abscisic acid (ABA) and phytochrome A (phyA) mediated responses of seed germination.
Protein Domain
Name: E3 ubiquitin ligase UBR4-like
Type: Family
Description: This entry includes E3 ubiquitin ligase UBR4 from animals, auxin transport protein BIG from plants and protein purity of essence from fruit flies. UBR4, also known as p600, is a member of the N-recognin family, which contains proven and predicted E3 ligases that recognise and degrade proteins containing destabilising N termini. It is involved in anoikis, viral transformation and protein degradation. It also has roles in neurogenesis, neuronal migration, neuronal signaling and survival in mammalian brains. BIG is required for auxin efflux and polar auxin transport (PAT), influencing auxin-mediated developmental responses (e.g. cell elongation, apical dominance, lateral root production, inflorescence architecture, general growth and development). Auxin transport protein BIG is also involved in the elongation of the pedicels and stem internodes through auxin action, the expression/modulation of light-regulated genes, repression of CAB1 and CAB3 genes in etiolated seedlings, etc. Protein purity of essence, found in Drosophila melanogaster, has a role in growth of the perineurial glial layer of the larval peripheral nerve, and it may have a role in male fertility and eye development or function.
Protein Domain
Name: Tol-Pal system, TolA
Type: Family
Description: Tol proteins are involved in the translocation of group A colicins. Colicins are bacterial protein toxins, which are active against Escherichia coli and other related species. TolA is anchored to the cytoplasmic membrane by a single membrane-spanning segment near the N terminus, leaving most of the protein exposed to the periplasm. TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cut-offs are based largely on conserved operon structure. The Tol-Pal complex is required for maintaining outer membrane integrity, and is also involved in transport (uptake) of colicins and filamentous DNA, and implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins OmpC, PhoE and LamB.
Protein Domain
Name: HeLo domain superfamily
Type: Homologous_superfamily
Description: This N-terminal domain, HeLo, has a prion-inhibitory effect in cis on its own prion-forming domain (PFD) and in trans on HET-s prion propagation. The domain is found exclusively in the fungal kingdom. Its structure, as it occurs in the heterokaryon incompatibility proteins HET-s and HET-S, consists of two bundles of α-helices that pack into a single globular domain. The domain boundary determined from its structure and from protease-resistance experiments overlaps with the C-terminal prion-forming domain of HET-s. The HeLo domains of HET-s and HET-S are very similar and their few differences (and not the prion-forming domains) determine the compatibility phenotype of the fungi in which the proteins are expressed. The mechanism of HeLo domain function in heterokaryon incompatibility is still under investigation; however, the HeLo domain is found in protein architectures similar to those of other cell death and apoptosis-inducing domains. The only other HeLo protein to which a function has been associated is LopB from Leptosphaeria maculans. Although its specific role in L. maculans is unknown, LopB mutants have impaired ability to form lesions on oilseed rape. The HeLo domain is not related to the HET domain, which is another domain involved in heterokaryon incompatibility.
Protein Domain
Name: SWEET sugar transporter
Type: Family
Description: This family contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologues are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter. Homologues of SWEETs have been identified in bacteria. The founding member of the SWEET family, MtN3, was identified as a nodulin-specific EST in the legume Medicago truncatula. Another protein in this family may be involved in activation and expression of recombination activation genes (RAGs). This family contains a region of two transmembrane helices that is found in two copies in most members of the family.
Protein Domain
Name: 1-Cys peroxiredoxin
Type: Family
Description: This subfamily of peroxiredoxins (known as 1-cys PRX) is composed of PRXs containing only one conserved cysteine, which serves as the peroxidatic cysteine. They are homodimeric thiol-specific antioxidant (TSA) proteins that confer a protective role in cells by reducing and detoxifying hydrogen peroxide, peroxynitrite, and organic hydroperoxides. As with all other PRXs, a cysteine sulfenic acid intermediate is formed upon reaction of 1-cys PRX with its substrates. Having no resolving cysteine, the oxidized enzyme is resolved by an external small-molecule or protein reductant such as thioredoxin or glutaredoxin. Similar to typical 2-cys PRX, 1-cys PRX forms a functional dimeric unit with a B-type interface, as well as a decameric structure which is stabilized in the reduced form of the enzyme. Other oligomeric forms, tetramers and hexamers, have also been reported. Mammalian 1-cys PRX is localized in the cytosol and is expressed at high levels in brain, eye, testes and lung. The seed-specific plant 1-cys PRXs protect tissues from reactive oxygen species during desiccation and are also called rehydrins.
Protein Domain
Name: Lipocalin Blc-like
Type: Domain
Description: This entry represents the lipocalin/cytosolic fatty-acid binding domain of Blc and similar proteins predominantly found in bacteria and plants. Escherichia coli bacterial lipocalin (Blc, also known as YjeL) is an outer membrane lipoprotein involved in the storage or transport of lipids necessary for membrane maintenance under stressful conditions. Blc has a binding preference for lysophospholipids. This entry also includes eukaryotic lipocalins such as Arabidopsis thaliana temperature-induced lipocalin-1 (TIL), which is involved in thermotolerance and in tolerance of oxidative, salt, drought and high light stress, and is needed for seed longevity by ensuring the integrity of polyunsaturated lipids. These proteins have a large β-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.
Protein Domain
Name: Prion-inhibition and propagation, HeLo domain
Type: Domain
Description: This N-terminal domain, HeLo, has a prion-inhibitory effect in cis on its own prion-forming domain (PFD) and in trans on HET-s prion propagation. The domain is found exclusively in the fungal kingdom. Its structure, as it occurs in the heterokaryon incompatibility proteins HET-s and HET-S, consists of two bundles of α-helices that pack into a single globular domain. The domain boundary determined from its structure and from protease-resistance experiments overlaps with the C-terminal prion-forming domain of HET-s. The HeLo domains of HET-s and HET-S are very similar, and their few differences (and not the prion-forming domains) determine the compatibility phenotype of the fungi in which the proteins are expressed. The mechanism of HeLo domain function in heterokaryon incompatibility is still under investigation; however, the HeLo domain is found in protein architectures similar to those of other cell death and apoptosis-inducing domains. The only other HeLo protein to which a function has been assigned is LopB from Leptosphaeria maculans. Although its specific role in L. maculans is unknown, LopB- mutants have an impaired ability to form lesions on oilseed rape. The HeLo domain is not related to the HET domain, which is another domain involved in heterokaryon incompatibility.
Protein Domain
Name: Proteinase inhibitor I12, Bowman-Birk
Type: Domain
Description: This family of eukaryotic proteinase inhibitors belongs to MEROPS inhibitor family I12, clan IF. They predominantly inhibit serine peptidases of the S1 family. They play a role in defense response against pathogens and insects, but they have also been studied as therapeutic agents in cancer and inflammatory disorders. Exceptionally, cowpea trypsin inhibitor inhibits a cathepsin L-like cysteine proteinase CPL-1 from the nematode Heterodera glycines. The Bowman-Birk inhibitor family is one of the numerous families of serine proteinase inhibitors. They have a duplicated structure and generally possess two distinct inhibitory sites. These inhibitors are primarily found in plants and in particular in the seeds of legumes as well as in cereal grains. In cereals they exist in two forms, one of which is a duplication of the basic structure. Proteins of the Bowman-Birk inhibitor family of serine proteinase inhibitors interact with the enzymes they inhibit via an exposed surface loop that adopts the canonical proteinase inhibitory conformation. The resulting non-covalent complex renders the proteinase inactive. This inhibition mechanism is common for the majority of serine proteinase inhibitor proteins and many analogous examples are known. A particular feature of the Bowman-Birk inhibitor protein, however, is that the interacting loop is a particularly well-defined disulphide-linked short β-sheet region.
Protein Domain
Name: Bowman-Birk type proteinase inhibitor
Type: Homologous_superfamily
Description: This family of eukaryotic proteinase inhibitors belongs to MEROPS inhibitor family I12, clan IF. They predominantly inhibit serine peptidases of the S1 family. They play a role in defense response against pathogens and insects, but they have also been studied as therapeutic agents in cancer and inflammatory disorders. Exceptionally, cowpea trypsin inhibitor inhibits a cathepsin L-like cysteine proteinase CPL-1 from the nematode Heterodera glycines. The Bowman-Birk inhibitor family is one of the numerous families of serine proteinase inhibitors. They have a duplicated structure and generally possess two distinct inhibitory sites. These inhibitors are primarily found in plants and in particular in the seeds of legumes as well as in cereal grains. In cereals they exist in two forms, one of which is a duplication of the basic structure. Proteins of the Bowman-Birk inhibitor family of serine proteinase inhibitors interact with the enzymes they inhibit via an exposed surface loop that adopts the canonical proteinase inhibitory conformation. The resulting non-covalent complex renders the proteinase inactive. This inhibition mechanism is common for the majority of serine proteinase inhibitor proteins and many analogous examples are known. A particular feature of the Bowman-Birk inhibitor protein, however, is that the interacting loop is a particularly well-defined disulphide-linked short β-sheet region.
Protein Domain
Name: Defensin, plant
Type: Family
Description: The following small plant proteins are evolutionarily related:

  • Gamma-thionins from Triticum aestivum (Wheat) endosperm (gamma-purothionins) and gamma-hordothionins from Hordeum vulgare (Barley), which are toxic to animal cells and inhibit protein synthesis in cell-free systems.
  • A flower-specific thionin (FST) from Nicotiana tabacum (Common Tobacco).
  • Antifungal proteins (AFP) from the seeds of Brassicaceae species such as radish, mustard, turnip and Arabidopsis thaliana (Thale Cress).
  • Inhibitors of insect alpha-amylases from sorghum.
  • Probable protease inhibitor P322 from Solanum tuberosum (Potato).
  • A germination-related protein from Vigna unguiculata (Cowpea).
  • Anther-specific protein SF18 from sunflower. SF18 is a protein that contains a gamma-thionin domain at its N terminus and a proline-rich C-terminal domain.
  • Glycine max (Soybean) sulphur-rich protein SE60.
  • Vicia faba (Broad bean) antibacterial peptides fabatin-1 and -2.

In their mature form, these proteins generally consist of about 45 to 50 amino-acid residues. As shown in the following schematic representation, these peptides contain eight conserved cysteines involved in disulphide bonds.

    +-------------------------------------------+
    |          +-------------------+            |
    |          |                   |            |
  xxCxxxxxxxxxxCxxxxxCxxxCxxxxxxxxxCxxxxxxCxCxxxC
                     |   |                | |
                     +---|----------------+ |
                         +------------------+

  'C': conserved cysteine involved in a disulphide bond.

The folded structure of gamma-purothionin is characterised by a well-defined 3-stranded anti-parallel β-sheet and a short α-helix. Three disulphide bridges are located in the hydrophobic core between the helix and sheet, forming a cysteine-stabilised α-helical motif. This structure differs from that of the plant alpha- and beta-thionins, but is analogous to scorpion toxins and insect defensins.
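The cysteine layout in the schematic can be read off programmatically. The following is a minimal sketch, not part of the database entry: it takes the consensus string from the schematic, lists the 1-based position of each conserved cysteine, and maps the disulphide connectivity onto those positions. The pairing (1-8, 2-5, 3-6, 4-7 by cysteine order) is an assumption read from the connecting lines of the schematic, and the function names are hypothetical. Real family members vary in the spacing between cysteines, so the positions apply only to the schematic itself.

```python
# Consensus string copied from the schematic in the description;
# 'C' marks a conserved cysteine, 'x' any residue.
SCHEMATIC = "xxCxxxxxxxxxxCxxxxxCxxxCxxxxxxxxxCxxxxxxCxCxxxC"

# Assumed disulphide connectivity, by cysteine ordinal (1-based),
# as read from the connecting lines of the schematic.
BOND_ORDINALS = [(1, 8), (2, 5), (3, 6), (4, 7)]


def cysteine_positions(pattern: str) -> list[int]:
    """Return the 1-based position of every 'C' in the pattern."""
    return [i + 1 for i, ch in enumerate(pattern) if ch == "C"]


def disulphide_pairs(pattern: str) -> list[tuple[int, int]]:
    """Translate bond ordinals into residue positions within the pattern."""
    pos = cysteine_positions(pattern)
    return [(pos[a - 1], pos[b - 1]) for a, b in BOND_ORDINALS]


if __name__ == "__main__":
    print(cysteine_positions(SCHEMATIC))
    print(disulphide_pairs(SCHEMATIC))
```

Keeping the connectivity as ordinals rather than fixed positions means the same pairing can be applied to any sequence in which the eight cysteines have been located.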
Protein Domain
Name: Pex19 protein
Type: Family
Description: Peroxisomes form an intracellular compartment, bounded by a typical lipid bilayer membrane. Peroxisome functions are often specialised by organism and cell type; two widely distributed and well-conserved functions are H2O2-based respiration and fatty acid beta-oxidation. Other functions include ether lipid (plasmalogen) synthesis and cholesterol synthesis in animals, the glyoxylate cycle in germinating seeds ("glyoxysomes"), photorespiration in leaves, glycolysis in trypanosomes ("glycosomes"), and methanol and/or amine oxidation and assimilation in some yeasts. PEX genes encode the machinery ("peroxins") required to assemble the peroxisome. Membrane assembly and maintenance requires three of these (peroxins 3, 16, and 19) and may occur without the import of the matrix (lumen) enzymes. Matrix protein import follows a branched pathway of soluble recycling receptors, with one branch for each class of peroxisome targeting sequence (two are well characterised), and a common trunk for all. At least one of these receptors, Pex5p, enters and exits peroxisomes as it functions. Proliferation of the organelle is regulated by Pex11p. Peroxisome biogenesis is remarkably conserved among eukaryotes. A group of fatal, inherited neuropathologies are recognised as peroxisome biogenesis diseases. Pex19 is involved in membrane assembly and maintenance and functions as a receptor and chaperone of peroxisomal membrane proteins (PMPs).
Protein Domain
Name: QUIRKY-like
Type: Family
Description: This entry includes a group of plant proteins, such as QUIRKY and FTIP1/3/4/7 from Arabidopsis. These are Multiple C2 domain and Transmembrane region Proteins (MCTPs), which are involved in Ca2+ signalling at the membrane. Plant MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. MCTPs are one of four protein classes that are anchored to membranes via a transmembrane region, the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs, and they are unique in that they bind Ca2+ but not phospholipids. QUIRKY may contribute to plant organogenesis mediated by the receptor-like kinase STRUBBELIG and may play a role in Ca2+-dependent signaling and membrane trafficking. It has been shown to be required for the appropriate spatial expression of several epidermal cell fate regulators. FTIP1 is involved in the export of FT from the phloem companion cells to the sieve elements through the plasmodesmata, and it regulates flowering time under long days. FTIP3/4 play an essential role in mediating proliferation and differentiation of shoot stem cells. FTIP7 promotes nuclear translocation of the transcription factor OSH1 and reduces auxin levels at a late stage of anther development, after meiosis of microspore mother cells. It is necessary for normal anther dehiscence and seed setting.
Protein Domain
Name: Mannose-1-phosphate guanylyltransferase/mannose-6-phosphate isomerase
Type: Family
Description: This enzyme is known to be bifunctional, acting as both mannose-6-phosphate isomerase (PMI) and mannose-1-phosphate guanylyltransferase in Pseudomonas aeruginosa, Xanthomonas campestris, and Acetobacter xylinus. The literature on the enzyme from Escherichia coli attributes mannose-6-phosphate isomerase activity to an adjacent gene, but the present sequence has not been shown to lack the activity. The PMI domain lies at the C terminus. Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors; in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined.
Protein Domain
Name: Expansin, cellulose-binding-like domain superfamily
Type: Homologous_superfamily
Description: Expansins are secreted proteins of 25 to 27 kDa that were isolated first from young cucumber seedlings and subsequently from other plant tissues. Expression of expansin genes correlates with growth of cells. Increase in expansin content also occurs during fruit ripening. Expansins act on the cell wall to promote its extensibility. The model for their mechanism of action postulates that expansins break non-covalent bonds between cell-wall polysaccharides, thereby permitting pressure-dependent expansion of the cell. Group-I pollen allergens of grasses have limited but significant sequence homology to expansin. These proteins are the main causative agent of hay fever and seasonal asthma induced by grass pollen. Extracts containing group-I allergens are also active in loosening cell walls. Group-I pollen allergens and related proteins in vegetative tissues have been classified as β-expansins, whereas the earlier discovered expansins are now referred to as α-expansins. Expansins consist of two domains closely packed and aligned so as to form a long, shallow groove with potential to bind a glycan backbone of ~10 sugar residues. The N-terminal cysteine-rich domain has distant sequence similarity to family-45 endoglucanases (EG45-like domain). The ~90-residue C-terminal domain may function as a cellulose-binding domain (CBD). It is composed of eight β-strands assembled into two antiparallel β-sheets. The two β-sheets are at slight angles to each other and form a β-sandwich similar to the Ig fold. This entry represents the expansin CBD-like domain superfamily.
Protein Domain
Name: Pex19, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Peroxisomes form an intracellular compartment, bounded by a typical lipid bilayer membrane. Peroxisome functions are often specialised by organism and cell type; two widely distributed and well-conserved functions are H2O2-based respiration and fatty acid beta-oxidation. Other functions include ether lipid (plasmalogen) synthesis and cholesterol synthesis in animals, the glyoxylate cycle in germinating seeds ("glyoxysomes"), photorespiration in leaves, glycolysis in trypanosomes ("glycosomes"), and methanol and/or amine oxidation and assimilation in some yeasts. PEX genes encode the machinery ("peroxins") required to assemble the peroxisome. Membrane assembly and maintenance requires three of these (peroxins 3, 16, and 19) and may occur without the import of the matrix (lumen) enzymes. Matrix protein import follows a branched pathway of soluble recycling receptors, with one branch for each class of peroxisome targeting sequence (two are well characterised), and a common trunk for all. At least one of these receptors, Pex5p, enters and exits peroxisomes as it functions. Proliferation of the organelle is regulated by Pex11p. Peroxisome biogenesis is remarkably conserved among eukaryotes. A group of fatal, inherited neuropathologies are recognised as peroxisome biogenesis diseases. Pex19 is involved in membrane assembly and maintenance and functions as a receptor and chaperone of peroxisomal membrane proteins (PMPs). This superfamily represents the C-terminal domain of Pex19, which is assembled in a three-helical bundle and represents the mPTS (PMP-targeting signal) binding site.
Protein Domain
Name: Mannose-6-phosphate isomerase, type I
Type: Family
Description: Mannose-6-phosphate isomerase or phosphomannose isomerase (PMI) is the enzyme that catalyses the interconversion of mannose-6-phosphate and fructose-6-phosphate. In eukaryotes PMI is involved in the synthesis of GDP-mannose, a constituent of N- and O-linked glycans and GPI anchors; in prokaryotes it participates in a variety of pathways, including capsular polysaccharide biosynthesis and D-mannose metabolism. PMIs belong to the cupin superfamily, whose functions range from isomerase and epimerase activities involved in the modification of cell wall carbohydrates in bacteria and plants, to non-enzymatic storage proteins in plant seeds, and transcription factors linked to congenital baldness in mammals. Three classes of PMI have been defined. Type I includes eukaryotic PMI and the enzyme encoded by the manA gene in enterobacteria. PMI has a bound zinc ion, which is essential for activity. A crystal structure of PMI from Candida albicans shows that the enzyme has three distinct domains. The active site lies in the central domain, contains a single essential zinc atom, and forms a deep, open cavity of suitable dimensions to contain M6P or F6P. The central domain is flanked by a helical domain on one side and a jelly-roll like domain on the other.
Protein Domain
Name: Histone-lysine N-methyltransferase Set1-like
Type: Family
Description: The COMPASS complex (complex proteins associated with Set1) is conserved in yeasts and in other eukaryotes up to humans. This entry represents Set1 and its homologues. Set1 is a methyltransferase and the catalytic component of COMPASS that produces trimethylated histone H3 at Lys(4). The yeast COMPASS (Set1C) complex specifically mono-, di- and trimethylates histone H3 to form H3K4me1/2/3, which subsequently plays a role in telomere length maintenance and transcription elongation regulation. In yeasts, the Set1C complex consists of Set1(2), Bre2(2), Spp1(2), Sdc1(1), Shg1(1), Swd1(1), Swd2(1), and Swd3(1). In animals, SETD1A/B are histone methyltransferases that produce mono-, di-, and trimethylated histone H3 at 'Lys-4'. However, if the 'Lys-9' residue is already methylated, 'Lys-4' will not be. The 'Lys-4' methylation is a tag for epigenetic transcriptional activation. The animal COMPASS complex is composed of at least the catalytic subunit (SETD1A or SETD1B), WDR5, WDR82, RBBP5, ASH2L/ASH2, CXXC1/CFP1, HCFC1 and DPY30. ATXR7, the Arabidopsis homologue of Set1, is required for the expression of the flowering repressors FLC and MADS-box genes of the MAF family. ATXR7 is also involved in the control of seed dormancy and germination.
Protein Domain
Name: Urease, alpha subunit
Type: Family
Description: Urease (urea amidohydrolase) is a nickel-binding enzyme that catalyses the hydrolysis of urea to form ammonia and carbamate. It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease is a hexamer of identical chains, but the subunit composition of urease from different sources varies; in bacteria it consists of either two or three different subunits (alpha, beta and gamma). Urease binds two nickel ions per subunit; four histidines, an aspartate and a carbamylated lysine serve as ligands to these metals; an additional histidine is involved in the catalytic mechanism. The urease domain forms an (αβ)8 barrel structure with structural similarity to other metal-dependent hydrolases, such as adenosine and AMP deaminase and phosphotriesterase. Urease is unique among nickel metalloenzymes in that it catalyses a hydrolysis rather than a redox reaction. The orthologous protein is known as the alpha subunit (ureC) in most other bacteria. In Helicobacter pylori, the gamma and beta domains are fused and called the alpha subunit. The catalytic subunit (called beta or B) has the same organisation as the Klebsiella alpha subunit. Jack bean (Canavalia ensiformis) urease has a fused gamma-beta-alpha organisation. This entry describes the urease alpha subunit UreC (designated beta or B chain, UreB in Helicobacter species).
Protein Domain
Name: Basic leucine zipper 8/43
Type: Family
Description: This entry represents a group of plant basic leucine zipper proteins (bZIPs), including AtbZIP8 and AtbZIP43. AtbZIP43 may act as a positive regulator of BHLH109, which is associated with somatic embryogenesis (SE) induction. The basic (region) leucine zippers (bZIPs) are evolutionarily conserved transcription factors in eukaryotic organisms. In plants bZIPs regulate processes including pathogen defence, light and stress signalling, seed maturation and flower development. The bZIP domain consists of a basic DNA-binding region and the adjacent ZIP domain. The ZIP domain consists of heptad repeats of leucine (L) or related hydrophobic amino acids. The DNA-binding region is a basic region of ~16 amino acid residues containing a nuclear localization signal followed by an invariant N-x7-R/K motif that contacts the DNA. The heptad repeat of leucines or other bulky hydrophobic amino acids is positioned exactly nine amino acids towards the C terminus, creating an amphipathic helix. To bind DNA, two subunits adhere via interactions between the hydrophobic sides of their helices, which creates a superimposing coiled-coil structure. The ability to form homo- and heterodimers is influenced by the electrostatic attraction and repulsion of polar residues flanking the hydrophobic interaction surface of the helices. As bZIPs generally function as dimers, heterodimerisation results in an enormous regulatory flexibility.
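The N-x7-R/K notation used in the description reads as: an asparagine, any seven residues, then an arginine or a lysine. As a minimal sketch, assuming that reading of the notation, it can be written as a regular expression and scanned over a sequence. The function name and the test peptide are hypothetical and invented for illustration; a hit indicates only that the pattern occurs, not that the region is a functional bZIP basic region.

```python
import re

# N-x7-R/K: asparagine, exactly seven residues of any kind, then Arg or Lys.
BASIC_MOTIF = re.compile(r"N.{7}[RK]")


def find_basic_motif(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in BASIC_MOTIF.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical peptide built only to exercise the pattern.
    toy = "AAA" + "N" + "A" * 7 + "R" + "LL"
    print(find_basic_motif(toy))
```

Note that `finditer` reports non-overlapping matches only, so closely spaced candidate motifs may need a lookahead-based pattern instead.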
Protein Domain
Name: Chaperone DnaK
Type: Family
Description: Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. In prokaryotes, the GrpE protein is a co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle. Members of this family are the chaperone DnaK, of the DnaK-DnaJ-GrpE chaperone system. All members of the seed alignment were taken from completely sequenced bacterial or archaeal genomes and (except for the Mycoplasma sequence) found clustered with other genes of this system. This entry excludes DnaK homologues that are not DnaK itself, such as the heat shock cognate protein HscA. However, it is not designed to distinguish among DnaK paralogs in eukaryotes. Note that a number of DnaK genes have shadow ORFs in the same reverse (relative to dnaK) reading frame, a few of which have been assigned glutamate dehydrogenase activity. The significance of this observation is unclear; the lengths of such shadow ORFs are highly variable, as if the presumptive protein product is not conserved.
Protein Domain
Name: Expansin, cellulose-binding-like domain
Type: Domain
Description: Expansins are secreted proteins of 25 to 27 kDa that were isolated first from young cucumber seedlings and subsequently from other plant tissues. Expression of expansin genes correlates with growth of cells. Increase in expansin content also occurs during fruit ripening. Expansins act on the cell wall to promote its extensibility. The model for their mechanism of action postulates that expansins break non-covalent bonds between cell-wall polysaccharides, thereby permitting pressure-dependent expansion of the cell. Group-I pollen allergens of grasses have limited but significant sequence homology to expansin. These proteins are the main causative agent of hay fever and seasonal asthma induced by grass pollen. Extracts containing group-I allergens are also active in loosening cell walls. Group-I pollen allergens and related proteins in vegetative tissues have been classified as β-expansins, whereas the earlier discovered expansins are now referred to as α-expansins. Expansin-like proteins are also found in some fungi. In Trichoderma reesei an expansin-like protein (Cel12A) acts as a glycoside hydrolase on xyloglucan and 1,4-β-glucan. These hydrolytic actions differ from the action of expansins, which induce wall extension by a non-hydrolytic mechanism. Expansins consist of two domains closely packed and aligned so as to form a long, shallow groove with potential to bind a glycan backbone of ~10 sugar residues. The N-terminal cysteine-rich domain has distant sequence similarity to family-45 endoglucanases (EG45-like domain). The ~90-residue C-terminal domain may function as a cellulose-binding domain (CBD). It is composed of eight β-strands assembled into two antiparallel β-sheets. The two β-sheets are at slight angles to each other and form a β-sandwich similar to the Ig fold. This entry represents the expansin C-terminal CBD-like domain.
Protein Domain
Name: QUIRKY-like, fourth C2 domain
Type: Domain
Description: This entry represents the fourth C2 domain, referred to as C2D, from QUIRKY and related proteins. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This entry includes a group of plant proteins, such as QUIRKY and FTIP1/3/4/7 from Arabidopsis. These are Multiple C2 domain and Transmembrane region Proteins (MCTPs), which are involved in Ca2+ signalling at the membrane. Plant MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. MCTPs are one of four protein classes that are anchored to membranes via a transmembrane region, the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs, and they are unique in that they bind Ca2+ but not phospholipids. QUIRKY may contribute to plant organogenesis mediated by the receptor-like kinase STRUBBELIG and may play a role in Ca2+-dependent signaling and membrane trafficking. It has been shown to be required for the appropriate spatial expression of several epidermal cell fate regulators. FTIP1 is involved in the export of FT from the phloem companion cells to the sieve elements through the plasmodesmata, and it regulates flowering time under long days. FTIP3/4 play an essential role in mediating proliferation and differentiation of shoot stem cells. FTIP7 promotes nuclear translocation of the transcription factor OSH1 and reduces auxin levels at a late stage of anther development, after meiosis of microspore mother cells. It is necessary for normal anther dehiscence and seed setting.
Protein Domain
Name: QUIRKY-like, second C2 domain
Type: Domain
Description: This entry represents the second C2 domain, referred to as C2B, from QUIRKY and related proteins. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This entry includes a group of plant proteins, such as QUIRKY and FTIP1/3/4/7 from Arabidopsis. These are Multiple C2 domain and Transmembrane region Proteins (MCTPs), which are involved in Ca2+ signalling at the membrane. Plant MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. MCTPs are one of four protein classes that are anchored to membranes via a transmembrane region, the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs, and they are unique in that they bind Ca2+ but not phospholipids. QUIRKY may contribute to plant organogenesis mediated by the receptor-like kinase STRUBBELIG and may play a role in Ca2+-dependent signaling and membrane trafficking. It has been shown to be required for the appropriate spatial expression of several epidermal cell fate regulators. FTIP1 is involved in the export of FT from the phloem companion cells to the sieve elements through the plasmodesmata, and it regulates flowering time under long days. FTIP3/4 play an essential role in mediating proliferation and differentiation of shoot stem cells. FTIP7 promotes nuclear translocation of the transcription factor OSH1 and reduces auxin levels at a late stage of anther development, after meiosis of microspore mother cells. It is necessary for normal anther dehiscence and seed setting.
Protein Domain
Name: QUIRKY-like, third C2 domain
Type: Domain
Description: This entry represents the third C2 domain, referred to as C2C, from QUIRKY and related proteins. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions. This entry includes a group of plant proteins, such as QUIRKY and FTIP1/3/4/7 from Arabidopsis. These are Multiple C2 domain and Transmembrane region Proteins (MCTPs), which are involved in Ca2+ signalling at the membrane. Plant MCTPs are composed of a variable N-terminal sequence, four C2 domains, two transmembrane regions (TMRs), and a short C-terminal sequence. MCTPs are one of four protein classes that are anchored to membranes via a transmembrane region, the others being synaptotagmins, extended synaptotagmins, and ferlins. MCTPs are the only membrane-bound C2 domain proteins that contain two functional TMRs, and they are unique in that they bind Ca2+ but not phospholipids. QUIRKY may contribute to plant organogenesis mediated by the receptor-like kinase STRUBBELIG and may play a role in Ca2+-dependent signaling and membrane trafficking. It has been shown to be required for the appropriate spatial expression of several epidermal cell fate regulators. FTIP1 is involved in the export of FT from the phloem companion cells to the sieve elements through the plasmodesmata, and it regulates flowering time under long days. FTIP3/4 play an essential role in mediating proliferation and differentiation of shoot stem cells. FTIP7 promotes nuclear translocation of the transcription factor OSH1 and reduces auxin levels at a late stage of anther development, after meiosis of microspore mother cells. It is necessary for normal anther dehiscence and seed setting.
Protein Domain
Name: Protein translocase subunit SecD
Type: Family
Description: Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome. The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. Together with SecY and SecG, SecE forms a multimeric channel through which preproteins are translocated, using both proton motive forces and ATP-driven secretion. The latter is mediated by SecA. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices. This entry describes the SecD family of transport proteins, which are parts of the Sec protein translocase complex. Members of this family are highly variable in length immediately after the well-conserved motif LGLGLXGG at the amino-terminal end of this model. Archaeal homologues are not included in the seed. SecD from Mycobacterium tuberculosis has a long Pro-rich insert. SecD interacts with the SecYEG preprotein conducting channel. SecDF uses the proton motive force (PMF) to complete protein translocation after the ATP-dependent function of SecA.
Protein Domain
Name: Zf-FLZ domain
Type: Domain
Description: Zinc fingers are a ubiquitous class of protein domain with considerable variation in structure and function. The FCS-type zinc finger is a highly diverged group of C2-C2 zinc finger which is present in animals, prokaryotes and viruses, but not in plants. It is named after the conserved phenylalanine and serine residues associated with the third cysteine. The FCS-type zinc finger is a structurally diverse family which accommodates both nucleic acid-protein and protein-protein interaction zinc fingers. The FCS-Like Zinc finger (FLZ) domain is a plant-specific domain found in all taxa except algae. FLZ domain-containing proteins are bryophytic in origin and this protein family is expanded in higher plants. Although the molecular functions of the FLZ protein family members in general are not well understood, many of the members are implicated in plant growth and development, stress mitigation, sugar signaling and senescence. The FLZ-type zinc finger is likely to be involved in protein-protein interaction [ , , ]. The FLZ-type zinc finger is predicted to form an α-β-α secondary structure composed of an N-terminal short α-helix, a β-hairpin followed by a longer C-terminal α-helix. Four highly conserved cysteine residues in the FLZ-type zinc finger are believed to bind zinc in a tetrahedral coordination [ , ]. Some proteins known to contain a FLZ-type zinc finger are listed below [ ]: Arabidopsis thaliana MEDIATOR OF ABA-REGULATED DORMANCY 1 (MARD1) or FLZ9, involved in abscisic acid (ABA)-mediated seed dormancy and induced during senescence. Arabidopsis thaliana INCREASED RESISTANCE TO MYZUS PERSICAE (IRM1) or FLZ4; constitutive overexpression of IRM1 results in mechanical barriers that make it difficult for M. persicae to reach the phloem and subsequently reduces its population size. Wheat salt-related hypothetical protein (TaSRHP); overexpression of TaSRHP results in enhanced resistance to salt and drought stress.
Protein Domain
Name: Papain-like cysteine endopeptidase
Type: Domain
Description: This entry represents homologues of papain, from peptidase family C1 subfamily A, including the mammalian CPs (cathepsins F, H, L, K, O, S, V, X and W) [ ]. Cathepsins B and C, which differ in their activation peptides, are not included here []. Papain is an endopeptidase with specific substrate preferences, primarily for bulky hydrophobic or aromatic residues at the S2 subsite, a hydrophobic pocket in papain that accommodates the P2 sidechain of the substrate (the second residue away from the scissile bond). Most members of the papain subfamily are endopeptidases. Some exceptions to this rule can be explained by specific details of the catalytic domains like the occluding loop in cathepsin B which confers an additional peptidyl-dipeptidase activity and the mini-chain of cathepsin H resulting in an N-terminal exopeptidase activity []. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds. Parasitic CPs act extracellularly to help invade tissues and cells, to hatch or to evade the host immune system [ ]. Mammalian CPs are primarily lysosomal enzymes with the exception of cathepsin W, which is retained in the endoplasmic reticulum []. They are responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. In addition to its inhibitory role, the propeptide is required for proper folding of the newly synthesized enzyme and its stabilization in denaturing pH conditions. Residues within the propeptide region also play a role in the transport of the proenzyme to lysosomes or acidified vesicles [ ]. Also included in this entry are proteins classified as non-peptidase homologues, which lack peptidase activity because active site residues are missing.
Protein Domain
Name: RelB antitoxin/Antitoxin DinJ
Type: Family
Description: Plasmids may be maintained stably in bacterial populations through the action of addiction modules, in which a toxin and antidote are encoded in a cassette on the plasmid. In any daughter cell that lacks the plasmid, the toxin persists and is lethal after the antidote protein is depleted. Toxin/antitoxin pairs are also found on main chromosomes, and likely represent selfish DNA. Sequences in the seed for this alignment all were found adjacent to toxin genes. Several toxin/antitoxin pairs may occur in a single species. RelE and RelB form a toxin-antitoxin system; RelE cleaves mRNA during translation on the ribosome [ , , ]. RelB binds and inhibits RelE, and it regulates transcription by operator binding and conditional cooperativity controlled by RelE. RelE and RelB form a V-shaped heterotetrameric complex which has a ribbon-helix-helix (RHH) dimerization domain at the apex []. DinJ is an antitoxin component of a toxin-antitoxin (TA) module. It is a labile antitoxin that counteracts the effect of the YafQ toxin [ ]. It forms a heterotetrameric complex with YafQ and the structure of this complex revealed that the N-terminal region of DinJ folds into a ribbon-helix-helix motif that dimerises for DNA recognition, and the C-terminal portion of each DinJ wraps around a YafQ molecule []. Together, they bind their own promoter, and by analogy to other TA modules probably repress its expression. Cell death governed by the mazEF and dinJ-yafQ TA modules seems to play a role in biofilm formation [, , ].
Protein Domain
Name: Protein ROH1-like
Type: Family
Description: ROH1 is an interactor of the exocyst subunit Exo70A1, and has been shown to be required for seed coat mucilage deposition [ ].
Protein Domain
Name: AP2/ERF domain
Type: Domain
Description: Ethylene is an endogenous plant hormone that influences many aspects of plant growth and development. Some defense-related genes that are induced by ethylene contain a cis-regulatory element known as the Ethylene-Responsive Element (ERE) [ ]. Sequence analysis on various ERE regions has identified a short motif rich in G/C nucleotides, the GCC-box, essential for the response to ethylene. This short motif is recognised by a family of transcription factors, the ERE binding factors (ERF) []. ERF proteins contain a domain of around 60 amino acids which is also found in the APETALA2 (AP2) protein [ ]. This AP2/ERF domain has been shown in various proteins to be necessary and sufficient to bind the GCC-box [ ]. The structure of the AP2/ERF domain in complex with the target DNA has been solved [ ]. The structure resembles that of bacteriophage integrases and the methyl-CpG-binding domain (MBD): a three-stranded β-sheet and an α-helix almost parallel to the β-sheet. It contacts DNA via Arg and Trp residues located in the β-sheet. Some proteins known to contain an AP2/ERF domain include: Arabidopsis thaliana ERF1 to 6. Tobacco ethylene-responsive element-binding proteins (EREBPs), homologues of ERF proteins. Arabidopsis thaliana AP2 protein. It regulates meristem identity, floral organ specification and seed coat development. Arabidopsis thaliana C-repeat/dehydration-responsive element (DRE) binding factor 1 (CBF1 or DREB1) and DREB2. They bind a GCC-box-like element found in the dehydration-responsive element. Binding to this element mediates cold-inducible transcription. Arabidopsis thaliana and maize abscisic acid (ABA)-insensitive 4 (ABI4) proteins. They bind to a GCC-box-like element found in ABA-responsive genes. Octadecanoid-derivative responsive Catharanthus AP2-domain (ORCA2) protein. It binds a GCC-box-like element in the jasmonate-responsive element of the Str promoter. Tomato Pto-interacting proteins 4 to 6 (Pti4 to Pti6).
Pti5 and 6 bind a GCC-box-like element in regulatory regions of various pathogenesis-related (PR) genes. Trichodesmium erythraeum, Tetrahymena thermophila, Enterobacteria phage RB49 and bacteriophage Felix 01 HNH endonucleases. HNH endonucleases are homing endonucleases that move extensively via lateral gene transfer [ ]. This entry represents the AP2/ERF domain.
Protein Domain
Name: AP2/ERF domain superfamily
Type: Homologous_superfamily
Description: Ethylene is an endogenous plant hormone that influences many aspects of plant growth and development. Some defense-related genes that are induced by ethylene contain a cis-regulatory element known as the Ethylene-Responsive Element (ERE) [ ]. Sequence analysis on various ERE regions has identified a short motif rich in G/C nucleotides, the GCC-box, essential for the response to ethylene. This short motif is recognised by a family of transcription factors, the ERE binding factors (ERF) []. ERF proteins contain a domain of around 60 amino acids which is also found in the APETALA2 (AP2) protein [ ]. This AP2/ERF domain has been shown in various proteins to be necessary and sufficient to bind the GCC-box []. The structure of the AP2/ERF domain in complex with the target DNA has been solved [ ]. The structure resembles that of bacteriophage integrases and the methyl-CpG-binding domain (MBD): a three-stranded β-sheet and an α-helix almost parallel to the β-sheet. It contacts DNA via Arg and Trp residues located in the β-sheet. Some proteins known to contain an AP2/ERF domain include: Arabidopsis thaliana ERF1 to 6. Tobacco ethylene-responsive element-binding proteins (EREBPs), homologues of ERF proteins. Arabidopsis thaliana AP2 protein. It regulates meristem identity, floral organ specification and seed coat development. Arabidopsis thaliana C-repeat/dehydration-responsive element (DRE) binding factor 1 (CBF1 or DREB1) and DREB2. They bind a GCC-box-like element found in the dehydration-responsive element. Binding to this element mediates cold-inducible transcription. Arabidopsis thaliana and maize abscisic acid (ABA)-insensitive 4 (ABI4) proteins. They bind to a GCC-box-like element found in ABA-responsive genes. Octadecanoid-derivative responsive Catharanthus AP2-domain (ORCA2) protein. It binds a GCC-box-like element in the jasmonate-responsive element of the Str promoter. Tomato Pto-interacting proteins 4 to 6 (Pti4 to Pti6).
Pti5 and 6 bind a GCC-box-like element in regulatory regions of various pathogenesis-related (PR) genes. Trichodesmium erythraeum, Tetrahymena thermophila, Enterobacteria phage RB49 and bacteriophage Felix 01 HNH endonucleases. HNH endonucleases are homing endonucleases that move extensively via lateral gene transfer [ ]. This entry represents the AP2/ERF domain superfamily.
Protein Domain
Name: Sialic acid O-acyltransferase NeuD-like
Type: Family
Description: This entry includes a group of acetyltransferases, such as NeuD sialic acid O-acetyltransferase enzymes from Escherichia coli and Streptococcus agalactiae (group B strep) [ , , ], UDP-N-acetylbacillosamine N-acetyltransferase pglD from Campylobacter jejuni subsp. jejuni [, ] and GDP-perosamine N-acetyltransferase perB from Escherichia coli O157:H7 []. This group is composed of mostly uncharacterized proteins containing an N-terminal helical subdomain followed by an LbH domain. The alignment contains 6 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity [ ]. The neuD gene is often observed in close proximity to the neuABC genes for the biosynthesis of CMP-N-acetylneuraminic acid (CMP-sialic acid), and NeuD sequences from these organisms were used to construct the seed for this model. Nevertheless, there are numerous instances of sequences identified by this model which are observed in a different genomic context (although almost universally in exopolysaccharide biosynthesis-related loci), as well as in genomes for which the biosynthesis of sialic acid (SA) has not been demonstrated. Even in the cases where the association with SA biosynthesis is strong, it is unclear in the literature whether the biological substrate is SA itself, CMP-SA, or a polymer containing SA. Similarly, it is unclear to what extent the enzyme has a preference for acetylation at the 7, 8 or 9 positions. In the absence of evidence of association with SA, members of this family may be involved with the acetylation of differing sugar substrates, or possibly the delivery of alternative acyl groups. The closest related sequences to this family (and those used to root the phylogenetic tree constructed to create this model) are believed to be succinyltransferases involved in lysine biosynthesis.
Protein Domain
Name: Transcription factor GTE1
Type: Family
Description: This entry represents the transcription factor GTE1 from plants. Arabidopsis GTE1 is a transcription activator that plays a role in the promotion of seed germination by both negatively and positively regulating the abscisic acid (ABA) and phytochrome A (phyA) transduction pathways, respectively [ ].
Protein Domain
Name: O-FUCOSYLTRANSFERASE1-like
Type: Family
Description: This entry represents a group of putative plant O-fucosyltransferases (POFTs), including O-FUCOSYLTRANSFERASE1 (AtOFT1, At3g05320) from Arabidopsis. Pollen tubes of the oft1 mutant are ineffective at penetrating the stigma-style interface, leading to a drastic reduction in seed set and a nearly 2000-fold reduction in pollen transmission [ , ].
Protein Domain
Name: Zinc finger protein GIS3/ZFP5/ZFP6
Type: Family
Description: This entry represents a group of putative plant transcription factors, including GIS3/ZFP5/ZFP6 from Arabidopsis. GIS3/ZFP5/ZFP6 have been shown to regulate trichome initiation [ , , ]. ZFP5 also regulates root hair initiation and morphogenesis [], while ZFP6 also acts as a negative regulator of abscisic acid (ABA) signaling during germination and early seedling development [].
Protein Domain
Name: Dual specificity protein phosphatase PHS1
Type: Family
Description: PHS1 is a dual-specificity phosphatase from plants. It specifically interacts with two MAPKs, MPK12 and MPK18. PHS1-MPK18 signalling probably regulates cortical microtubule functions [ , ]. It is also a negative regulator of abscisic acid (ABA) signalling []. The plant hormone ABA controls numerous processes, such as dormancy and germination of seeds, senescence and resistance to abiotic stresses.
Protein Domain
Name: CRISPR-associated protein Cas5p
Type: Family
Description: The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [ ]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [ , , ]. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [ ]. Members of this protein family are cas, or CRISPR-associated, proteins.
The two sequences in the alignment seed are found within cas gene clusters that are adjacent to CRISPR DNA repeats in two members of the order Bacteroidales, Porphyromonas gingivalis W83 and Bacteroides forsythus ATCC 43037. This cas protein family is unique to the Pging (Porphyromonas gingivalis) subtype, but shows some sequence similarity to genes of the Cas5 type (see ).
Protein Domain
Name: Basic helix-loop-helix (bHLH) transcription factors ALC-like, plant
Type: Family
Description: This entry represents a group of plant basic helix-loop-helix (bHLH) transcription factors, including ALC, PIFs (phy-interacting factors) and SPATULA from Arabidopsis [ , ]. ALC enables cell separation in fruit dehiscence, the process in which the fruit opens and releases the seed []. PIF1 regulates chlorophyll biosynthesis to optimise the greening process []. SPATULA plays a role in floral organogenesis [].
Protein Domain
Name: Phospholipase A1 PLIP1/2/3, chloroplastic
Type: Family
Description: This entry includes a group of plant glycerolipid A1 lipases, including PLIP1/2/3 from Arabidopsis. PLIP1 is a plastid phospholipase A1 that releases polyunsaturated fatty acids from chloroplast phosphatidylglycerol, leading to the export of the fatty acids to the ER for seed oil biosynthesis [ ]. PLIP2/3 are also present in the chloroplasts. They respond to ABA and are involved in jasmonic acid biosynthesis [].
Protein Domain
Name: tRNA pseudouridine synthase II, TruB
Type: Family
Description: Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are five distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an α+β structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are [ , ]: (1) Pseudouridine synthase I, TruA; (2) Pseudouridine synthase II, TruB, which contains an additional C-terminal PUA domain; (3) Pseudouridine synthase RsuA (RluB, RluE and RluF are also part of this family); (4) Pseudouridine synthase RluA (TruC, RluC and RluD belong to this family); (5) Pseudouridine synthase TruD, which has a natural circular permutation in the catalytic domain, as well as an insertion of a family-specific α+β subdomain. TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA [ ]. This model is built on a seed alignment of bacterial proteins only. Saccharomyces cerevisiae protein YNL292w (Pus4) has been shown to be the pseudouridine 55 synthase of both cytosolic and mitochondrial compartments, active at no other position on tRNA and the only enzyme active at that position in the species.
A distinct yeast protein, YLR175w (centromere/microtubule-binding protein CBF5), is an rRNA pseudouridine synthase, and the archaeal set is much more similar to CBF5 than to Pus4. It is unclear whether the archaeal proteins found by this model are tRNA pseudouridine 55 synthases like TruB, rRNA pseudouridine synthases like CBF5, or (as suggested by the absence of paralogs in the Archaea) both. CBF5 likely has additional, eukaryotic-specific functions.
Protein Domain
Name: L-aspartate dehydrogenase, archaeal
Type: Family
Description: This entry represents L-aspartate dehydrogenase, as shown for the NADP-dependent enzyme TM_1643 of Thermotoga maritima. Members lack homology to NadB, the aspartate oxidase ( ) of most mesophilic bacteria (described by ), which this enzyme replaces in the generation of oxaloacetate from aspartate for the NAD biosynthetic pathway. All members of the seed alignment are found adjacent to other genes of NAD biosynthesis, although other uses of L-aspartate dehydrogenase may occur.
Protein Domain
Name: Myb family transcription factor HRS1-like
Type: Family
Description: This entry represents a group of plant Myb family transcription factors, including HHO1-6, HRS1 and EFM from Arabidopsis. HRS1 represses primary root development in response to phosphate deficiency conditions, only when nitrate is present [ ]. It is also required for suppressing abscisic acid (ABA) signalling in germinating embryo axis, which promotes the timely germination of seeds []. EFM functions as a flowering repressor, directly repressing FT expression in a dosage-dependent manner in the leaf vasculature [].
Protein Domain
Name: Nif11-like leader peptide
Type: Domain
Description: This entry describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment (most sequences scoring better than 54 bits to the HMMER 2 model) tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus [ , ].
Protein Domain
Name: Phytol/farnesol kinase
Type: Family
Description: This entry includes a group of kinases from plants and bacteria, including phytol kinase 1 and farnesol kinase from Arabidopsis. Phytol kinase 1, also known as Vte5 (Vitamin E pathway gene 5, At5g04490), catalyzes the conversion of phytol to phytol monophosphate (PMP) in the presence of CTP or UTP. It is involved in seed tocopherol biosynthesis [ ]. Farnesol kinase, also known as FOLK (At5g58560), can phosphorylate farnesol using an NTP donor. It is involved in negative regulation of abscisic acid (ABA) signaling [].
Protein Domain
Name: 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase, gammaproteobacteria
Type: Family
Description: 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase (DapD) is involved in the succinylated branch of the "lysine biosynthesis via diaminopimelate (DAP)" pathway ( ). This entry represents the gammaproteobacteria family of DapD sequences, which is the most closely related to the actinobacterial DapD family represented by . All of the genes evaluated for the seed of this model are found in genomes where the downstream desuccinylase is present, but known DapD genes are absent. Additionally, many of the genes identified by this model are found proximal to genes involved in this lysine biosynthesis pathway.
Protein Domain
Name: Putative C-S lyase
Type: Family
Description: Members of this subfamily are probable C-S lyases from a family of pyridoxal phosphate-dependent enzymes that tend to be (mis)annotated as probable aminotransferases. One member is PatB of Bacillus subtilis, a proven C-S-lyase. Another is the virulence factor cystalysin from Treponema denticola, whose hemolysin activity may stem from H2S production. Members of the seed alignment occur next to examples of the enzyme 5-histidylcysteine sulfoxide synthase, from ovothiol A biosynthesis, and would be expected to perform a C-S cleavage of 5-histidylcysteine sulfoxide to leave 1-methyl-4-mercaptohistidine (ovothiol A) [ , , ].
Protein Domain
Name: Zeaxanthin epoxidase
Type: Family
Description: This entry represents the enzyme zeaxanthin epoxidase ( ), which is involved in the epoxidation of zeaxanthin as part of the biosynthesis of the plant hormone abscisic acid (ABA). ABA is a sesquiterpenoid (15-carbon) which is partially produced via the mevalonic pathway in chloroplasts and other plastids (therefore its biosynthesis primarily occurs in the leaves). The production of ABA is accentuated by stresses such as water loss and freezing temperatures. The enzyme zeaxanthin epoxidase converts zeaxanthin into antheraxanthin and subsequently into violaxanthin. This enzyme also acts on beta-cryptoxanthin. Zeaxanthin epoxidase plays an important role in resistance to stresses, seed development and dormancy [ ].
Protein Domain
Name: Histone-lysine N-methyltransferase SETD1A/B-like, SET domain
Type: Domain
Description: In animals, SETD1A/B are histone methyltransferases that produce mono-, di-, and trimethylated histone H3 at 'Lys-4'. However, if the 'Lys-9' residue is already methylated, 'Lys-4' will not be. The 'Lys-4' methylation is a tag for epigenetic transcriptional activation [ , ]. The animal COMPASS complex is composed of at least the catalytic subunit (SETD1A or SETD1B), WDR5, WDR82, RBBP5, ASH2L/ASH2, CXXC1/CFP1, HCFC1 and DPY30 []. ATXR7, the Arabidopsis homologue of Set1, is required for the expression of the flowering repressors FLC and MADS-box genes of the MAF family [, ]. ATXR7 is also involved in the control of seed dormancy and germination []. This entry represents the SET domain found in SETD1A/B and its homologues.
Protein Domain
Name: Peroxisomal adenine nucleotide carrier 1/2
Type: Family
Description: This entry represents a group of peroxisomal adenine nucleotide transporters, including the Peroxisomal adenine nucleotide carrier 1/2 from Arabidopsis thaliana (AtPNC1/2) and the Peroxisomal adenine nucleotide transporter 1 from Saccharomyces cerevisiae (ANT1). Members of this family are found in plants and some fungal species. AtPNC1/2 catalyse the counter-exchange of ATP with ADP or AMP. They are required for the beta-oxidation reactions involved in auxin biosynthesis and for the conversion of seed-reserved triacylglycerols into sucrose that is necessary for growth before the onset of photosynthesis [ , ]. In yeast, transport of ATP into the peroxisome is required for beta-oxidation of medium-chain fatty acids, and thus for growth on medium-chain fatty acids, pH gradient formation in peroxisomes and normal peroxisome proliferation [ , ].
Protein Domain
Name: Plant bZIP transcription factors
Type: Family
Description: This family is composed of a group of plant bZIP transcription factors with similarity to OsbZIP46, which regulates abscisic acid (ABA) signalling-mediated drought tolerance in rice [ , ]. Plant bZIPs are involved in developmental and physiological processes in response to stimuli/stresses such as light, hormones, and temperature changes. bZIP factors act in networks of homo- and heterodimers in the regulation of a diverse set of cellular processes []. This entry also includes ABI5 from Arabidopsis. ABI5 is a transcription factor that participates in ABA-regulated gene expression during seed development and the subsequent vegetative stage by acting as the major mediator of ABA repression of growth [ , ]. It is also involved in the sugar signalling response in plants [].
Protein Domain
Name: Urease active site
Type: Active_site
Description: Urease (urea amidohydrolase, ) is a nickel-binding enzyme that catalyses the hydrolysis of urea to carbon dioxide and ammonia []. Historically, it was the first enzyme to be crystallized (in 1926). It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease is a hexamer of identical chains. In bacteria [ ], it consists of either two or three different subunits (alpha, beta and gamma). Urease binds two nickel ions per subunit; four histidines, an aspartate and a carbamated lysine serve as ligands to these metals; an additional histidine is involved in the catalytic mechanism []. The urease domain forms an (αβ)8 barrel structure with structural similarity to other metal-dependent hydrolases, such as adenosine and AMP deaminase and phosphotriesterase. This entry represents a conserved region that contains the active site histidine.
Protein Domain
Name: Urease nickel binding site
Type: Binding_site
Description: Urease (urea amidohydrolase, ) is a nickel-binding enzyme that catalyses the hydrolysis of urea to carbon dioxide and ammonia []. Historically, it was the first enzyme to be crystallized (in 1926). It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease is a hexamer of identical chains. In bacteria [ ], it consists of either two or three different subunits (alpha, beta and gamma). Urease binds two nickel ions per subunit; four histidines, an aspartate and a carbamated lysine serve as ligands to these metals; an additional histidine is involved in the catalytic mechanism []. The urease domain forms an (αβ)8 barrel structure with structural similarity to other metal-dependent hydrolases, such as adenosine and AMP deaminase and phosphotriesterase. This entry represents a conserved region that contains two histidines that bind one of the nickel ions.
Protein Domain
Name: Transcription factor IND-like
Type: Family
Description: This entry represents a group of bHLH transcription factors from plants, including IND, HEC1/2/3 and RHD6 from Arabidopsis. IND is required for seed dispersal [ ], while HEC1/2/3 are required for female reproductive tract development and fertility []. IND interacts with another bHLH transcription factor, SPATULA (SPT), and together they regulate genes involved in modulating auxin transport [ ]. RHD6 is a transcription factor that is specifically required for the development of root hairs. It integrates a jasmonate (JA) signaling pathway that stimulates root hair growth []. This entry also includes LF (LATE FLOWERING) and LAX PANICLE 1 (LAX1) from rice. LF regulates flowering time [], while LAX1 is a transcription factor that may regulate organogenesis in postembryonic development. It is involved in the regulation of shoot branching by controlling axillary meristem initiation [].
Protein Domain
Name: Gurmarin/antifungal peptide
Type: Homologous_superfamily
Description: Gurmarin is a sweet taste-suppressing polypeptide from the tree Gymnema sylvestre (Gurmar), which originates from India. Gurmarin acts to selectively inhibit the neural response to sweet stimuli in rats. The crystal structure of gurmarin reveals a disulphide-bound fold containing an antiparallel β-hairpin [ ]. The aromatic residues that form a hydrophobic cluster in gurmarin are thought to be a possible functional site for the interaction of gurmarin with the taste receptors []. Gurmarin is structurally related to the antifungal peptide PAFP-S from the seeds of the pokeweed Phytolacca americana, and to the antifungal peptide Alo-3 from the insect Acrocinus longimanus. PAFP-S exhibits a broad spectrum of antifungal activity, including inhibition of saprophytic fungi and some plant pathogens. The amphiphilic surface of PAFP-S is thought to be the main functional site for interaction with biomembranes [ ]. Insect peptides are key elements of innate immunity against bacteria and fungi. Alo-1, Alo-2 and Alo-3 show high sequence identity, and are active against Candida species [].
Protein Domain
Name: Ascorbate oxidase, second cupredoxin domain
Type: Domain
Description: Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear [ ]; some studies suggest that it may play a crucial role in cell elongation and enlargement []. In pumpkin, its expression is increased during callus growth, fruit development and seedling elongation []. MCOs couple oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper centre. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of Type 2 and Type 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre [ ]. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.
Protein Domain
Name: GDSL esterase/lipase GLIP1-5/GLL25
Type: Family
Description: GDSL-type esterases/lipases are hydrolytic enzymes with multifunctional properties, such as broad substrate specificity, regiospecificity, and stereoselectivity [ ]. They have been identified in microbes and many plants. They have diverse physical functions such as affecting the germination rate and early growth of seedlings subjected to high concentrations of glucose, or being involved in biotic stress responses []. This entry represents a group of plant GDSL esterases/lipases, including GLIP1-6 and GLL23/25 from Arabidopsis. GDSL LIPASE-LIKE 1 (GLIP1) modulates systemic immunity through the regulation of ethylene signaling components [ ]. GLIP1 and GLIP3 have been shown to contribute to plant resistance to Botrytis cinerea []. GLIP2 is involved in the resistance to Erwinia carotovora via the negative regulation of auxin signaling []. GLL23 and GLL25 (also known as GOLD36) have lost the conserved active site 'GDSL' motif and have no lipase activity. GLL23 is involved in the control of the PYK10 complex size and possibly substrate specificity [], while GLL25 is involved in organisation of the endomembrane system and is required for endoplasmic reticulum morphology and organelle distribution [ ].
Protein Domain
Name: Aspartic peptidase A1 family
Type: Family
Description: Peptidase family A1, also known as the pepsin family, contains peptidases with bilobed structures [ , ]. The two domains most probably evolved from the duplication of an ancestral gene encoding a primordial domain []. The active site is formed from an aspartic acid residue from each domain. Each aspartic acid occurs within a motif with the sequence D(T/S)G(T/S). Exceptionally, in the histo-aspartic peptidase from Plasmodium falciparum, one of the Asp residues is replaced by His [ ]. A third essential residue, Tyr or Phe, is found on the N-terminal domain only in a β-hairpin loop known as the "flap"; this residue is important for substrate binding, and most members of the family have a preference for a hydrophobic residue in the S1 substrate binding pocket. Most members of the family are active at acidic pH, but renin is unusually active at neutral pH. Family A1 peptidases are found predominantly in eukaryotes (but examples are known from bacteria [ , ]). Currently known eukaryotic aspartyl peptidases and homologues include the following: Vertebrate gastric pepsins A ( ), gastricsin ( , also known as pepsin C), chymosin ( ; formerly known as rennin), and cathepsin E ( ). Pepsin A is widely used in protein sequencing because of its limited and predictable specificity. Chymosin is used to clot milk for cheese making. Lysosomal cathepsin D ( ). Renin ( ), which functions in control of blood pressure by generating angiotensin I from angiotensinogen in the plasma. Memapsins 1 ( ; also known as BACE 2) and 2 ( ; also known as BACE) are membrane-bound and are able to perform one of the two cleavages (the beta-cleavage, hence they are also known as beta-secretases) in the beta-amyloid precursor to release the amyloid-beta peptide, which accumulates in the plaques of Alzheimer's disease patients. 
Fungal peptidases such as aspergillopepsin A ( ), candidapepsin ( ), mucorpepsin ( ; also known as Mucorrennin), endothiapepsin ( ), polyporopepsin ( ), and rhizopuspepsin ( ) are secreted for saprophytic protein digestion. Fungal saccharopepsin ( ) (proteinase A) (gene PEP4) is implicated in post-translational regulation of vacuolar hydrolases. Yeast barrierpepsin ( ) (gene BAR1) is a protease that cleaves alpha-factor and thus acts as an antagonist of the mating pheromone. Fission yeast Sxa1 may be involved in degrading or processing the mating pheromones [ ]. In plants, phytepsin ( ) degrades seed storage proteins and nepenthesin (EC 3.4.23.12) from a pitcher plant digests insect proteins. Also included are Aspartic proteinase 36 and Aspartic proteinase 39, which contribute to pollen and ovule development and have an important role in plant development in Arabidopsis [ ]. Plasmepsins ( and ) from Plasmodium species are important for the degradation of host haemoglobin. Non-peptidase homologues, in which one or more active site residues have been replaced, include mammalian pregnancy-associated glycoproteins, an allergen from a cockroach, and a xylanase inhibitor [ ].
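As an illustration of the D(T/S)G(T/S) active-site motif mentioned in this description, the following is a minimal Python sketch that scans a protein sequence for the pattern with a regular expression. The function name and the toy sequence are hypothetical and are not part of this database; a pattern match alone is not evidence of family membership, which is assigned with the database's own signatures.

```python
import re

# D(T/S)G(T/S) from the description, written as a regular expression.
A1_MOTIF = re.compile(r"D[TS]G[TS]")


def find_a1_motifs(sequence: str) -> list[int]:
    """Return the 0-based start position of every D(T/S)G(T/S) match."""
    return [match.start() for match in A1_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real peptidase.
    toy_sequence = "MKVDTGSSNLWAAADSGTTLL"
    print(find_a1_motifs(toy_sequence))  # [3, 14]
```

A typical A1 peptidase would be expected to show one hit in each lobe, since the description states that each domain contributes one aspartic acid.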
Protein Domain
Name: Ascorbate oxidase, first cupredoxin domain
Type: Domain
Description: Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear [ ]; some studies suggest that it may play a crucial role in cell elongation and enlargement []. In pumpkin, its expression is increased during callus growth, fruit development and seedling elongation []. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3 [ ].
Protein Domain
Name: Urease alpha subunit, C-terminal
Type: Domain
Description: Urease (urea amidohydrolase, ) is a nickel-binding enzyme that catalyses the hydrolysis of urea to form ammonia and carbamate [ ]. It is mainly found in plant seeds, microorganisms and invertebrates. In plants, urease is a hexamer of identical chains, but the subunit composition of urease from different sources varies []; in bacteria [] it consists of either two or three different subunits (alpha, beta and gamma). Urease binds two nickel ions per subunit; four histidines, an aspartate and a carbamated-lysine serve as ligands to these metals; an additional histidine is involved in the catalytic mechanism [ ]. The urease domain forms an (alpha beta)(8) barrel structure with structural similarity to other metal-dependent hydrolases, such as adenosine and AMP deaminase (see ) and phosphotriesterase (see ). Urease is unique among nickel metalloenzymes in that it catalyses a hydrolysis rather than a redox reaction. In Helicobacter pylori, the gamma and beta domains are fused and called the alpha subunit ( ). The catalytic subunit (called beta or B) has the same organisation as the Klebsiella alpha subunit. Jack bean (Canavalia ensiformis) urease has a fused gamma-beta-alpha organisation ( ). This entry describes the C-terminal domain of urease alpha subunit UreC (designated beta or UreB in Helicobacter species). Urease ( ) belongs to MEROPS peptidase family M38 (clan MJ).
Protein Domain
Name: Sirohaem synthase, N-terminal
Type: Domain
Description: Bacterial sulphur metabolism depends on the iron-containing porphinoid sirohaem. CysG is a multi-functional enzyme with S-adenosyl-L-methionine (SAM)-dependent bismethyltransferase, dehydrogenase and ferrochelatase activities. CysG synthesizes sirohaem from uroporphyrinogen III via reactions which encompass two branchpoint intermediates in tetrapyrrole biosynthesis, diverting flux first from protoporphyrin IX biosynthesis and then from cobalamin (vitamin B12) biosynthesis. CysG is a dimer. Its dimerisation region is 74 residues long, and holds the two structurally similar protomers together asymmetrically through a number of salt-bridges across complementary residues within the dimerisation region [ ]. CysG dimerisation produces a series of active sites, accounting for CysG's multi-functionality, catalysing four diverse reactions: two SAM-dependent methylations, NAD+-dependent tetrapyrrole dehydrogenation, and metal chelation. This group represents a subfamily of CysG N-terminal region-related sequences. All sequences in the seed alignment for this model are N-terminal regions of known or predicted sirohaem synthases. The C-terminal region of each is uroporphyrin-III C-methyltransferase ( ), which catalyses the first step committed to the biosynthesis of either sirohaem or cobalamin (vitamin B12) rather than protohaem (haem). Functionally these sequences complete the process of oxidation and iron insertion to yield sirohaem. Sirohaem is a cofactor for nitrite and sulphite reductases, so sirohaem synthase is CysG of cysteine biosynthesis in some organisms.
Protein Domain
Name: Ascorbate oxidase, third cupredoxin domain
Type: Domain
Description: Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear [ ]; some studies suggest that it may play a crucial role in cell elongation and enlargement []. In pumpkin, its expression is increased during callus growth, fruit development and seedling elongation []. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3 [ ].
Protein Domain
Name: Succinyldiaminopimelate transaminase, DapC
Type: Family
Description: Two lysine biosynthesis pathways evolved separately in organisms: the diaminopimelic acid (DAP) and aminoadipic acid (AAA) pathways. The DAP pathway synthesizes L-lysine from aspartate and pyruvate, and diaminopimelic acid is an intermediate. This pathway is utilised by most bacteria, some archaea, some fungi, some algae, and plants. The AAA pathway synthesizes L-lysine from alpha-ketoglutarate and acetyl coenzyme A (acetyl-CoA), and alpha-aminoadipic acid is an intermediate. This pathway is utilised by most fungi, some algae, the bacterium Thermus thermophilus, and probably some archaea, such as Sulfolobus, Thermoproteus, and Pyrococcus. No organism is known to possess both pathways [ ]. There are four known variations of the DAP pathway in bacteria: the succinylase, acetylase, aminotransferase, and dehydrogenase pathways. These pathways share the steps converting L-aspartate to L-2,3,4,5-tetrahydrodipicolinate (THDPA), but the subsequent steps leading to the production of meso-diaminopimelate, the immediate precursor of L-lysine, are different [ ]. The succinylase pathway acylates THDPA with succinyl-CoA to generate N-succinyl-LL-2-amino-6-ketopimelate and forms meso-DAP by subsequent transamination, desuccinylation, and epimerization. This pathway is utilised by proteobacteria and many firmicutes and actinobacteria. The acetylase pathway is analogous to the succinylase pathway but uses N-acetyl intermediates. This pathway is limited to certain Bacillus species, in which the corresponding genes have not been identified. The aminotransferase pathway converts THDPA directly to LL-DAP by diaminopimelate aminotransferase (DapL) without acylation. This pathway is shared by cyanobacteria, Chlamydia, the archaeon Methanothermobacter thermautotrophicus, and the plant Arabidopsis thaliana. The dehydrogenase pathway forms meso-DAP directly from THDPA, NADPH, and NH4+ by using diaminopimelate dehydrogenase (Ddh). 
This pathway is utilised by some Bacillus and Brevibacterium species and Corynebacterium glutamicum. Most bacteria use only one of the four variants, although certain bacteria, such as C. glutamicum and Bacillus macerans, possess both the succinylase and dehydrogenase pathways. The four sequences which make up the seed for this model are not closely related, although they are all members of the family of aminotransferases and are more closely related to each other than to anything else. Additionally, all of them are found in the vicinity of genes involved in the biosynthesis of lysine via the diaminopimelate pathway ( ), although this amounts to a separation of 12 genes in the case of Sulfurihydrogenibium azorense Az-Fu1. None of these genomes contains another strong candidate for this role in the pathway. Note: the detailed information included in the record includes the assertion that the enzyme uses the pyridoxal pyrophosphate cofactor, which is consistent with the family, and the assertion that the amino group donor is L-glutamate, which is undetermined for the sequences in this clade.
Protein Domain
Name: Protein transport protein Got1
Type: Family
Description: Got1 is required for the fusion of ER-derived transport vesicles with the Golgi complex [ ].
Protein Domain
Name: Protein transport protein SEC31-like
Type: Family
Description: This entry contains proteins with WD40 repeats and includes protein transport protein SEC31, which is a component of the coat protein complex II (COPII). COPII promotes the formation of transport vesicles from the endoplasmic reticulum by deforming the endoplasmic reticulum membrane into vesicles and selecting cargo molecules [ ]. In humans and other mammals, there are two SEC31 proteins known as SEC31A and SEC31B [].
Protein Domain
Name: Protein transport protein SEC31
Type: Family
Description: Sec31 is involved in formation of the COPII coat, which assembles through the sequential binding of three cytoplasmic proteins: Sar1, Sec23/24 and Sec13/31. Sec13/31 is recruited by the pre-budding complex and polymerisation of Sec13/31 occurs to form an octahedral cage that is the outer shell of the COPII coat [ ]. Sec13/31 is a hetero-tetramer which is organised as a linear array of α-solenoid and β-propeller domains to form a rod; twenty-four copies of this rod assemble to form the COPII cuboctahedron [].
Protein Domain
Name: Protein transport protein Sec23
Type: Family
Description: Vesicular carriers mediate a continuous flux of proteins and lipids between endoplasmic reticulum (ER) and the Golgi. Anterograde and retrograde transport is mediated by distinct sets of cytosolic coat proteins, the COPII and COPI coats, respectively, which act on the membrane to capture cargo proteins into nascent vesicles [ ]. Sec23 is a component of the coat protein complex II (COPII). Polymerization of the coat requires the recruitment of the Sec13/Sec31 complex (coat outer shell) by the Sec23/Sec24 complex. The Sec23/Sec24 coat complex then sorts the fusion machinery (SNAREs) into vesicles as they bud from the ER. Sec23 has been shown to interact in a sequential manner with other proteins (Sar1, TRAPPI and Hrr25) to control the direction of anterograde membrane flow [ ]. This entry also includes the budding yeast Sec23 paralogue, Nel1 (YHR035W). It is a GTPase-activating protein for Sar1 and does not function as a subunit of the COPII coat [ ].
Protein Domain
Name: Protein transport protein SecG/Sec61-beta/Sbh
Type: Family
Description: This family includes preprotein translocase subunit SecG, protein transport protein Sec61 subunit beta and Sbh1. A conserved heterotrimeric integral membrane protein complex, the Sec61 complex (eukaryotes) or SecY complex (prokaryotes), forms a protein-conducting channel that allows polypeptides to be transferred across (or integrated into) the endoplasmic reticulum (eukaryotes) or across the cytoplasmic membrane (prokaryotes) [ , ]. This complex is itself a part of a larger translocase complex. The alpha subunits ( ), called Sec61alpha in mammals, Sec61p in Saccharomyces cerevisiae (Baker's yeast), and SecY in prokaryotes, and the gamma subunits, called Sec61gamma in mammals, Sss1p in S. cerevisiae, and SecE in prokaryotes, show significant sequence conservation. Both subunits are required for cell viability in S. cerevisiae and Escherichia coli. The beta subunits, called Sec61beta in mammals, Sbh in S. cerevisiae, and SecG in archaea, are not essential for cell viability. They are similar in eukaryotes and archaea, but show no obvious homology to the corresponding SecG subunits in bacteria. SecY forms the channel pore, and it is the cross-linking partner of polypeptide chains passing through the membrane [ ]. SecY and SecE constitute the high-affinity SecA-binding site on the membrane []. The channel is a passive conduit for polypeptides. It must therefore associate with other components that provide a driving force. The partner proteins in bacteria and eukaryotes differ. In bacteria, the translocase complex comprises 7 proteins [ ], including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The SecA ATPase interacts dynamically with the SecYEG integral membrane components to drive the transmembrane movement of newly synthesized preproteins []. In S. 
cerevisiae (and probably in all eukaryotes), the full translocase comprises another membrane protein subcomplex (the tetrameric Sec62/63p complex), and the lumenal protein BiP, a member of the Hsp70 family of ATPases. BiP promotes translocation by acting as a molecular ratchet, preventing the polypeptide chain from sliding back into the cytosol [].
Protein Domain
Name: Vacuolar protein sorting-associated protein 41
Type: Family
Description: This entry represents the eukaryotic Vacuolar protein sorting-associated protein 41 (Vps41), a subunit of the homotypic vacuole fusion and vacuole protein sorting (HOPS) complex, which is essential for membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of protein transport [ , , ]. This protein interacts with Caspase-8, which plays a key role in apoptosis and development []. In humans, Vps41 variants prevent the formation of a functional HOPS complex, causing disorders such as dystonia associated with lysosomal abnormalities and neurodegenerative diseases [ , , ].
Protein Domain
Name: Ribosomal protein L1/ribosomal biogenesis protein
Type: Family
Description: Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain that interrupts the first domain. The two domains cycle between open and closed conformations via a hinge motion. In Escherichia coli, L1 is known to bind to the 23S rRNA. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site. Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity [ , , , , , ]. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [, ], groups: eubacterial L1; algal and plant chloroplast L1; cyanelle L1; archaebacterial L1; vertebrate L10A; and yeast Utp30, Rpl1a, Rpl1b and Mrpl1. This entry also matches ribosome biogenesis proteins, such as Cic1, which associates with the proteasome and is required for the degradation of specific substrates [ ], and for the synthesis of 60S ribosome subunits [].
Protein Domain
Name: Vacuolar protein sorting-associated protein 16
Type: Family
Description: This group represents a vacuolar protein sorting-associated protein 16 (Vps16). Vps16 may play a role in vesicle-mediated protein trafficking to endosomal/lysosomal compartments and in membrane docking/fusion reactions [ , , ].
Protein Domain
Name: Protein CHAPERONE-LIKE PROTEIN OF POR1-like
Type: Family
Description: This entry includes proteins from plants and bacteria. The plant members CHAPERONE-LIKE PROTEIN OF POR1 (CPP1) and Protein CHLOROPLAST J-LIKE DOMAIN 1 (CJD1) have a J-like domain and three transmembrane domains. CPP1 is an essential protein for chloroplast development and plays a role in the regulation of POR (light-dependent protochlorophyllide oxidoreductase) stability and function [ , , ]. CJD1 may be involved in the regulation of the fatty acid metabolic process in chloroplasts, especially for the chloroplastic galactolipids monogalactosyldiacylglycerol (MGDG) and digalactosyldiacylglycerol (DGDG) [].
Protein Domain
Name: Vacuolar protein sorting-associated protein 29
Type: Family
Description: This entry represents Vacuolar protein sorting-associated 29 (Vps29) from animals, yeasts and plants. Vps29 is an essential component of the retromer complex, a conserved complex required in endosome-to-Golgi retrograde transport [ ].
Protein Domain
Name: Sec-independent protein translocase protein TatA/E
Type: Family
Description: Translocation of proteins across the two membranes of Gram-negative bacteria can be carried out via a number of routes. Most proteins marked for export carry a secretion signal at their N terminus, and are secreted by the general secretory pathway. The signal peptide is cleaved as they pass through the outer membrane. Other secretion systems include the type III system found in a select group of Gram-negative plant and animal pathogens, and the CagA system of Helicobacter pylori [ ]. In some bacterial species, however, there exists a system that operates independently of the Sec pathway []. It selectively translocates periplasmic-bound molecules that are synthesised with, or are in close association with, "partner" proteins bearing an (S/T)RRXFLK twin arginine motif at the N terminus. The pathway is therefore termed the Twin-Arginine Translocation or TAT system. Surprisingly, the four components that make up the TAT system are structurally and mechanistically related to a pH-dependent import system in plant chloroplast thylakoid membranes []. The gene products responsible for the Sec-independent pathway are called TatA, TatB, TatC and TatE. TatA and TatE are highly related proteins and appear to overlap in functionality [ ]. Translocation occurred in single mutants of either TatA or TatE, though much less efficiently, but double mutants showed no detectable translocation.
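The (S/T)RRXFLK twin-arginine motif named in this description can be expressed as a simple pattern. The Python sketch below is illustrative only: the function name, the 35-residue N-terminal window and the toy sequence are assumptions made for the example and are not taken from this entry, and a regular-expression hit is a rough screen, not a prediction of Tat export.

```python
import re
from typing import Optional

# (S/T)RRXFLK, where X is any residue, written as a regular expression.
TAT_MOTIF = re.compile(r"[ST]RR.FLK")


def find_tat_motif(sequence: str, window: int = 35) -> Optional[int]:
    """Return the 0-based start of the first motif match near the N terminus.

    `window` (an assumed, adjustable cut-off) limits the search to the
    first residues, because the motif is described as N-terminal.
    Returns None when no match lies entirely inside the window.
    """
    match = TAT_MOTIF.search(sequence.upper()[:window])
    return match.start() if match else None


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real Tat substrate.
    print(find_tat_motif("MNNSRRDFLKAAG"))  # 3
```

Restricting the search to a prefix is a design choice that keeps internal matches from being reported as signal-peptide motifs.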
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom