Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported (e.g. dros* for partial matches, or fly AND NOT embryo to exclude a term).

Search results 1 to 100 out of 116 for seed

Category restricted to ProteinDomain


Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Embryo-specific ATS3
Type: Family
Description: This is a family of plant seed-specific proteins identified in Arabidopsis thaliana (Mouse-ear cress). ATS3 (Arabidopsis thaliana seed gene 3) is expressed in a pattern similar to the Arabidopsis seed storage protein genes.
Protein Domain
Name: SEED MATURATION PROTEIN 1
Type: Family
Description: Secondary dormancy is an adaptive trait arising in previously non-dormant seeds due to unfavourable environmental conditions during germination. The SEED MATURATION PROTEIN1 (SMP1; AT3G12960) is involved in seed maturation and dormancy maintenance after high temperature fluctuation.
Protein Domain
Name: Protein-(glutamine-N5) methyltransferase, release factor-specific
Type: Family
Description: Members of this protein family are HemK, a protein once thought to be involved in heme biosynthesis but now recognised to be a protein-glutamine methyltransferase that modifies the peptide chain release factors [ ]. All members of the seed alignment are encoded next to the release factor 1 gene (prfA) and confirmed by phylogenetic analysis. However, the family is diverse enough that even many members of the seed alignment do not score above the seed alignment, which was set high enough to exclude all instances of PrmB.
Protein Domain
Name: Small hydrophilic plant seed protein, conserved site
Type: Conserved_site
Description: This entry represents a conserved site in hydrophilic plant seed proteins that are structurally related: Arabidopsis thaliana proteins GEA1 and GEA6; cotton late embryogenesis abundant (LEA) protein D-19; carrot EMB-1 protein; barley LEA proteins B19.1A, B19.1B, B19.3 and B19.4; maize late embryogenesis abundant protein Emb564; radish late seed maturation protein p8B6; rice embryonic abundant protein Emp1; sunflower 10 kDa late embryogenesis abundant protein (DS10); and wheat Em proteins. These proteins may play a role in equipping the seed for survival, maintaining a minimal level of hydration in the dry organism and preventing the denaturation of cytoplasmic components. They may also play a role during imbibition by controlling water uptake.
Protein Domain
Name: Small hydrophilic plant seed protein
Type: Family
Description: This entry contains the plant LEA (late embryogenesis abundant) proteins, which are small hydrophilic plant seed proteins that are structurally related. These proteins contain from 83 to 153 amino acid residues and may play a role in equipping the seed for survival, maintaining a minimal level of hydration in the dry organism and preventing the denaturation of cytoplasmic components. They may also play a role during imbibition by controlling water uptake.
Protein Domain
Name: Cupin 1
Type: Domain
Description: This entry represents the conserved β-barrel fold of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. This domain can also be found as a central component of many microbial proteins including certain types of phosphomannose isomerase, polyketide synthase, epimerase, and dioxygenase.
Protein Domain
Name: Jacalin-like lectin domain
Type: Domain
Description: The jacalin-like mannose-binding lectin domain has a β-prism fold consisting of three 4-stranded β-sheets, with an internal pseudo 3-fold symmetry. Some proteins with this domain stimulate distinct T- and B-cell functions, such as the plant lectin jacalin, which binds to the T-antigen and acts as an agglutinin. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. While the family was initially named after an abundant protein found in the jackfruit seed, taxonomic distribution is not restricted to plants. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. Proteins containing this domain include: jacalin, a tetrameric plant seed lectin and agglutinin from Artocarpus heterophyllus (jackfruit), which is specific for galactose; artocarpin, a tetrameric plant seed lectin from A. heterophyllus; lectin MPA, a tetrameric plant seed lectin and agglutinin from Maclura pomifera (Osage orange); Heltuba lectin, a plant seed lectin and agglutinin from Helianthus tuberosus (Jerusalem artichoke); agglutinin from Calystegia sepium (Hedge bindweed); and griffithsin, an anti-viral lectin from red algae (Griffithsia species).
Protein Domain
Name: Jacalin-like lectin domain superfamily
Type: Homologous_superfamily
Description: The jacalin-like mannose-binding lectin domain has a β-prism fold consisting of three 4-stranded β-sheets, with an internal pseudo 3-fold symmetry. Some proteins with this domain stimulate distinct T- and B-cell functions, such as the plant lectin jacalin, which binds to the T-antigen and acts as an agglutinin. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. While the family was initially named after an abundant protein found in the jackfruit seed, taxonomic distribution is not restricted to plants. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. Proteins containing this domain include: jacalin, a tetrameric plant seed lectin and agglutinin from Artocarpus heterophyllus (jackfruit), which is specific for galactose; artocarpin, a tetrameric plant seed lectin from A. heterophyllus; lectin MPA, a tetrameric plant seed lectin and agglutinin from Maclura pomifera (Osage orange); Heltuba lectin, a plant seed lectin and agglutinin from Helianthus tuberosus (Jerusalem artichoke); agglutinin from Calystegia sepium (Hedge bindweed); and griffithsin, an anti-viral lectin from red algae (Griffithsia species).
Protein Domain
Name: Late embryogenesis abundant protein, SMP subgroup
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure.
Protein Domain
Name: Cereal allergen/alpha-amylase inhibitor, rice type
Type: Family
Description: Seeds of cereals contain a variety of serine protease and alpha-amylase inhibitors. These inhibitors can be grouped into families based on structural similarities. Rice seed allergenic proteins (RA) have sequence homology to seed trypsin/alpha-amylase inhibitors. Some have serine peptidase activity or alpha-amylase, and a few are bifunctional. The proteins contain ~10 cysteine residues, all of which are involved in disulphide bond formation. The majority of sequences in this family are from Oryza sativa (Rice); the exceptions are from Hordeum vulgare (Barley) and Triticum aestivum (Wheat). The majority are annotated either as alpha-amylase inhibitors or seed allergens and all belong to the MEROPS inhibitor family I6, clan IJ. There is no direct evidence to suggest that they can inhibit serine peptidases belonging to MEROPS peptidase S1, and studies on a closely related alpha-amylase inhibitor from Secale cereale (Rye) demonstrate no activity against trypsin and illustrate the necessity of exercising caution in assigning function based on sequence comparisons. The rice seed allergenic proteins are encoded by a multigene family consisting of at least four members. A conserved sequence similar to a motif identified in rice glutelin promoters was observed in the 5' region of the two genes. RA genes are specifically expressed in ripening seeds and they accumulate maximally 15-20 days after flowering.
Protein Domain
Name: Jacalin-like lectin domain, plant
Type: Domain
Description: Jacalin-like lectins are sugar-binding protein domains mostly found in plants. They adopt a β-prism topology consistent with a circularly permuted three-fold repeat of a structural motif. Proteins containing this domain may bind mono- or oligosaccharides with high specificity. The domain can occur in tandem-repeat arrangements with up to six copies, and in architectures combined with a variety of other functional domains. While the family was initially named after an abundant protein found in the jackfruit seed, taxonomic distribution is not restricted to plants. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. Proteins containing this domain include: jacalin, a tetrameric plant seed lectin and agglutinin from Artocarpus heterophyllus (jackfruit), which is specific for galactose; artocarpin, a tetrameric plant seed lectin from A. heterophyllus; lectin MPA, a tetrameric plant seed lectin and agglutinin from Maclura pomifera (Osage orange); Heltuba lectin, a plant seed lectin and agglutinin from Helianthus tuberosus (Jerusalem artichoke); agglutinin from Calystegia sepium (Hedge bindweed); griffithsin, an anti-viral lectin from red algae (Griffithsia species); and ipomoelin, a jacalin-related lectin from sweet potato (Ipomoea batatas cv. Tainung 57). This entry refers to jacalin-like lectin domains found in plants.
Protein Domain
Name: Protein ROH1-like
Type: Family
Description: ROH1 is an interactor of the exocyst subunit Exo70A1, and has been shown to be required for seed coat mucilage deposition.
Protein Domain
Name: Late embryogenesis abundant protein, SMP subgroup domain
Type: Domain
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam SMP, or D-34 from Dure, or group 6 from Bray.
Protein Domain
Name: Mannan endo-1,4-beta-mannosidase-like
Type: Family
Description: This family includes mannan endo-1,4-beta-mannosidases from plants and fungi. They catalyse the random hydrolysis of (1->4)-beta-D-mannosidic linkages in mannans, galactomannans and glucomannans and are crucial for depolymerization of seed galactomannans and wood galactoglucomannans. The deconstruction of the plant cell wall has had increasing importance as a key biological process in the development of a sustainable biofuel industry. In plants, they are related to the process of weakening the tissues surrounding the embryo during seed germination. This entry also includes cellulase domain-containing proteins from bacteria.
Protein Domain
Name: Late embryogenesis abundant protein, LEA_2 subgroup
Type: Domain
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam LEA_2, or LEA14 (D-95) from Dure. The structure of Arabidopsis LEA14 has been revealed.
Protein Domain
Name: Hydrophobic seed protein domain
Type: Domain
Description: This domain has a four-helix bundle structure. It contains four disulfide bonds, of which three function to keep the C- and N-terminal parts of the molecule in place.
Protein Domain
Name: Zein seed storage protein
Type: Family
Description: Alpha-prolamins are the major seed storage proteins of species of the grass tribe Andropogoneae. They are unusually rich in glutamine, proline, alanine, and leucine residues and their sequences show a series of tandem repeats presumed to be the result of multiple intragenic duplication. In Zea mays (Maize), the 22 kDa and 19 kDa zeins are encoded by a large multigene family and are the major seed storage proteins, accounting for 70% of the total zein fraction. Structurally, the 22 kDa and 19 kDa zeins are composed of nine adjacent, topologically antiparallel helices clustered within a distorted cylinder. The 22 kDa alpha-zeins are encoded by 23 genes; twenty-two of the members are found in a roughly tandem array forming a dense gene cluster. The expressed genes in the cluster are interspersed with nonexpressed genes. Interestingly, some of the expressed genes differ in their transcriptional regulation. Gene amplification appears to be in blocks of genes, explaining the rapid and compact expansion of the cluster during the evolution of maize.
Protein Domain
Name: Late embryogenesis abundant protein 1/2/D7
Type: Family
Description: Late embryogenesis abundant proteins (LEAs) are late embryonic proteins abundant in higher plant seed embryos and their function is not known. This entry includes LEA1/2 from Cicer arietinum (Chickpea) and LEAD7 from Gossypium hirsutum (Upland cotton).
Protein Domain
Name: Late embryogenesis abundant protein, LEA_1 subgroup
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam LEA_1, or D-113 from Dure, or group 4 from Bray. Proteins in this entry include LEA6, LEA18 and LEA46 from Arabidopsis. They may play roles in the adaptive process to water deficit in higher plants.
Protein Domain
Name: Alkaline ceramidase TOD1/Probable hexosyltransferase MUCI70
Type: Family
Description: The entry represents a group of proteins mainly found in plants, including MUCI70 and TOD1 from Arabidopsis. They share a Rossmann-like fold found in glycosyltransferases. MUCI70 is a predicted glycosyltransferase essential for the accumulation of seed mucilage, a gelatinous wall rich in unbranched rhamnogalacturonan I (RG I), and for shaping the surface morphology of seeds. Together with IRX14, it is required for xylan and pectin synthesis in seed coat epidermal (SCE) cells. TOD1 is an endoplasmic reticulum ceramidase that catalyses the hydrolysis of ceramides into sphingosine and free fatty acids at alkaline pH (e.g. pH 9.5). It is involved in the regulation of turgor pressure in guard cells and pollen tubes.
Protein Domain
Name: E3 ubiquitin-protein ligase BIG BROTHER
Type: Family
Description: Protein BIG BROTHER is an E3 ubiquitin-protein ligase that limits organ size, and possibly seed size, in a dose-dependent manner. It may limit the duration of organ growth and ultimately organ size by actively degrading critical growth stimulators.
Protein Domain
Name: mRNA export factor GLE1-like
Type: Family
Description: This family includes human protein GLE1 and its homologues. This protein is localised at the nuclear pore complexes and functions in poly(A)+ RNA export to the cytoplasm. In Arabidopsis, it is required for seed viability.
Protein Domain
Name: Late embryogenesis abundant protein, LEA_3 subgroup
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam LEA_3, or LEA5 (D-73) from Dure. Proteins in this entry include LEA-5 from Citrus sinensis, whose expression is induced by salt, drought and heat stress. This entry also includes At4g02380 (SAG21), At1g02820 (LEA2), At3g53770 (LEA37) and At4g15910 (LEA41) from Arabidopsis.
Protein Domain
Name: Late embryogenesis abundant protein, LEA_5 subgroup
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam LEA_5, or D-19 from Dure, or group 1 from Bray. Proteins in this entry include EM1 (At3g51810) and EM6 (At2g40170) from Arabidopsis. This entry also includes some bacterial hydrophilins. Some proteins in this entry also contain the KGG motif.
Protein Domain
Name: Transcription factor GTE1
Type: Family
Description: This entry represents the transcription factor GTE1 from plants. Arabidopsis GTE1 is a transcription activator that promotes seed germination by negatively regulating the abscisic acid (ABA) transduction pathway and positively regulating the phytochrome A (phyA) transduction pathway.
Protein Domain
Name: GEM-like protein
Type: Family
Description: This entry represents the GEM and GEM-like (GER 1-8) proteins from plants. They contain a GRAM domain. GEM binds to phospholipids and its expression can be regulated by ABA. GER5 has been shown to be involved in seed development and inflorescence architecture.
Protein Domain
Name: Late embryogenesis abundant protein Lea14-like
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This family includes Lea14-like proteins which contain a Water Stress and Hypersensitive response (WHy) domain, a region of unknown function found in several plant proteins involved in either the response to water stress or the response to bacterial infection. This entry also includes WHy domain-containing proteins from bacteria. Their function is not clear.
Protein Domain
Name: Vicilin, N-terminal
Type: Domain
Description: This region is found in plant seed storage proteins, N-terminal to the Cupin domain. In Macadamia integrifolia (Macadamia nut), this region is processed into peptides of approximately 50 amino acids containing a C-X-X-X-C-(10-12)X-C-X-X-X-C motif. These peptides exhibit antimicrobial activity in vitro.
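The C-X-X-X-C-(10-12)X-C-X-X-X-C motif quoted above can be read as a regular expression. The following is a minimal Python sketch, assuming X stands for any single residue written as an uppercase one-letter code; the function name and the test sequence are hypothetical and made up for illustration, not taken from any database record.

```python
import re

# C-X-X-X-C-(10-12)X-C-X-X-X-C, with X assumed to be any one-letter residue code.
VICILIN_N_MOTIF = re.compile(r"C[A-Z]{3}C[A-Z]{10,12}C[A-Z]{3}C")


def find_motifs(sequence):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in VICILIN_N_MOTIF.finditer(sequence.upper())]


# Synthetic sequence (not a real protein): two flanking residues on each
# side of a motif instance with an 11-residue spacer.
example = "MK" + "CAAAC" + "G" * 11 + "CAAAC" + "LL"
hits = find_motifs(example)
```

Because `finditer` reports non-overlapping matches, a peptide with several closely spaced cysteines may need a lookahead-based pattern if every overlapping occurrence matters.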
Protein Domain
Name: O-FUCOSYLTRANSFERASE1-like
Type: Family
Description: This entry represents a group of putative plant O-Fucosyltransferases (POFTs), including O-FUCOSYLTRANSFERASE1 (AtOFT1, At3g05320) from Arabidopsis. Interestingly, oft1 mutant pollen tubes are ineffective at penetrating the stigma-style interface, leading to a drastic reduction in seed set and a nearly 2000-fold reduction in pollen transmission.
Protein Domain
Name: Late embryogenesis abundant protein, LEA_4 subgroup
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species. They have been classified into several subgroups in Pfam and according to Bray and Dure. This entry represents Pfam LEA_4, or D-7, D-29 from Dure. Proteins in this entry include LEA3 from wheat, and ECP63 and At3g53040 from Arabidopsis. Their function is not clear. However, ECP63 has been linked to BHLH109-mediated regulation of somatic embryogenesis, and At3g53040 may be involved in the re-establishment of desiccation tolerance in germinated seeds. This entry also includes uncharacterised proteins from bacteria.
Protein Domain
Name: 11-S seed storage protein, plant
Type: Family
Description: Plant seed storage proteins, whose principal function appears to be to provide the major nitrogen source for the developing plant, can be classified, on the basis of their structure, into different families. 11S-type globulins are non-glycosylated proteins which form hexameric structures. Each of the subunits in the hexamer is itself composed of an acidic and a basic chain derived from a single precursor and linked by a disulphide bond. This structure is shown in the following representation.

               +-------------------------+
               |                         |
    xxxxxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxNGxCxxxxxxxxxxxxxxxxxxxxxxx
    |------Acidic-subunit-------------||-----Basic-subunit------|
    |-----------------About-480-to-500-residues-----------------|

'C': conserved cysteine involved in a disulphide bond.

Members of the 11-S family include pea and broad bean legumins, oil seed rape cruciferin, rice glutelins, cotton beta-globulins, soybean glycinins, pumpkin 11-S globulin, oat globulin, sunflower helianthinin G3, etc. This family represents the precursor protein which is cleaved into the two chains. These proteins contain two β-barrel domains. This family is a member of the 'cupin' superfamily on the basis of their conserved barrel domain ('cupa' is the Latin term for a small barrel).
Protein Domain
Name: Napin/ 2S seed storage protein/Conglutin
Type: Family
Description: This entry represents a group of plant seed storage proteins, including Napin, 2S seed storage protein and Conglutin. Napins are low-molecular weight, basic storage proteins synthesised in rape-seed embryos during seed maturation. Sequence comparisons have revealed that Napin belongs to a diverse protein family, which includes major allergens, trypsin inhibitors and natural anti-fungal proteins. Napin comprises 2 polypeptide chains (MW 9000 and 4000) held together by disulphide bonds. The protein is initially synthesised as a precursor of 178 residues, which is proteolytically cleaved to generate mature Napin chains, with 86 and 29 residues respectively. Some of the proteins in this family are allergens, with cores that are very resistant to proteolytic digestion and to elevated temperatures of up to 100 degrees C. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee: King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806 (1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.
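The naming rule described above (three letters of the genus, a space, one letter of the species name, a space, an arabic number) can be sketched in a few lines of Python. This is an illustrative helper only: the function name is hypothetical, the capitalisation shown is an assumption not stated in the description, and the rule for disambiguating clashing designations is deliberately not implemented.

```python
def allergen_designation(genus, species, number):
    """Build a designation: 3 genus letters, 1 species letter, a number.

    Assumption (not from the description): the genus part is written with
    an initial capital and the species letter in lower case.
    """
    if len(genus) < 3 or not species:
        raise ValueError("need at least 3 genus letters and 1 species letter")
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


# Illustrative call; the number 1 is a placeholder, not a looked-up value.
designation = allergen_designation("Brassica", "napus", 1)
```

Extending the helper to handle two species with identical designations would mean taking extra letters from the species name, as the description outlines.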
Protein Domain
Name: DDB1- and CUL4-associated factor 8-like
Type: Family
Description: This entry represents a group of WD and tetratricopeptide repeats proteins, including DCAF5/6/8 and DCAF8L1/DCAF8L2/WDTC1 from humans. They may function as a substrate receptor for CUL4-DDB1 E3 ubiquitin-protein ligase complex. Proteins in this entry also include protein ALTERED SEED GERMINATION 2 (ASG2) from Arabidopsis. It is a negative regulator of ABA signalling.
Protein Domain
Name: DUF642 L-galactono-1,4-lactone-responsive gene 2-like domain
Type: Domain
Description: This entry represents a domain found twice in the protein DUF642 L-galactono-1,4-lactone-responsive gene 2 from Arabidopsis thaliana (DGR2) and similar proteins found in plants and bacteria. DGR2 is involved in the regulation of testa rupture during seed germination and in the development of roots and rosettes. This domain was previously known as DUF642.
Protein Domain
Name: Heat shock factor binding 1
Type: Family
Description: Heat shock factor binding protein 1 (HSBP1) interacts with the oligomerization domain of heat shock factor 1 (Hsf1), suppressing Hsf1's transcriptional activity following stress. It plays an essential role during early mouse and zebrafish embryonic development. In the plant Arabidopsis, heat shock factor-binding protein (HSBP) is required for acquired thermotolerance but not basal thermotolerance, and for seed development.
Protein Domain
Name: Putative glutamate/gamma-aminobutyrate antiporter
Type: Family
Description: Members of this protein family are putative glutamate/gamma-aminobutyrate antiporters. Each member of the seed alignment is found adjacent to a glutamate decarboxylase, which converts glutamate (Glu) to gamma-aminobutyrate (GABA). However, the majority belong to genome contexts with a glutaminase (converts Gln to Glu) as well as the decarboxylase that converts Glu to GABA. The specificity of the transporter remains uncertain.
Protein Domain
Name: Transcription regulator, HipB-like
Type: Family
Description: Members of this family belong to a clade of helix-turn-helix DNA-binding proteins. Members are similar in sequence to the HipB protein of Escherichia coli. Genes for members of the seed alignment for this protein family were found to be closely linked to genes encoding proteins related to HipA. The HipBA operon appears to have some features in common with toxin-antitoxin post-segregational killing systems.
Protein Domain
Name: Basic helix-loop-helix (bHLH) transcription factors ALC-like, plant
Type: Family
Description: This entry represents a group of plant basic helix-loop-helix (bHLH) transcription factors, including ALC, PIFs (phy-interacting factors) and SPATULA from Arabidopsis. ALC enables cell separation in fruit dehiscence, the process in which the fruit opens and releases the seed. PIF1 regulates chlorophyll biosynthesis to optimise the greening process. SPATULA plays a role in floral organogenesis.
Protein Domain
Name: Phospholipase A1 PLIP1/2/3, chloroplastic
Type: Family
Description: This entry includes a group of plant glycerolipid A1 lipases, including PLIP1/2/3 from Arabidopsis. PLIP1 is a plastid phospholipase A1 that releases polyunsaturated fatty acids from chloroplast phosphatidylglycerol, leading to the export of the fatty acids to the ER for seed oil biosynthesis. PLIP2/3 are also present in the chloroplasts. They respond to ABA and are involved in jasmonic acid biosynthesis.
Protein Domain
Name: Protein EMBRYONIC FLOWER 1
Type: Family
Description: EMBRYONIC FLOWER1 (EMF1) is a plant-specific transcriptional regulator. It is involved in plant Polycomb-mediated gene repression and also targets flower homeotic genes directly. EMF1 regulates developmental phase transitions and specifies cell fates during vegetative development. It also regulates additional gene programs, including photosynthesis, seed development, hormone, stress, and cold signalling.
Protein Domain
Name: BTB/POZ and MATH domain-containing protein 1-6
Type: Family
Description: This entry represents a group of BTB/POZ and MATH domain-containing proteins mostly from plants, including BPM1-6 from Arabidopsis. They are part of the Cullin E3 ubiquitin ligase complexes and are known to bind at least three families of transcription factors: ERF/AP2 class I, homeobox-leucine zipper and R2R3 MYB. BPMs play an important role in plant flowering, seed development and abiotic stress response.
Protein Domain
Name: L-aspartate dehydrogenase, archaeal
Type: Family
Description: This entry represents L-aspartate dehydrogenase, as shown for the NADP-dependent enzyme TM_1643 of Thermotoga maritima. Members lack homology to NadB, the aspartate oxidase of most mesophilic bacteria, which this enzyme replaces in the generation of oxaloacetate from aspartate for the NAD biosynthetic pathway. All members of the seed alignment are found adjacent to other genes of NAD biosynthesis, although other uses of L-aspartate dehydrogenase may occur.
Protein Domain
Name: Homeobox-DDT domain protein RLT1/2
Type: Family
Description: The ISWI chromatin remodeling complexes are widely present in eukaryotic species. This entry represents the plant ISWI binding proteins, the DDT domain proteins, including RLT1 and RLT2 from Arabidopsis. AtISWI physically interacts with RLTs, and this prevents plants from activating the vegetative-to-reproductive transition early by regulating several key genes that contribute to flower timing. RLT2 may also be involved in the transcriptional regulation of endogenous seed genes.
Protein Domain
Name: BURP domain
Type: Domain
Description: The BURP domain was named after the proteins in which it was first identified: BNM2, USP, RD22, and PG1beta. It is found in the C terminus of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain-containing proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain - a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions. Some proteins known to contain a BURP domain are: Brassica protein BNM2, which is expressed during the induction of microspore embryogenesis; field bean USPs, abundant non-storage seed proteins with unknown function; soybean USP-like proteins ADR6 (or SALI5-4A), an auxin-repressible, aluminium-inducible protein, and SALI3-2, a protein that is up-regulated by aluminium; soybean seed coat BURP-domain protein 1 (SCB1), which might play a role in the differentiation of the seed coat parenchyma cells; Arabidopsis RD22 drought-induced protein; maize ZRP2, a protein of unknown function in cortex parenchyma; tomato PG1beta, the beta-subunit of polygalacturonase isozyme 1 (PG1), which is expressed in ripening fruits; and cereal RAFTIN, which is essential specifically for the maturation phase of pollen development.
Protein Domain
Name: BURP domain-containing protein
Type: Family
Description: The BURP domain was named after the proteins in which it was first identified: BNM2, USP, RD22, and PG1beta. It is found in the C terminus of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain-containing proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain - a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain [ ]. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions [].
Some proteins known to contain a BURP domain are listed below [ , , ]:
  • Brassica protein BNM2, which is expressed during the induction of microspore embryogenesis.
  • Field bean USPs, abundant non-storage seed proteins with unknown function.
  • Soybean USP-like proteins ADR6 (or SALI5-4A), an auxin-repressible, aluminium-inducible protein, and SALI3-2, a protein that is up-regulated by aluminium.
  • Soybean seed coat BURP-domain protein 1 (SCB1). It might play a role in the differentiation of the seed coat parenchyma cells.
  • Arabidopsis RD22 drought induced protein.
  • Maize ZRP2, a protein of unknown function in cortex parenchyma.
  • Tomato PG1beta, the beta-subunit of polygalacturonase isozyme 1 (PG1), which is expressed in ripening fruits.
  • Cereal RAFTIN. It is essential specifically for the maturation phase of pollen development.
Protein Domain
Name: 3-carboxy-cis,cis-muconate cycloisomerase
Type: Family
Description: Proteins in this entry are 3-carboxy-cis,cis-muconate cycloisomerases, which catalyse the second step in the protocatechuate degradation to beta-ketoadipate and then to succinyl-CoA and acetyl-CoA. 4-hydroxybenzoate, 3-hydroxybenzoate, and vanillate can all be converted in one step to protocatechuate. All members of the seed alignment for this entry were chosen from within protocatechuate degradation operons of at least three genes of the pathway, and from genomes with the complete pathway through beta-ketoadipate [ ].
Protein Domain
Name: Protein Sip5/DA2
Type: Family
Description: This entry includes a group of fungal and plant proteins, including Sip5 from budding yeasts and DA2 from Arabidopsis. The function of Sip5 is not clear. It interacts with both the Reg1/Glc7 protein phosphatase and the Snf1 protein kinase [ ]. DA2 (At1g78420) is an E3 ubiquitin-protein ligase involved in the regulation of organ and seed size [ , ]. This entry also includes DA2-like proteins such as At1g17145 and GW2 [].
Protein Domain
Name: WD repeat-containing protein 89
Type: Family
Description: This entry represents a group of WD repeat-containing proteins, including WDR89 from humans and GTS1 from Arabidopsis. Human WD repeat-containing protein 89 (WDR89) is an uncharacterized protein containing six WD repeats. GTS1 (also known as protein GIGANTUS 1) is highly expressed during embryo development and involved in the control of plant growth development by acting as a negative regulator of seed germination, cell division in meristematic regions, plant growth and overall biomass accumulation [ ].
Protein Domain
Name: 11-S seed storage protein, conserved site
Type: Conserved_site
Description: Plant seed storage proteins, whose principal function appears to be to act as the major nitrogen source for the developing plant, can be classified, on the basis of their structure, into different families. 11-S are non-glycosylated proteins which form hexameric structures [, ]. Each of the subunits in the hexamer is itself composed of an acidic and a basic chain derived from a single precursor and linked by a disulphide bond. This structure is shown in the following representation.
               +-------------------------+
               |                         |
    xxxxxxxxxxxCxxxxxxxxxxxxxxxxxxxxxxNGxCxxxxxxxxxxxxxxxxxxxxxxx
    |------Acidic-subunit-------------||-----Basic-subunit------|
    |-----------------About-480-to-500-residues-----------------|
'C': conserved cysteine involved in a disulphide bond.
Members of the 11-S family include pea and broad bean legumins, oil seed rape cruciferin, rice glutelins, cotton beta-globulins, soybean glycinins, pumpkin 11-S globulin, oat globulin, sunflower helianthinin G3, etc. This family represents the precursor protein which is cleaved into the two chains. These proteins contain two β-barrel domains. This family is a member of the 'cupin' superfamily on the basis of their conserved barrel domain ('cupa' is the Latin term for a small barrel). The signature pattern for this family includes the conserved cleavage site between the acidic and basic subunits (Asn-Gly) and a proximal cysteine residue which is involved in the inter-chain disulphide bond.
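Note: the signature described in this entry centres on the Asn-Gly cleavage site followed closely by a conserved cysteine, drawn as NGxC in the schematic. The following is a minimal Python sketch of that idea only. It assumes the simple regular expression `NG.C` as a stand-in for the curated signature (which is not reproduced in this entry), and the function names and the toy sequence are hypothetical.

```python
import re

# Simplified stand-in for the 11-S signature: Asn-Gly cleavage site,
# one arbitrary residue, then the conserved cysteine (NGxC in the schematic).
# Illustrative assumption only, not the curated database pattern.
NGXC = re.compile(r"NG.C")


def find_cleavage_sites(sequence):
    """Return 0-based indices of the Gly that would start each basic chain."""
    return [m.start() + 1 for m in NGXC.finditer(sequence.upper())]


def split_precursor(sequence):
    """Split a precursor into (acidic, basic) chains at the first N|G site."""
    sites = find_cleavage_sites(sequence)
    if not sites:
        return None
    cut = sites[0]
    return sequence[:cut], sequence[cut:]


# Hypothetical toy precursor, far shorter than a real 480-500 residue protein.
toy = "MAKLCVSLRRNGLCETVAK"
chains = split_precursor(toy)
```

A short motif like this will match by chance in many unrelated proteins, so a hit should be read as a candidate site rather than as evidence of family membership.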
Protein Domain
Name: Alpha-Amylase Inhibitors (AAI), Lipid Transfer (LT) and Seed Storage (SS) Protein
Type: Family
Description: This entry represents a protein family unique to higher plants that includes cereal-type alpha-amylase inhibitors [ ], lipid transfer proteins [], seed storage proteins, and similar proteins []. Proteins in this family are known to play important roles in defending plants from insects and pathogens, in lipid transport between intracellular membranes, and in nutrient storage []. Many proteins of this family have been identified as allergens in humans []. These proteins contain a common pattern of eight cysteines that form four disulfide bridges.
Protein Domain
Name: Centrosome-associated protein 350
Type: Family
Description: Centrosome-associated protein 350 (CEP350 or CAP350) plays an essential role in centriole growth by stabilising a procentriolar seed composed of, at least, SASS6 and CENPJ [ ]. It is required for anchoring microtubules to the centrosomes and for the integrity of the microtubule network []. It also stabilises Golgi-associated microtubules and maintains a continuous pericentrosomal Golgi ribbon []. CEP350 possesses a CAP-Gly domain which is targeted to the centrosome or the Golgi-like network and binds microtubules through an N-terminal basic region [ ].
Protein Domain
Name: Phytol/farnesol kinase
Type: Family
Description: This entry includes a group of kinases from plants and bacteria, including phytol kinase 1 and farnesol kinase from Arabidopsis. Phytol kinase 1, also known as Vte5 (Vitamin E pathway gene 5, At5g04490), catalyzes the conversion of phytol to phytol monophosphate (PMP) in the presence of CTP or UTP. It is involved in seed tocopherol biosynthesis [ ]. Farnesol kinase, also known as FOLK (At5g58560), can phosphorylate farnesol using an NTP donor. It is involved in negative regulation of abscisic acid (ABA) signaling [].
Protein Domain
Name: Nif11-like leader peptide
Type: Domain
Description: This entry describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment (most sequences scoring better than 54 bits to the HMMER 2 model) tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus [ , ].
Protein Domain
Name: Dehydrin
Type: Family
Description: LEA (late embryogenesis abundant) proteins were first identified in land plants. Plant LEA proteins have been found to accumulate to high levels during the last stage of seed formation (when a natural desiccation of the seed tissues takes place) and during periods of water deficit in vegetative organs. Later, LEA homologues have also been found in various species [ , ]. They have been classified into several subgroups in Pfam and according to Bray and Dure []. Dehydrin has been classified as part of the LEA family (D-11 from Dure, or group 2 from Bray) [ ]. Dehydrins contribute to freezing stress tolerance in plants and it was suggested that this could be partly due to their protective effect on membranes []. Dehydrins share a number of structural features. One of the most notable features is the presence, in their central region, of a continuous run of five to nine serines followed by a cluster of charged residues. Such a region has been found in all known dehydrins so far with the exception of pea dehydrins. A second conserved feature is the presence of two copies of a lysine-rich octapeptide; the first copy is located just after the cluster of charged residues that follows the poly-serine region and the second copy is found at the C-terminal extremity.
Protein Domain
Name: Nuclear pore complex protein NUP214
Type: Family
Description: Nuclear pore complex protein NUP214 (also known as Protein LONO1 and Protein EMBRYO DEFECTIVE 1011) is a component of the nuclear pore complex in Arabidopsis, required for mature mRNA export from the nucleus to the cytoplasm and essential for normal embryogenesis and seed viability [ ]. It is important during early embryogenesis, being involved in the first asymmetrical cell division of the zygote and regulating the number and planes of cell divisions required for generating the normal embryo proper and suspensor, apical-basal axis, cotyledons and meristem [, ].
Protein Domain
Name: Zinc finger protein 830-like
Type: Family
Description: Zinc finger protein 830 (ZNF830; also known as coiled-coil domain-containing protein 16 or CCDC16) contains a C2H2-type zinc finger and a coiled-coil region. It is a component of the XAB2 complex, which binds RNA [ ], and a component of a pentameric intron-binding complex that is pre-assembled and then incorporated into the spliceosome []. This entry also includes protein ABA AND ROS SENSITIVE 1 (ARS1) from Arabidopsis. It is essential for seed germination and ROS homeostasis in response to ABA and oxidative stress [ ].
Protein Domain
Name: Factor of DNA methylation 1-5/IDN2
Type: Family
Description: RNA-directed DNA methylation (RdDM) is a biological process in which non-coding RNA molecules direct the addition of DNA methylation to specific DNA sequences. This entry represents a sub-group of SGS3-LIKE plant proteins that are components of RNA-directed DNA methylation pathway (RdDM), including FDM1-5 and IDN2 from Arabidopsis [, ]. RdDM has been implicated in a number of regulatory processes in plants, such as maintaining transposable element silencing and genome stability, affecting gamete formation and seed viability and protecting the plant from other biotic stresses [].
Protein Domain
Name: Beta-lysine N-acetyltransferase
Type: Family
Description: Members of this protein family are GNAT family acetyltransferases, based on a seed alignment in which every member is associated with a lysine 2,3-aminomutase family protein, usually as the adjacent gene. This family includes AblB, the enzyme beta-lysine acetyltransferase that completes the two-step synthesis of the osmolyte (compatible solute) N-epsilon-acetyl-beta-lysine; all members of the family may have this function [ ]. Note that N-epsilon-acetyl-beta-lysine has been observed only in methanogenic archaea (e.g. Methanosarcina) but that this model, paired with Lysine-2,3-aminomutase (), suggests a much broader distribution.
Protein Domain
Name: Protein DA1-like domain
Type: Domain
Description: Proteins containing this domain include protein DA1 and its homologues. In Arabidopsis thaliana, DA1 is a ubiquitin-activated endopeptidase that limits final seed and organ size by restricting the period of cell proliferation [ ]. DA1 is activated by the RING E3 ligases Big Brother and DA2, both of which are then inactivated by cleavage by the active peptidase. DA1 also cleaves and inactivates deubiquitinase UBP15 and transcription factors TCP15 and TCP22, all of which promote cell proliferation. Presence of an HEXXH motif, which when mutated leads to inactivity, suggests that DA1 is a metalloendopeptidase [].
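Note: the HEXXH motif mentioned in this entry (His, Glu, any two residues, His) is simple enough to locate with a regular expression. The sketch below is a hedged illustration only: the function names and the toy sequence are hypothetical, and the in-silico His-to-Ala substitution merely mimics the kind of motif mutation the entry refers to without modelling enzyme activity.

```python
import re

# HEXXH: His, Glu, any two residues, His. The lookahead reports
# overlapping matches as well.
HEXXH = re.compile(r"(?=HE..H)")


def hexxh_positions(sequence):
    """Return 0-based start positions of every HEXXH match."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


def mutate_first_motif(sequence):
    """Replace both histidines of the first HEXXH match with alanine."""
    hits = hexxh_positions(sequence)
    if not hits:
        return sequence
    start = hits[0]
    residues = list(sequence)
    residues[start] = "A"
    residues[start + 4] = "A"
    return "".join(residues)


# Hypothetical toy sequence, not a real DA1 fragment.
toy = "MKTHEAVHGLS"
mutant = mutate_first_motif(toy)
```

After the substitution the motif search returns no hits for the mutant, which is the sequence-level counterpart of the inactivating mutation described above.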
Protein Domain
Name: 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase, gammaproteobacteria
Type: Family
Description: 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-succinyltransferase (DapD) is involved in the succinylated branch of the "lysine biosynthesis via diaminopimelate (DAP)"pathway ( ). This entry represents the gammaproteobacteria family of DapD sequences, which is the most closely related to the actinobacterial DapD family represented by . All of the genes evaluated for the seed of this model are found in genomes where the downstream desuccinylase is present, but known DapD genes are absent. Additionally, many of the genes identified by this model are found proximal to genes involved in this lysine biosynthesis pathway.
Protein Domain
Name: Zinc finger protein 3-like
Type: Family
Description: C2H2 zinc finger proteins (ZFPs) constitute an abundant family of nucleic acid binding proteins in the genomes of higher and lower eukaryotes. This entry represents a group of plant ZFPs, including ZFP3, KNUCKLES and LATE FLOWERING from Arabidopsis. ZFP3 is a putative transcriptional regulator negatively regulating ABA suppression of seed germination in Arabidopsis. Together with other ZFPs, it regulates light and ABA responses during germination and early seedling development [ , ]. KNUCKLES plays an important role in the termination of floral meristem activity [] and LATE FLOWERING controls the transition to flowering [].
Protein Domain
Name: Metallothionein-like protein 3
Type: Family
Description: Metallothioneins (MTs) are small proteins with a high content of cysteine residues that bind various heavy metals. Plant MTs are classified into four types based on the arrangement of cysteine residues, and all are involved in the homeostasis of copper and other metals [ , ]. This entry represents MT3, which in Arabidopsis is predominantly expressed in leaf mesophyll cells. It functions as a copper (Cu) and zinc (Zn) chelator and plays a role in Cu homeostasis, specifically in the remobilization of Cu from senescing leaves. The mobilization of Cu is important for seed development [, ].
Protein Domain
Name: Protein crowded nuclei
Type: Family
Description: CROWDED NUCLEI (CRWN) proteins, also known as LITTLE NUCLEI, are important architectural components of plant nuclei and are thought to play diverse roles in heterochromatin organization and the control of nuclear morphology [ , ]. CRWN proteins regulate ABA-controlled seed germination by modulating the degradation of protein ABI5. CRWN3 has been shown to colocalize with ABI5 in nuclear bodies, where it might participate in its degradation []. Proteins in this entry also include NMCP1A/B from rice. NMCP1B, also known as OsNMCP1, regulates drought resistance and root growth through chromatin accessibility [ ].
Protein Domain
Name: Putative C-S lyase
Type: Family
Description: Members of this subfamily are probable C-S lyases from a family of pyridoxal phosphate-dependent enzymes that tend to be (mis)annotated as probable aminotransferases. One member is PatB of Bacillus subtilis, a proven C-S-lyase. Another is the virulence factor cystalysin from Treponema denticola, whose hemolysin activity may stem from H2S production. Members of the seed alignment occur next to examples of the enzyme 5-histidylcysteine sulfoxide synthase, from ovothiol A biosynthesis, and would be expected to perform a C-S cleavage of 5-histidylcysteine sulfoxide to leave 1-methyl-4-mercaptohistidine (ovothiol A) [ , , ].
Protein Domain
Name: C-terminal binding protein AN-like
Type: Family
Description: This entry represents a group of plant proteins, including ANGUSTIFOLIA (AN) from Arabidopsis. AN is related to C-terminal binding protein/brefeldin A ADP-ribosylated substrate (CtBP/BARS), which has an important role in animal development. It plays a major role in microtubule-dependent cell morphogenesis. The phenotype of the AN mutants includes narrow cotyledons, narrow rosette leaves, twisted seed pods (siliques) and less-branched trichomes. Moreover, it has been shown to be involved in a wide range of biological processes, including facilitating both abiotic and biotic stress tolerance through ROS-mediated redox activity [ ].
Protein Domain
Name: Cereal seed allergen/grain softness/trypsin and alpha-amylase inhibitor
Type: Family
Description: The seeds of cereals contain numerous serine protease and alpha-amylase inhibitors. These inhibitors can be grouped into families based on structural similarities and many are described as seed allergens. This family of cereal (monocotyledon) allergens and trypsin/alpha-amylase inhibitors [ ] belongs to MEROPS inhibitor family I6, clan IJ. Some are known to be serine protease inhibitors, active against S1 peptidases () [ ]. For some there is no direct evidence to suggest that they can inhibit serine peptidases, and studies on the alpha-amylase inhibitor from Secale cereale (Rye) demonstrate no activity against trypsin, illustrating the necessity of exercising caution in assigning function based on sequence comparisons []. They consist of proteins of about 120 amino acids which contain 10 cysteine residues, all of which are involved in disulphide bonds. Some of these inhibitors are specific to trypsin, others to alpha-amylase, and a few are bifunctional. The schematic representation of the structure of these inhibitors is shown below:
                         +----------------------------+
             +----------+|   +-+                      |
             |          ||   | |                      |
    xxCxxxxxxCxxxCxxxxxxCCxxxCxCxxxxxxxxxxxxxCxxxxxxxxCxxxxxxxCxxxx
      |          |                           |                |
      |          +---------------------------+                |
      +-------------------------------------------------------+
'C': conserved cysteine involved in a disulphide bond.
The 3D structure of the bifunctional alpha-amylase/trypsin inhibitor (RBI) from seeds of Eleusine coracana (Indian finger millet) has been determined in solution using multidimensional 1H and 15N NMR spectroscopy [ ]. The inhibitor forms a globular 4-helix motif with a simple 'up-and-down' topology, and includes a short anti-parallel β-sheet [].
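Note: the schematic in this entry encodes 10 conserved cysteines, all of them in disulphide bonds, which implies five bonds. The sketch below is an illustrative Python check of that arithmetic. The function names and the `length_range` tolerance are assumptions (the entry says only "about 120 amino acids"), and counting cysteines is a crude screen, not a substitute for the curated family model.

```python
# Schematic copied from the entry: 'C' is a conserved cysteine, 'x' any residue.
SCHEMATIC = "xxCxxxxxxCxxxCxxxxxxCCxxxCxCxxxxxxxxxxxxxCxxxxxxxxCxxxxxxxCxxxx"


def cysteine_positions(pattern):
    """Return 1-based positions of every 'C' in the schematic."""
    return [i + 1 for i, ch in enumerate(pattern) if ch == "C"]


def expected_bond_count(pattern):
    """Number of disulphide bonds if every cysteine is paired."""
    n = len(cysteine_positions(pattern))
    if n % 2:
        raise ValueError("odd number of cysteines cannot all be paired")
    return n // 2


def has_expected_cysteines(sequence, expected=10, length_range=(100, 140)):
    """Crude screen: plausible length and the expected cysteine count.

    length_range is a hypothetical tolerance around 'about 120 residues'.
    """
    seq = sequence.upper()
    low, high = length_range
    return low <= len(seq) <= high and seq.count("C") == expected


n_cys = len(cysteine_positions(SCHEMATIC))
n_bonds = expected_bond_count(SCHEMATIC)
```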
Protein Domain
Name: Parallel beta-helix repeat-2
Type: Repeat
Description: This model represents a tandem pair of an approximately 22-amino acid (each) repeat homologous to the β-strand repeats that stack in a right-handed parallel β-helix in the periplasmic C-5 mannuronan epimerase, AlgA, of Pseudomonas aeruginosa. A homology domain consisting of a longer tandem array of these repeats is described in the SMART database as CASH (SM00722), and is found in many carbohydrate-binding proteins and sugar hydrolases. A single repeat is represented by SM00710. This TIGRFAMs model represents a flavor of the parallel β-helix-forming repeat based on prokaryotic sequences only in its seed alignment, although it also finds many eukaryotic sequences.
Protein Domain
Name: Bacteriocin, class IIb, lactobin A/cerein 7B family
Type: Family
Description: Members of this protein family are described variably as bacteriocins per se, one chain of a two-chain bacteriocin, or bacteriocin enhancer proteins. All members of the seed alignment occur in paired gene contexts with another member of the same protein family. This family includes bacteriocins that appear not to undergo post-translational modification, other than cleavage at a Gly-Gly motif coupled to sec-independent export. For many members, the N-terminal bacteriocin cleavage motif region is recognised by . C-terminal to the cleavage motif, these proteins are hydrophobic and low in complexity, consistent with pore-forming activity as a mechanism of bacteriocin action.
Protein Domain
Name: Zeaxanthin epoxidase
Type: Family
Description: This entry represents the enzyme zeaxanthin epoxidase ( ), which is involved in the epoxidation of zeaxanthin as part of the biosynthesis of the plant hormone abscisic acid (ABA). ABA is a sesquiterpenoid (15-carbon) which is partially produced via the mevalonic pathway in chloroplasts and other plastids (therefore its biosynthesis primarily occurs in the leaves). The production of ABA is accentuated by stresses such as water loss and freezing temperatures. The enzyme zeaxanthin epoxidase converts zeaxanthin into antheraxanthin and subsequently into violaxanthin. This enzyme also acts on beta-cryptoxanthin. Zeaxanthin epoxidase plays an important role in resistance to stresses, seed development and dormancy [ ].
Protein Domain
Name: Bacterial ice-nucleation, octamer repeat
Type: Repeat
Description: Certain Gram-negative bacteria express proteins that enable them to promote nucleation of ice at relatively high temperatures (above -5 °C) [ , ]. These proteins are localised at the outer membrane surface and can cause frost damage to many plants. The primary structure of the proteins contains a highly repetitive domain that dominates the sequence. The domain comprises a number of 48-residue repeats, which themselves contain 3 blocks of 16 residues, the first 8 of which are identical. It is thought that the repetitive domain may be responsible for aligning water molecules in the seed crystal.
     [.........48.residues.repeated.domain..........]
    /              / |              | \              \
    AGYGSTxTagxxssli AGYGSTxTagxxsxlt AGYGSTxTaqxxsxlt
    [16.residues...] [16.residues...] [16.residues...]
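Note: the repeat architecture in this entry (48-residue repeats built from three 16-residue blocks, each starting with the same eight residues, drawn as AGYGSTxT) can be searched for with a regular expression. The following is a minimal sketch under stated assumptions: only the first eight positions of each block are constrained, the function names are hypothetical, and the toy sequence is invented for illustration.

```python
import re

# One 16-residue block: the conserved first 8 residues from the schematic
# (x = any residue), followed by 8 unconstrained positions.
BLOCK = re.compile(r"AGYGST.T.{8}")

# One 48-residue repeat: three consecutive blocks.
REPEAT = re.compile(r"(?:AGYGST.T.{8}){3}")


def find_blocks(sequence):
    """Return 0-based start positions of non-overlapping 16-residue blocks."""
    return [m.start() for m in BLOCK.finditer(sequence.upper())]


def count_repeats(sequence):
    """Count non-overlapping 48-residue repeats (three blocks in a row)."""
    return len(REPEAT.findall(sequence.upper()))


# Hypothetical toy sequence: a 2-residue prefix followed by one full repeat.
toy = "MK" + "AGYGSTLTAGADSSLI" + "AGYGSTQTAGEESSLT" + "AGYGSTATAQHDSSLT"
```

Real ice-nucleation proteins tolerate substitutions within the conserved octamer, so an exact-match search like this will undercount repeats in genuine sequences.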
Protein Domain
Name: VQ motif-containing protein 5/9/14
Type: Family
Description: This entry includes a group of VQ motif-containing proteins from plants, including VQ5/9/14 from Arabidopsis. VQ14, also known as HAIKU1, regulates endosperm growth and seed size in Arabidopsis [ ]. In general, Arabidopsis VQPs interacted specifically with the C-terminal WRKY domains of group I and the sole WRKY domains of group IIc WRKY transcription factors [ , ]. Arabidopsis VQPs reported to control stress responses include the calmodulin (CaM)-binding protein CamBP25 and VQ9, which regulate osmotic and salinity tolerance, respectively, the sigma factor binding proteins SIB1 and SIB2, which act as activators of WRKY33 in plant defence, and the negative regulator of the jasmonate defence pathway [].
Protein Domain
Name: Protein DA1-like
Type: Family
Description: This entry includes DA1 and DAR1-7 from Arabidopsis and LIM domain-containing protein HDR3 from rice [ ]. This entry also includes uncharacterised proteins from bacteria.In Arabidopsis thaliana, DA1 is a ubiquitin-activated endopeptidase that limits final seed and organ size by restricting the period of cell proliferation [ ]. DA1 is activated by the RING E3 ligases Big Brother and DA2, both of which are then inactivated by cleavage by the active peptidase. DA1 also cleaves and inactivates deubiquitinase UBP15 and transcription factors TCP15 and TCP22, all of which promote cell proliferation. Presence of an HEXXH motif, which when mutated leads to inactivity, suggests that DA1 is a metalloendopeptidase [].
Protein Domain
Name: Cereal seed allergen/trypsin and alpha-amylase inhibitor, conserved site
Type: Conserved_site
Description: The seeds of cereals contain numerous serine protease and alpha-amylase inhibitors. These inhibitors can be grouped into families based on structural similarities. This domain identifies sequences belonging to the cereal (monocotyledon) trypsin/alpha-amylase inhibitor family [ ]. It includes those annotated solely as seed allergens or alpha-amylase inhibitors []. Many belong to MEROPS inhibitor family I6, clan IJ. Some are known to inhibit trypsin (an S1 peptidase, ) [ ]. For some there is no direct evidence to suggest that they can inhibit trypsin or any other serine peptidase. Studies on the alpha-amylase inhibitor from Secale cereale (Rye) have demonstrated no activity against trypsin, and illustrate the necessity of exercising caution in assigning function based on sequence comparisons [ ]. The cereal trypsin/alpha-amylase inhibitor family consists of proteins of about 120 amino acids which contain 10 cysteine residues, all of which are involved in disulphide bonds [ ]. The schematic representation of the structure of these inhibitors is shown below:
                         +----------------------------+
             +----------+|   +-+                      |
             |          ||   | |                      |
    xxCxxxxxxCxxxCxxxxxxCCxxxCxCxxxxxxxxxxxxxCxxxxxxxxCxxxxxxxCxxxx
      |          |                           |                |
      |          +---------------------------+                |
      +-------------------------------------------------------+
'C': conserved cysteine involved in a disulphide bond.
The 3D structure of the bifunctional alpha-amylase/trypsin inhibitor (RBI) from seeds of Eleusine coracana (Indian finger millet) has been determined in solution using multidimensional 1H and 15N NMR spectroscopy [ ]. The inhibitor forms a globular 4-helix motif with a simple 'up-and-down' topology, and includes a short anti-parallel β-sheet [].
Protein Domain
Name: Glycosyl hydrolases 36
Type: Family
Description: This family consists of several galactinol-sucrose galactosyltransferase proteins, also known as raffinose synthases. Raffinose is a widespread oligosaccharide in plant seeds and other tissues. Raffinose synthase ( ) is the key enzyme that channels sucrose into the raffinose oligosaccharide pathway [ ]. Raffinose family oligosaccharides (RFOs) are ubiquitous in plant seeds and are thought to play critical roles in the acquisition of tolerance to desiccation and seed longevity. Raffinose synthases are alkaline alpha-galactosidases and are solely responsible for RFO breakdown in germinating maize seeds, whereas acidic galactosidases appear to have other functions []. Glycoside hydrolase family 36 can be split into 11 families, GH36A to GH36K []. This family includes enzymes from GH36C.
Protein Domain
Name: Translation elongation factor, selenocysteine-specific
Type: Family
Description: In prokaryotes, the incorporation of selenocysteine as the 21st amino acid, encoded by TGA, requires several elements: SelC is the tRNA itself, SelD acts as a donor of reduced selenium, SelA modifies a serine residue on SelC into selenocysteine, and SelB is a selenocysteine-specific translation elongation factor. 3' or 5' non-coding elements of mRNA have been found as probable structures for directing selenocysteine incorporation [ ].This family describes the elongation factor SelB, a close homologue of EF-Tu. It may function by replacing EF-Tu. A C-terminal domain not found in EF-Tu is in all SelB sequences in the seed alignment except that from Methanocaldococcus jannaschii (Methanococcus jannaschii). This family should not include an equivalent protein for eukaryotes.
Protein Domain
Name: Anaerobic sulphatase maturase, radical SAM
Type: Family
Description: Members of this protein family are radical SAM family enzymes, maturases that prepare the oxygen-sensitive radical required in the active site of anaerobic sulphatases. This maturase role has led to many misleading legacy annotations suggesting that this enzyme maturase is instead a sulphatase regulatory protein. All members of the seed alignment are radical SAM enzymes encoded next to or near an anaerobic sulphatase. Note that a single genome may encode more than one sulphatase/maturase pair.Proteins in this entry include ChuR from Bacteroides thetaiotaomicron and AnSME (also known as CPF_0616) from Clostridium perfringens. ChuR is involved in 'Ser-type' sulfatase maturation under anaerobic conditions [ ], while AnSME is involved in 'Cys-type' sulfatase maturation under anaerobic conditions [].
Protein Domain
Name: Trihelix transcription factor ASIL1/2-like
Type: Family
Description: This entry represents a group of plant transcription factors, including ASIL1/2, FIP2 and ENAP1/2 from Arabidopsis. ASIL1 functions as a negative regulator of embryonic traits in seedlings and contributes to the maintenance of precise temporal control of seed filling []. ASIL2 is a trihelix transcription factor that acts downstream of miRNAs to repress the maturation program early in embryogenesis []. FIP2 is a FRI interacting protein. FRI up-regulates expression of the floral repressor FLOWERING LOCUS C (FLC) []. ENAP1 associates with chromatin regions linked to ethylene responses, preserving these regions in the absence of ethylene, while in its presence ENAP1 interacts with EIN2. This promotes histone acetylation mainly on H3K14 and H3K23, which positively regulates ethylene-responsive genes [, ].
Protein Domain
Name: Histone-lysine N-methyltransferase SETD1A/B-like, SET domain
Type: Domain
Description: In animals, SETD1A/B are histone methyltransferases that produce mono-, di-, and trimethylated histone H3 at 'Lys-4'. However, if the 'Lys-9' residue is already methylated, 'Lys-4' will not be. The 'Lys-4' methylation is a tag for epigenetic transcriptional activation [ , ]. The animal COMPASS complex is composed of at least the catalytic subunit (SETD1A or SETD1B), WDR5, WDR82, RBBP5, ASH2L/ASH2, CXXC1/CFP1, HCFC1 and DPY30 []. ATXR7, the Arabidopsis homologue to Set1, is required for the expression of the flowering repressors FLC and MADS-box genes of the MAF family [, ]. ATXR7 is also involved in the control of seed dormancy and germination []. This entry represents the SET domain found in SETD1A/B and its homologues.
Protein Domain
Name: Plant bZIP transcription factors
Type: Family
Description: This family is composed of a group of plant bZIP transcription factors with similarity to OsbZIP46, which regulates abscisic acid (ABA) signalling-mediated drought tolerance in rice [ , ]. Plant bZIPs are involved in developmental and physiological processes in response to stimuli/stresses such as light, hormones, and temperature changes. bZIP factors act in networks of homo and heterodimers in the regulation of a diverse set of cellular processes []. This entry also includes ABI5 from Arabidopsis. ABI5 is a transcription factor that participates in ABA-regulated gene expression during seed development and subsequent vegetative stage by acting as the major mediator of ABA repression of growth [ , ]. It is also involved in the sugar signalling response in plants [].
Protein Domain
Name: PIP2/PIPL1
Type: Family
Description: This entry includes PAMP-induced secreted peptide 2 (PIP2) and PAMP-INDUCED PEPTIDE-LIKE 1 (PIPL1, also known as CEP16 or PREPIPL1) from Arabidopsis, which are part of the PIP/PIPL family [ , ]. These secreted proteins contain two conserved core SGPS motifs at the C terminus and the GxGH motif at the extreme C terminus []. The double peptide motif might be processed into two different peptides or, alternatively, may act as a functional unit, being able to interact with distinct binding sites resulting in the activation of different pathways. PIP2 is involved in innate immune and stress responses. It also acts as a negative regulator of root growth []. PIPL1 is involved in seed development and was also induced by stress [].
Protein Domain
Name: RING finger protein Unkempt-like
Type: Family
Description: This entry represents a group of RING finger proteins from animals and plants, including Unkempt and related proteins. Unkempt is an evolutionarily conserved RNA-binding protein that regulates translation of its target genes and is required for the establishment of the early bipolar neuronal morphology. It carries six CCCH zinc fingers (Znfs) forming two compact clusters, Znf1-3 and Znf4-6, that recognise distinct trinucleotide RNA substrates. These clusters recognise an unexpectedly short stretch of RNA sequence, only three consecutive ribonucleotides, with a varying degree of specificity. Znf1-3 binds to the UUA motif of RNA substrates []. Proteins in this entry also include Zinc finger CCCH domain-containing proteins from Arabidopsis. They are involved in regulating stress responses [ ], light-dependent seed germination [] and embryogenesis [].
Protein Domain
Name: Formin-like family, plant
Type: Family
Description: Formins (formin homology proteins) play a crucial role in the reorganisation of the actin cytoskeleton and associate with the fast-growing end (barbed end) of actin filaments [ , ]. This entry represents the formin homologues from plants. Seed plants have two formin clades with numerous paralogues []. They can be classified as class I and class II formins. Class I formins include an N-terminal membrane insertion signal, a predicted extracytoplasmic Pro-rich stretch, a transmembrane region, and C-terminal FH1 and FH2 domains []. Though class II formins usually contain an N-terminal PTEN domain related to the human PTEN protein (implied in pathogenesis of the Parkinson disease) [], the N-termini of type-II plant formins do not contain any recognisable domain that can provide a clue to their biological function.
Protein Domain
Name: Cell division coordinator CpoB, C-terminal
Type: Domain
Description: Members of this protein family are the product of one of seven genes regularly clustered in operons to encode the proteins of the tol-pal system, which is critical for maintaining the integrity of the bacterial outer membrane. The gene for this periplasmic protein has been designated orf2 and ybgF, and was later renamed CpoB (Coordinator of PG synthesis and OM constriction, associated with PBP1B). All members of the seed alignment were from unique tol-pal gene regions from completed bacterial genomes. The architecture of this protein is a signal sequence, a low-complexity region usually rich in Asn and Gln, and a well-conserved region with tandem repeats that resemble the tetratricopeptide (TPR) repeat, involved in protein-protein interaction []. Escherichia CpoB coordinates PBP1B and the Tol machines to maintain cell envelope integrity during division [ ].
Protein Domain
Name: SUA-like, OCRE domain
Type: Domain
Description: SUA is an RNA-binding protein located in the nucleus and expressed in all plant tissues. It functions as a splicing factor that influences seed maturation by controlling alternative splicing of ABI3. The suppression of the cryptic ABI3 intron indicates a role of SUA in mRNA processing. SUA also interacts with the prespliceosomal component U2AF65, the larger subunit of the conserved pre-mRNA splicing factor U2AF. SUA contains two RNA recognition motifs surrounding a zinc finger domain, an OCtamer REpeat (OCRE) domain, and a Gly-rich domain close to the C terminus [ ]. The OCRE domain contains five repeats of an 8-residue motif, which were shown to form β-strands. Based on the architectures of proteins containing OCRE domains, a role in RNA metabolism and/or signalling has been proposed [ ].
Protein Domain
Name: Transcription factor IND-like
Type: Family
Description: This entry represents a group of bHLH transcription factors from plants, including IND, HEC1/2/3 and RHD6 from Arabidopsis. IND is required for seed dispersal [ ], while HEC1/2/3 are required for female reproductive tract development and fertility []. IND interacts with another bHLH transcription factor, SPATULA (SPT), and together they regulate genes involved in modulating auxin transport [ ]. RHD6 is a transcription factor that is specifically required for the development of root hairs. It integrates a jasmonate (JA) signaling pathway that stimulates root hair growth []. This entry also includes LF (LATE FLOWERING) and LAX PANICLE 1 (LAX1) from rice. LF regulates flowering time [], while LAX1 is a transcription factor that may regulate organogenesis in postembryonic development. It is involved in the regulation of shoot branching by controlling axillary meristem initiation [].
Protein Domain
Name: RmlC-like cupin domain superfamily
Type: Homologous_superfamily
Description: RmlC (dTDP (deoxythymidine diphosphate)-4-dehydrorhamnose 3,5-epimerase) is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria [ ]. RmlC is a dimer, each monomer being formed from two β-sheets arranged in a β-sandwich, where the substrate-binding site is located between the two sheets of both monomers. Other protein families contain domains that share this fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities [ ]; auxin-binding protein []; seed storage protein 7S []; acireductone dioxygenase []; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase [ ], phosphomannose isomerase [ ] and homogentisate dioxygenase [ ], the last three sharing a 2-domain fold with storage protein 7S.
Protein Domain
Name: SWEET sugar transporter
Type: Family
Description: This family contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologues are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter [ ]. Homologues of SWEETs have been identified in bacteria []. The founding member of the SWEET family, MtN3, was identified as a nodulin-specific EST in the legume Medicago truncatula [ ]. Another protein in this family may be involved in activation and expression of recombination activation genes (RAGs) []. This family contains a region of two transmembrane helices that is found in two copies in most members of the family.
Protein Domain
Name: Leo1-like protein
Type: Family
Description: In budding yeasts, Leo1 is part of the Paf1 complex, an RNA polymerase II-associated protein complex containing Paf1, Cdc73, Ctr9, Rtf1 and Leo1. Paf1 complex is involved in histone modifications, transcription elongation and other gene expression processes that include transcript site selection [ ]. This entry also includes Leo1 homologues from animals and plants. Human Leo1, also known as RDL, is a component of the human Paf1 complex (Paf1C), which consists of Paf1, Cdc73, Ctr9, Rtf1, Leo1 and Wdr61 (Ski8). As in yeast, the human Paf1C has a central role in co-transcriptional histone modifications [ ]. Human Leo1 promotes senescence of 2BS fibroblasts []. Arabidopsis Paf1C related proteins such as VIP4 (Leo1), VIP5 (Rtf1), ELF7 (Paf1), ELF8 (Ctr9) and ATXR7 (Set1) are required for the induction of seed dormancy. They control both germination and flowering time [ ].
Protein Domain
Name: Plant galacturonosyltransferase GAUT
Type: Family
Description: Galacturonosyltransferase 1 (GAUT1) is an alpha1,4-D-galacturonosyltransferase that transfers galacturonic acid from uridine 5'-diphosphogalacturonic acid onto the pectic polysaccharide homogalacturonan [ ]. The GAUT1-related gene family from Arabidopsis thaliana encodes 15 GAUT and 10 GAUT-like (GATL) proteins []. This entry represents the GAUT proteins. Mutants for GAUT genes indicate that GAUTs are involved in pectin and xylan biosynthesis. Mutants of GAUTs 6, 8, 9, 10, 11, 12, 13, and 14 have aberrant wall composition. They show distinct patterns, suggesting that these GAUTs have at least six unique functions in pectin and/or xylan biosynthesis [ ]. GAUT12 (IRX8) is involved in the synthesis of cell wall polysaccharides; mutants in this gene are deficient in homogalacturonan and glucuronoxylan []. Similarly, GAUT8, also known as QUASIMODO1, affects homogalacturonan and xylan biosynthesis [, ]. GAUT11 is involved in the production of seed testa cell wall and mucilage [].
Protein Domain
Name: RNA polymerase-associated protein Ctr9
Type: Family
Description: This entry includes budding yeast RNA polymerase-associated protein Ctr9 and its homologues from other yeasts, animals and plants. The homologue in fission yeast is known as tetratricopeptide repeat protein 1 (Tpr1) [ ]. Budding yeast Ctr9 is part of the Paf1 complex, an RNA polymerase II-associated protein complex containing Paf1, Cdc73, Ctr9, Rtf1 and Leo1 [ ]. Paf1 complex is involved in histone modifications, transcription elongation and other gene expression processes that include transcript site selection []. Human Paf1 complex (Paf1C) consists of Paf1, Cdc73, Ctr9, Rtf1, Leo1 and Wdr61 (Ski8). As in yeast, the human Paf1C has a central role in co-transcriptional histone modifications []. Arabidopsis Paf1C related proteins such as VIP4 (Leo1), VIP5 (Rtf1), ELF7 (Paf1), ELF8 (Ctr9) and ATXR7 (Set1) are required for the induction of seed dormancy. They control both germination and flowering time [ ].
Protein Domain
Name: Proteinase inhibitor I3, Kunitz legume
Type: Family
Description: The Kunitz-type soybean trypsin inhibitor (STI) family consists mainly of proteinase inhibitors from Leguminosae seeds [ ]. They belong to MEROPS inhibitor family I3, clan IC. They exhibit inhibitory activity against the serine proteinases trypsin (MEROPS peptidase family S1) and subtilisin (MEROPS peptidase family S8), thiol proteinases (MEROPS peptidase family C1) and aspartic proteinases (MEROPS peptidase family A1) [ ]. Inhibitors from cereals are active against subtilisin and endogenous alpha-amylases, while some also inhibit tissue plasminogen activator. The inhibitors are usually specific for either trypsin or chymotrypsin, and some are effective against both. They are thought to protect the seeds against consumption by animal predators, while at the same time existing as seed storage proteins themselves; all the actively inhibitory members contain 2 disulphide bridges. The existence of a member with no inhibitory activity, winged bean albumin 1, suggests that the inhibitors may have evolved from seed storage proteins. Proteins from the Kunitz family contain from 170 to 200 amino acid residues and one or two intra-chain disulphide bonds. The best conserved region is found in their N-terminal section. The crystal structures of soybean trypsin inhibitor (STI), trypsin inhibitor DE-3 from the Kaffir tree Erythrina caffra (ETI) [ ] and the bifunctional proteinase K/alpha-amylase inhibitor from wheat (PK13) have been solved, showing them to share the same 12-stranded β-sheet structure as those of interleukin-1 and heparin-binding growth factors []. The β-sheets are arranged in 3 similar lobes around a central axis, 6 strands forming an anti-parallel β-barrel. Despite the structural similarity, STI shows no interleukin-1 bioactivity, presumably as a result of their primary sequence disparities. The active inhibitory site containing the scissile bond is located in the loop between β-strands 4 and 5 in STI and ETI. The STIs belong to a superfamily that also contains the interleukin-1 proteins, heparin-binding growth factors (HBGF) and histactophilin, all of which have very similar structures, but share no sequence similarity with the STI family.
Protein Domain
Name: RNA polymerase II associated factor Paf1
Type: Family
Description: In budding yeasts, Paf1 is part of the Paf1 complex, an RNA polymerase II-associated protein complex containing Paf1, Cdc73, Ctr9, Rtf1 and Leo1 [ ]. Paf1 complex is involved in histone modifications, transcription elongation and other gene expression processes that include transcript site selection []. This entry also includes Paf1 homologues from animals and plants. Human Paf1, also known as PD2 (pancreatic differentiation 2), is associated with tumorigenesis [ ]. Human Paf1 complex (Paf1C) consists of Paf1, Cdc73, Ctr9, Rtf1, Leo1 and Wdr61 (Ski8). As in yeast, the human Paf1C has a central role in co-transcriptional histone modifications []. Human Paf1 complex has a crucial role in the antiviral response []. Arabidopsis Paf1C related proteins such as VIP4 (Leo1), VIP5 (Rtf1), ELF7 (Paf1), ELF8 (Ctr9) and ATXR7 (Set1) are required for the induction of seed dormancy. They control both germination and flowering time [ ].
Protein Domain
Name: Gnk2-homologous domain
Type: Domain
Description: Ginkbilobin-2 (Gnk2) is an antifungal protein found in the endosperm of Ginkgo seeds, which inhibits the growth of phytopathogenic fungi such as Fusarium oxysporum. Gnk2 has considerable homology (~85%) to embryo-abundant proteins (EAP) from the gymnosperms Picea abies and P. glauca. Plant EAP are expressed in the late stage of seed maturation and are involved in protection against environmental stresses such as drought. The sequence of Gnk2 is also 28-31% identical to the extracellular domain of cysteine-rich receptor-like kinases (CRK) from the angiosperm Arabidopsis. The CRK members are induced by pathogen infection and treatment with reactive oxygen species or salicylic acid and are involved in the hypersensitive reaction, which is a typical system of programmed cell death. In addition, there are at least 60 genes in Arabidopsis encoding the cysteine-rich secreted proteins (CRSP) with a Gnk2-homologous domain. Therefore, the proteins with a Gnk2-homologous domain are regarded as one of the largest protein superfamilies, although the role of the conserved Gnk2-homologous domain remains unclear [ , ]. The Gnk2-homologous domain is composed of two α-helices and a five-stranded β-sheet, which forms a compact single-domain architecture with an α+β-fold. It contains a C-X(8)-C-X(2)-C motif. Cysteine residues form three intramolecular disulphide bridges: C1-C5, C2-C3, and C4-C6 [].
Protein Domain
Name: RmlC-like jelly roll fold
Type: Homologous_superfamily
Description: RmlC (deoxythymidine diphosphate-4-dehydrorhamnose 3,5-epimerase) is a mainly beta class protein with a jelly roll-like topology. It is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria [ ]. This entry represents the domain with the jelly roll-like fold. Other protein families containing this domain include glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities [ ]; auxin-binding protein []; seed storage protein 7S []; acireductone dioxygenase []; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase [ ], phosphomannose isomerase [ ] and homogentisate dioxygenase [ ], the last three sharing a 2-domain fold with storage protein 7S. The cAMP-binding domains found in the cAMP receptor protein (CRP) family display a similar β-roll architecture consisting of eight antiparallel β-strands and three helical segments [ ]. These proteins include CooA, a CO-sensing haem protein that functions as a transcription activator [], and the CnbD (cyclic nucleotide binding domain) of the HCN cation channel in which cAMP binding modulates gating of the channel [].
Protein Domain
Name: NET domain
Type: Domain
Description: The bromodomain and extraterminal (BET) proteins are a class of transcriptional regulators whose members can be found in animals, plants and fungi. BET proteins are involved in diverse cellular phenomena such as meiosis, cell-cycle control, and homeosis and have been suggested to modulate chromatin structure and affect transcription via a sequence-independent mechanism. BET proteins are defined as having one (plants) or two (animals/yeast) bromodomains and an Extra Terminal (ET) domain. The ET domain consists of three separate regions, only one of which, the N-terminal ET (NET) domain, is conserved in all BET proteins. The function of the NET domain is assumed to be protein binding [ , , , ]. The structure of the NET domain comprises three α-helices and a characteristic loop region of an irregular but well-defined structure. The NET structure has an acidic patch that forms a continuous ridge with a hydrophobic cleft, which may interact with other proteins and/or DNA [ ]. Some proteins known to contain a NET domain include: human RING3 (now designated Brd2); murine MCAP (now designated Brd4); Drosophila Fsh; yeast Bdf1 and Bdf2; and Arabidopsis imbibition-inducible (IMB1), which plays a role in abscisic acid (ABA) and phytochrome A (phyA) mediated responses of seed germination.
Protein Domain
Name: Tol-Pal system, TolA
Type: Family
Description: Tol proteins are involved in the translocation of group A colicins. Colicins are bacterial protein toxins, which are active against Escherichia coli and other related species. TolA is anchored to the cytoplasmic membrane by a single membrane spanning segment near the N terminus, leaving most of the protein exposed to the periplasm [ ]. TolA couples the inner membrane complex of itself with TolQ and TolR to the outer membrane complex of TolB and OprL (also called Pal). Most of the length of the protein consists of low-complexity sequence that may differ in both length and composition from one species to another, complicating efforts to discriminate TolA (the most divergent gene in the tol-pal system) from paralogs such as TonB. Selection of members of the seed alignment and criteria for setting scoring cut-offs are based largely on conserved operon structure. The Tol-Pal complex is required for maintaining outer membrane integrity, and is also involved in transport (uptake) of colicins and filamentous DNA, and implicated in pathogenesis. Transport is energized by the proton motive force. TolA is an inner membrane protein that interacts with periplasmic TolB and with outer membrane porins OmpC, PhoE and LamB.
Protein Domain
Name: Gnk2-homologous domain superfamily
Type: Homologous_superfamily
Description: Ginkbilobin-2 (Gnk2) is an antifungal protein found in the endosperm of Ginkgo seeds, which inhibits the growth of phytopathogenic fungi such as Fusarium oxysporum. Gnk2 has considerable homology (~85%) to embryo-abundant proteins (EAP) from the gymnosperms Picea abies and P. glauca. Plant EAP are expressed in the late stage of seed maturation and are involved in protection against environmental stresses such as drought. The sequence of Gnk2 is also 28-31% identical to the extracellular domain of cysteine-rich receptor-like kinases (CRK) from the angiosperm Arabidopsis. The CRK members are induced by pathogen infection and treatment with reactive oxygen species or salicylic acid and are involved in the hypersensitive reaction, which is a typical system of programmed cell death. In addition, there are at least 60 genes in Arabidopsis encoding the cysteine-rich secreted proteins (CRSP) with a Gnk2-homologous domain. Therefore, the proteins with a Gnk2-homologous domain are regarded as one of the largest protein superfamilies, although the role of the conserved Gnk2-homologous domain remains unclear [, ]. The Gnk2-homologous domain is composed of two α-helices and a five-stranded β-sheet, which forms a compact single-domain architecture with an α+β-fold. It contains a C-X(8)-C-X(2)-C motif. Cysteine residues form three intramolecular disulphide bridges: C1-C5, C2-C3, and C4-C6 [ ].
Protein Domain
Name: Histone-lysine N-methyltransferase Set1-like
Type: Family
Description: The COMPASS complex (complex proteins associated with Set1) is conserved in yeasts and in other eukaryotes up to humans. This entry represents Set1 and its homologues. Set1 is a methyltransferase and the catalytic component of COMPASS that produces trimethylated histone H3 at Lys(4). The yeast COMPASS (Set1C) complex specifically mono-, di- and trimethylates histone H3 to form H3K4me1/2/3, which subsequently plays a role in telomere length maintenance and transcription elongation regulation [ , , ]. In yeasts, the Set1C complex consists of Set1(2), Bre2(2), Spp1(2), Sdc1(1), Shg1(1), Swd1(1), Swd2(1), and Swd3(1) [, , , ]. In animals, SETD1A/B are histone methyltransferases that produce mono-, di-, and trimethylated histone H3 at 'Lys-4'. However, if the 'Lys-9' residue is already methylated, 'Lys-4' will not be. The 'Lys-4' methylation is a tag for epigenetic transcriptional activation [ , ]. The animal COMPASS complex is composed of at least the catalytic subunit (SETD1A or SETD1B), WDR5, WDR82, RBBP5, ASH2L/ASH2, CXXC1/CFP1, HCFC1 and DPY30 []. ATXR7, the Arabidopsis homologue to Set1, is required for the expression of the flowering repressors FLC and MADS-box genes of the MAF family [, ]. ATXR7 is also involved in the control of seed dormancy and germination [].
Protein Domain
Name: Sirohaem synthase, N-terminal
Type: Domain
Description: Bacterial sulphur metabolism depends on the iron-containing porphinoid sirohaem. CysG is a multi-functional enzyme with S-adenosyl-L-methionine (SAM)-dependent bismethyltransferase, dehydrogenase and ferrochelatase activities. CysG synthesizes sirohaem from uroporphyrinogen III via reactions which encompass two branchpoint intermediates in tetrapyrrole biosynthesis, diverting flux first from protoporphyrin IX biosynthesis and then from cobalamin (vitamin B12) biosynthesis. CysG is a dimer. Its dimerisation region is 74 residues long, and holds the two structurally similar protomers together asymmetrically through a number of salt-bridges across complementary residues within the dimerisation region [ ]. CysG dimerisation produces a series of active sites, accounting for CysG's multi-functionality, catalysing four diverse reactions: two SAM-dependent methylations, NAD+-dependent tetrapyrrole dehydrogenation, and metal chelation. This group represents a subfamily of CysG N-terminal region-related sequences. All sequences in the seed alignment for this model are N-terminal regions of known or predicted sirohaem synthases. The C-terminal region of each is uroporphyrin-III C-methyltransferase, which catalyses the first step committed to the biosynthesis of either sirohaem or cobalamin (vitamin B12) rather than protohaem (haem). Functionally these sequences complete the process of oxidation and iron insertion to yield sirohaem. Sirohaem is a cofactor for nitrite and sulphite reductases, so sirohaem synthase is CysG of cysteine biosynthesis in some organisms.
Protein Domain
Name: Lipocalin Blc-like
Type: Domain
Description: This entry represents the lipocalin/cytosolic fatty-acid binding domain of Blc and similar proteins predominantly found in bacteria and plants. Escherichia coli bacterial lipocalin (Blc, also known as YjeL) is an outer membrane lipoprotein involved in the storage or transport of lipids necessary for membrane maintenance under stressful conditions. Blc has a binding preference for lysophospholipids [ ]. This entry also includes eukaryotic lipocalins such as Arabidopsis thaliana temperature-induced lipocalin-1 (TIL), which is involved in thermotolerance and in oxidative, salt, drought and high light stress tolerance, and is needed for seed longevity by ensuring polyunsaturated lipid integrity [, , , , , ]. These proteins have a large β-barrel ligand-binding cavity. Lipocalins are mainly low molecular weight extracellular proteins that bind principally small hydrophobic ligands, and form covalent or non-covalent complexes with soluble macromolecules, as well as membrane-bound receptors. They participate in processes such as ligand transport, modulation of cell growth and metabolism, regulation of immune response, smell reception, tissue development and animal behavior. Cytosolic fatty-acid binding proteins also bind hydrophobic ligands in a non-covalent, reversible manner, and have been implicated in intracellular uptake, transport and storage of hydrophobic ligands, regulation of lipid metabolism and sequestration of excess toxic fatty acids, as well as in signaling, gene expression, inflammation, cell growth and proliferation, and cancer development.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom