Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported, e.g. dros* for partial matches, or fly AND NOT embryo to exclude a term.
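The query forms listed above can be composed programmatically. The following is only a minimal sketch: the helper names (phrase, either, partial, excluding) are hypothetical and not part of this website, and the sketch only builds query strings in the syntax shown in the examples; it does not submit them anywhere.

```python
def phrase(text: str) -> str:
    """Quote a multi-word phrase so it is searched as a unit."""
    return f'"{text}"'


def either(*terms: str) -> str:
    """Join terms with OR to match any one of them."""
    return " OR ".join(terms)


def partial(stem: str) -> str:
    """Append * to a stem for a partial (prefix) match."""
    return f"{stem}*"


def excluding(wanted: str, unwanted: str) -> str:
    """Require one term while excluding another."""
    return f"{wanted} AND NOT {unwanted}"


if __name__ == "__main__":
    # Rebuild the example queries given in the list above.
    for query in (
        either("fly", "drosophila"),
        phrase("dna binding"),
        partial("dros"),
        excluding("fly", "embryo"),
    ):
        print(query)
```

Each helper returns plain text that can be pasted into the search box.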

Search results 3801 to 3900 out of 30763 for seed protein

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Stomatal closure-related actin-binding protein
Type: Family
Description: Stomatal closure-related actin-binding (SCAB) proteins bind, bundle and stabilise actin filaments and regulate stomatal movement. Homologues are known only from plants.
Protein Domain
Name: Leucine-rich repeat-containing protein 42
Type: Family
Description: The function of LRRC42 is not clear. It is significantly upregulated in the majority of lung cancers.
Protein Domain
Name: Coiled-coil domain-containing protein 174-like
Type: Family
Description: This entry includes CCDC174 and related proteins from animals, fungi and plants. In humans, CCDC174 may be involved in neuronal development.
Protein Domain
Name: Conjugative transposon protein TraL
Type: Family
Description: This entry represents a family that contains conjugal transfer protein TraL.
Protein Domain
Name: WD repeat-containing protein 11
Type: Family
Description: WD repeat-containing protein 11 (WDR11) is part of the WDR11 complex (consisting of WDR11, C17orf75 and FAM91A) that facilitates the tethering of AP-1-derived vesicles. Mutations of the WDR11 gene have been linked to congenital hypogonadotropic hypogonadism (CHH) and Kallmann syndrome (KS), human developmental genetic disorders defined by delayed puberty and infertility. Homologues are known from animals and plants.
Protein Domain
Name: Polyomavirus coat protein VP2
Type: Family
Description: This family includes the VP2 and VP3 internal coat proteins from Polyomaviruses, which are small dsDNA tumour viruses. Their capsids contain 360 copies of the VP1 protein arranged in 72 pentamers. This capsid encloses the internal proteins VP2 and VP3, as well as the viral DNA. A single copy of VP2 or VP3 associates with each VP1 pentamer. A crystal structure shows that the C-terminal region of the VP2/VP3 protein interacts with the VP1 pentamer.
Protein Domain
Name: Sterol transport protein NPC2-like
Type: Family
Description: Proteins in this family contain an MD-2-related lipid-recognition (ML) domain, which is implicated in lipid recognition, particularly in the recognition of pathogen-related products. It has an immunoglobulin-like β-sandwich fold similar to that of E-set Ig domains. This domain is present in proteins from plants, animals and fungi, including the following proteins:
  • NPC intracellular cholesterol transporter 2 (NPC2), which is known to bind cholesterol. Niemann-Pick disease type C2 is a fatal hereditary disease characterised by accumulation of low-density lipoprotein-derived cholesterol in lysosomes.
  • Phosphatidylglycerol/phosphatidylinositol transfer protein (Npc2) from yeasts, which catalyses the intermembrane transfer of phosphatidylglycerol and phosphatidylinositol.
  • House-dust mite allergen proteins such as Der f 2 from Dermatophagoides farinae and Der p 2 from Dermatophagoides pteronyssinus.
Protein Domain
Name: Fur-regulated basic protein A
Type: Family
Description: This family of proteins is regulated by the ferric uptake regulator protein Fur. This family does not regulate the lutABC operon encoding iron-sulfur-containing enzymes necessary for growth on lactate.
Protein Domain
Name: Major spike protein G
Type: Family
Description: This is a family of proteins from single-stranded DNA bacteriophages. The G protein is a major spike protein involved in attachment to the bacterial host cell. The virion is composed of sixty copies of each of the F, G and J proteins, and 12 copies of the H protein. There are twelve spikes formed by five G proteins, each a tight beta barrel, and one H protein.
Protein Domain
Name: Salmonella/Shigella invasion protein E
Type: Family
Description: Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. There have been four secretion systems described in animal enteropathogens, such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. The Salmonella/Shigella invasion protein E gene (InvE) is one such type III secretion protein subunit, and is localised to the outer membrane of the SPI I pathogenicity island, and is involved in the surface presentation.
Protein Domain
Name: Uncharacterised protein family UPF0102
Type: Family
Description: The proteins in this entry are functionally uncharacterised.
Protein Domain
Name: Protein KASH5, EF-hand-like domain
Type: Domain
Description: This entry represents the EF-hand-like domain found in protein KASH5. KASH5 is a component of the LINC (LInker of Nucleoskeleton and Cytoskeleton) complex, involved in the connection between the nuclear lamina and the cytoskeleton. It interacts (via the last 22 amino acids) with SUN1; this interaction mediates telomere localisation. Proteins containing this domain also include lrmp from zebrafish. It is a maternally expressed membrane and cytoskeletal linker protein, which is essential for attachment of the centrosome to the male pronucleus.
Protein Domain
Name: Gram-negative bacterial TonB protein
Type: Family
Description: TonB-dependent transporters (TBDT) are bacterial outer membrane (OM) proteins that bind and transport ferric chelates called siderophores. While iron complexes constitute the majority of substrates for TBDTs, others, like vitamin B12, are also transported by this mechanism. These transporters show high affinity and specificity for siderophores and require energy derived from the proton motive force across the inner membrane to transport them. The energy force is provided through interaction with an inner membrane protein complex consisting of TonB, ExbB, and ExbD. The source of this energy is the ion electrochemical gradient of the cytoplasmic membrane, harvested by heteromultimeric complexes of ExbB and ExbD proteins, and transduced to the OM high affinity siderophore transporters by the protein TonB. TonB is composed of three domains. The N-terminal transmembrane helix anchors the protein to the inner membrane and makes contact with ExbB and ExbD to form an energy transducing complex. The C-terminal globular domain directly contacts the transporters in the OM. These two domains are separated by a flexible, unstructured proline-rich domain that resides within the periplasm. Escherichia coli has only one TonB protein which is shared by different TBDTs involved in the acquisition of various substrates, but most bacteria have more than one tonB gene.
Protein Domain
Name: Serine/threonine protein kinase, yersinia-type
Type: Family
Description: Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. There have been four secretion systems described in animal enteropathogens, such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself, type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. Exotoxins secreted by the type III system do not possess a secretion signal, and are considered unique for this reason. Yersinia spp. secrete a serine/threonine kinase, YpkA, that causes autophosphorylation of host cell components, although the exact targets are unknown at present. It has also been suggested that the YpkA protein is involved in interference of signal transduction in the target cell.
Protein Domain
Name: Calcium binding protein SSO6904
Type: Domain
Description: This domain, SSO6904, is present in Sulfolobus solfataricus. SSO6904 is a calcium-binding protein thought to have a weak affinity for other cations such as Mg2+ and Zn2+. The structure of SSO6904 is similar to that of saposin-fold proteins. Saposin proteins are membrane-interacting glycoproteins required for the hydrolysis of certain sphingolipids by specific lysosomal hydrolases.
Protein Domain
Name: 60S ribosomal protein L19
Type: Family
Description: This entry represents ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species.
Protein Domain
Name: Ras GTPase-activating protein-binding protein
Type: Family
Description: This entry represents proteins with an N-terminal NTF2 (nuclear transport factor 2) domain and a C-terminal RRM (RNA recognition motif) domain. It includes mammalian ATP- and magnesium-dependent helicase Ras GTPase-activating protein-binding proteins 1 and 2 and UBP3-associated protein Bre5 from yeast, which is a co-factor required for de-ubiquitination. This entry also includes AtMBD6 from Arabidopsis. AtMBD6 maintains gene silencing in Arabidopsis by interacting with RNA-binding proteins.
Protein Domain
Name: Ribosomal protein S19e, archaeal
Type: Family
Description: This entry represents the archaeal ribosomal protein S19e. It may be involved in maturation of the 30S ribosomal subunit. Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure.
While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.
Protein Domain
Name: SERTA domain-containing protein 3
Type: Family
Description: SERTAD3, also known as RBT1, is a transcriptional co-activator that binds the second subunit of replication protein A.
Protein Domain
Name: F-box only protein 4
Type: Family
Description: F-box only protein 4 (FBXO4) is a substrate-specific adaptor of the SCF (SKP1-CUL1-F-box protein) E3 ubiquitin-protein ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins.
Protein Domain
Name: TBCC domain-containing protein 1
Type: Family
Description: TBCC domain-containing 1 (TBCCD1) has a role in internal cell organization in animals, Chlamydomonas reinhardtii, and trypanosomes. In humans, it is required for centrosome and Golgi apparatus positioning.
Protein Domain
Name: Conserved hypothetical protein CHP04338
Type: Family
Description: This protein family is restricted to the Actinomycetales, including Mycobacterium, Rhodococcus, Nocardia, Gordonia, and others. The invariant motif HEXXH, at the core of the best conserved region in the protein, suggests metallohydrolase activity, as does local sequence similarity in this region to other metallohydrolases.
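As an illustration of what locating the HEXXH motif in a sequence could look like, here is a minimal sketch. The function name and the toy sequence are made up for this example and do not come from any member of this family; X is taken to mean any single residue.

```python
import re

# HEXXH: His, Glu, any two residues, His.
# A lookahead is used so that overlapping occurrences (which can share
# a histidine) are all reported.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence: str) -> list[int]:
    """Return 0-based start positions of every HEXXH occurrence."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MKHEAAHEAAHG"  # hypothetical sequence, for illustration only
    print(find_hexxh(toy))
```

A hit on its own only flags a candidate site; as the description notes, the inference of metallohydrolase activity also rests on wider sequence similarity in the region.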
Protein Domain
Name: Aromatic cluster surface protein
Type: Family
Description: Members of this family are absolutely restricted to the Mollicutes (Mycoplasma and Ureaplasma). All have a signal peptide, usually of the lipoprotein type, suggesting surface expression. Most members have lengths of about 280 residues but some members have a nearly full-length duplication. The most nearly invariant residue, a Trp, is part of a strongly conserved 9-residue motif, [ND]-W-[LY]-[WF]-X-[LF]-X-N-[LI], where X usually is hydrophobic. Because the hydrophobic six-residue core of this motif almost always contains three to four aromatic residues, we name this family aromatic cluster surface protein. Multiple paralogs may occur in a given Mycoplasma, usually clustered on the genome.
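The 9-residue motif above is written in a dash-separated pattern notation that maps directly onto a regular expression. The sketch below shows one way to do that conversion; the function names and the toy sequence are hypothetical, and X is treated as "any residue" even though the description says it is usually hydrophobic.

```python
import re

MOTIF = "[ND]-W-[LY]-[WF]-X-[LF]-X-N-[LI]"


def motif_to_regex(motif: str) -> re.Pattern:
    """Convert a dash-separated motif into a compiled regular expression.

    Bracketed sets such as [ND] and single letters are already valid
    regex fragments; X (any residue) becomes the wildcard '.'.
    """
    parts = motif.split("-")
    return re.compile("".join("." if part == "X" else part for part in parts))


def find_motif(sequence: str, motif: str = MOTIF) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each motif occurrence."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MKNWLWALVNLK"  # made-up sequence, for illustration only
    print(find_motif(toy))
```

Keeping the motif as a string and converting it at run time means the same helper can be reused for other patterns written in this notation.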
Protein Domain
Name: Arginine vasopressin-induced protein 1
Type: Family
Description: Arginine vasopressin-induced protein 1, also known as VIP32, may be involved in MAP kinase activation, epithelial sodium channel (ENaC) down-regulation and cell cycling.
Protein Domain
Name: Potyviral polyprotein protein 3
Type: Domain
Description: This is the P3 protein section of the Potyviridae polyproteins. The function is not known except that the protein is essential to viral survival.
Protein Domain
Name: Surface protein repeat SSSPR-51
Type: Repeat
Description: This repeat domain is designated SSSPR-51, Streptococcal and Staphylococcal Surface Protein Repeat of size 51. Up to twelve tandem repeats can occur, on some of the longest proteins of their respective species. Nearly all member proteins carry the C-terminal sortase target sequence, LPXTG. The repeat structure and probable surface location suggest a possible adhesion function. A protein with this class of repeats may have other classes as well.
Protein Domain
Name: Developmental pluripotency-associated protein 2/4
Type: Family
Description: Developmental pluripotency-associated protein 2 (Dppa2, also known as ECAT15-2) and Dppa4 (also known as ECAT15-1) have a common DNA-binding domain known as the SAP motif. Dppa2 has been found to bind to the regulatory region of Nkx2-5 in embryonic stem (ES) cells. Dppa4 has been found to bind both DNA and histone H3, which is necessary for resistance of the chromatin structure to MNase and for the proper localization of Dppa4 in ES cell nuclei.
Protein Domain
Name: Growth arrest-specific protein 1
Type: Family
Description: Growth arrest-specific protein 1 (GAS1) is a glycosylphosphatidylinositol-anchored protein involved in growth suppression. It inhibits cell proliferation when overexpressed in normal and transformed cell lines, and reduces tumour cell growth. It also promotes apoptosis, plays a role in mouse embryonic development, and suppresses melanoma metastases.
Protein Domain
Name: F-box only protein 34/46
Type: Family
Description: The functions of F-box only proteins 34 and 46 are not known. Homologues are found in vertebrates.
Protein Domain
Name: HMG domain-containing protein 3
Type: Family
Description: HMGXB3 (HMG-box containing 3) belongs to the High Mobility Group superfamily, and participates in a range of cellular processes including cell migration and proliferation.
Protein Domain
Name: Uncharacterised protein family UPF0060
Type: Family
Description: This entry describes a family of integral membrane proteins. Some members of this family have been proposed to function as a thallium-specific efflux pump.
Protein Domain
Name: Ty transposon capsid protein
Type: Family
Description: Ty are yeast transposons. A 5.7 kb transcript codes for p3, a fusion protein of TYA and TYB. The TYA protein is analogous to the gag protein of retroviruses. TYA is cleaved to form a 46 kDa protein, which can form mature virion-like particles. This entry corresponds to the capsid protein from Ty1 and Ty2 transposons. Yeast retrotransposon Ty1 produces its proteins as precursors that are subsequently cleaved by an aspartic protease encoded by the element. Cleavage of the gag and gag-Pol polyprotein precursors is a critical step in proliferation of retroviruses and retroelements. These cleavage events are essential for transposition as they release the active reverse transcriptase and integrase and they modify the structure of the virus-like particles in a way that is analogous to the morphological changes that occur during retrovirus core maturation.
Protein Domain
Name: Paramyxoviridae nonstructural protein C
Type: Family
Description: This family consists of the polymerase accessory protein C from members of the Paramyxoviridae.
Protein Domain
Name: Uncharacterised conserved protein UCP016642
Type: Family
Description: The function of this family of uncharacterised proteins is not known.
Protein Domain
Name: Testis expressed protein 56
Type: Family
Description: This family of proteins includes Testis expressed protein 56 (Tex56), whose function is not yet clear. This family of proteins is found in mammals.
Protein Domain
Name: Calcium uptake protein 1/2/3
Type: Family
Description: This entry represents a group of calcium uptake proteins, including MICU1/2/3 from humans. They contain the conserved EF-hand Ca2+-binding domains. MICU1 and MICU2 are the main regulators of the mitochondrial Ca2+ uniporter (MCU). MICU2 forms a heterodimer with MICU1 to modulate MCU channel activity. MICU3 is a tissue-specific enhancer of mitochondrial calcium uptake.
Protein Domain
Name: Sperm acrosome-associated protein 9
Type: Family
Description: This family of proteins found in eukaryotes represents sperm acrosome-associated protein 9 (SPACA9, previously known as C9orf9 or MAST). Sperm acrosome-associated protein 9 has been suggested to form a complex with calcium-binding proteins calreticulin and caldendrin localized to the acrosome. Despite this, no known protein interaction motifs have been identified in MAST.
Protein Domain
Name: Ubiquitin domain-containing protein 1/2
Type: Family
Description: This entry represents a group of ubiquitin domain-containing proteins, including UBTD1 and UBTD2. UBTD1 regulates cellular senescence through a positive feedback loop with TP53. UBTD1 and UBE2D (E2 ubiquitin conjugating enzyme) have been shown to form a stable, stoichiometric complex. UBTD2, also known as DC-UbP, interacts with deubiquitinating enzyme USP5 and the Ub-activating enzyme UbE1.
Protein Domain
Name: RNA-binding protein B2 superfamily
Type: Homologous_superfamily
Description: Protein B2 binds double-stranded RNA (dsRNA) with high affinity and suppresses the host RNA silencing-based antiviral response. B2 is expressed by the insect Flock House virus (FHV) as a counter-defense mechanism against antiviral RNA silencing during infection. In vitro, B2 binds to dsRNA as a dimer and inhibits its cleavage by Dicer. B2 blocks cleavage of the FHV genome by Dicer and also the incorporation of FHV small interfering RNAs into the RNA-induced silencing complex.
Protein Domain
Name: Uncharacterised protein family UPF0728
Type: Family
Description: This family of proteins is functionally uncharacterised. This family of proteins is found in metazoa. There is a conserved GPY sequence motif.
Protein Domain
Name: Centromere-binding protein ParB, C-terminal
Type: Domain
Description: This entry represents the C-terminal DNA-binding domain found in centromere-binding protein ParB, which is required for stable segregation. The C-terminal domain has a ribbon-helix-helix (RHH) motif with a C-terminal loop (residues 119-128) following helix alpha-2. The domain forms a dimer with the C-terminal of the beta chain.
Protein Domain
Name: Testis-expressed sequence 35 protein
Type: Family
Description: This family of proteins is found in eukaryotes. The mouse and human family members are specifically expressed in the testis.
Protein Domain
Name: Cullin-associated NEDD8-dissociated protein 1/2
Type: Family
Description: This entry includes cullin-associated NEDD8-dissociated proteins 1 (CAND1, also known as TIP120A) and 2 (CAND2); these proteins have a C-terminal TATA-binding protein interacting (TIP20) domain. CAND1 is required for the assembly of the SCF E3 ubiquitin ligase complex. The SCF ubiquitin E3 ligase consists of SKP1, CUL1 and F-box protein, and it regulates ubiquitin-dependent proteolysis. CAND1 binds to CUL1, preventing it from associating with the other components that form the ligase. Neddylation of CUL1 (or the presence of SKP1 and ATP) dissociates it from CAND1, allowing the ligase complex to form. CAND1 also interacts with CUL3, a component of the Cul3-dependent E3 ubiquitin ligase complex. CAND1 has been proposed to be an F-box protein exchange factor: as substrates of the ligase complex are degraded by the proteasome and depleted, the ligase complex enters an intermediate, deneddylated state in which CAND1 can bind, promoting dissociation of the substrate-recognition subunit and recruitment of a new substrate-recognition subunit. CAND2 is uncharacterized but is assumed to have similar roles to CAND1.
Protein Domain
Name: Secretory component protein Psh3/Shr3
Type: Family
Description: Proteins in this family are membrane-localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs). Shr3 prevents AAPs from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER.
Protein Domain
Name: RING finger protein 37
Type: Family
Description: Proteins in this entry contain the U-box domain, which has been suggested to be a modified RING finger motif where the metal-coordinating cysteines and histidines have been replaced with other amino acids. RING finger protein 37 (RNF37, also known as Ubox5) belongs to the U-box protein family, whose members are ubiquitin-protein ligases.
Protein Domain
Name: KxDL motif-containing protein 1-like
Type: Family
Description: KXD1 is part of the BORC complex that may regulate lysosome positioning. This entry also includes CG10681 from fruit flies and KxDL motif-containing protein LO9-177 from rice. LO9-177 contributes to the promotion of leaf inclination and grain size by modulating cell elongation.
Protein Domain
Name: Nucleolar pre-ribosomal-associated protein 1
Type: Family
Description: Nucleolar pre-ribosomal-associated protein 1 (Npa1) is required for ribosome biogenesis and operates in the same functional environment as Rsa3p and Dbp6p during early maturation of 60S ribosomal subunits. The protein partners of Npa1p include eight putative helicases as well as the novel Npa2p factor. Npa1p can also associate with a subset of H/ACA and C/D small nucleolar RNPs (snoRNPs) involved in the chemical modification of residues in the vicinity of the peptidyl transferase centre. The protein has also been referred to as Urb1.
Protein Domain
Name: Quorum-sensing regulator protein G
Type: Family
Description: Quorum-sensing regulator protein G (QseG, also known as YfhG) is involved in the regulation of virulence and metabolism in enterohemorrhagic Escherichia coli (EHEC). It is required for pedestal formation in host epithelial cells during infection and for translocation of effector molecules into host epithelial cells.
Protein Domain
Name: Vanadium-binding protein 2 superfamily
Type: Homologous_superfamily
Description: The vanadium-binding protein, Vanabin2, contains four α-helices connected by nine disulphide bonds. Vanadium accumulates in ascidians; however, the biological reason remains unclear.
Protein Domain
Name: Coiled-coil domain-containing protein 34/181
Type: Domain
Description: This domain can be found in animal CCDC34 and CCDC181 proteins. CCDC34 promotes cell proliferation and invasive properties in human cancer. CCDC181 is a microtubule-binding protein that may play a role in mediating ciliary motility.
Protein Domain
Name: Centrosomal protein of 120kDa-like
Type: Family
Description: CEP120 is a centrosomal protein required for the recruitment of CEP295 to the proximal end of new-born centrioles at the centriolar microtubule wall during early S phase in a PLK4-dependent manner. It has been shown to interact with SPICE1 and CPAP. This entry also includes CEP120-like proteins from plants and fungi, which do not have centrosomes. Mutations of the CEP120 gene cause short-rib thoracic dysplasia 13 with or without polydactyly (SRTD13) and Joubert syndrome 31 (JBTS31).
Protein Domain
Name: Ribosomal protein L35 superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome. L35 is a basic protein of 60 to 70 amino-acid residues from the large subunit. Like many basic polypeptides, L35 completely inhibits ornithine decarboxylase when present unbound in the cell, but the inhibitory function is abolished upon its incorporation into ribosomes. It belongs to a family of ribosomal proteins, including L35 from bacteria, plant chloroplasts, red algae chloroplasts and cyanelles. In plants it is a nuclear encoded gene product, which suggests a chloroplast-to-nucleus relocation during the evolution of higher plants. The core structure of L35 has an α-β(3)-α fold arranged in two layers (α/β).
Protein Domain
Name: Coiled-coil domain-containing protein 82
Type: Domain
Description: This short domain is found in Ccdc82 and homologous sequences. Its function is not known.
Protein Domain
Name: Zinc finger protein 414
Type: Family
Description: The function of zinc finger protein 414 (ZN414) is not known. ZN414 contains a C2H2-type zinc ribbon. Homologues are known only from vertebrates.
Protein Domain
Name: Spc7 kinetochore protein domain
Type: Domain
Description: This domain is found in cell division proteins which are required for kinetochore-spindle association. Proteins containing this domain include budding yeast Spc105 and fission yeast Spc7. Spc7 is a component of the NMS (Ndc80-MIND-Spc7) super complex, which has a role in kinetochore function during late meiotic prophase and throughout the mitotic cell cycle. Spc105 and Kre28 form a kinetochore complex, which is required for kinetochore binding by a discrete subset of kMAPs (BIM1, BIK1 and SLK19) and motors (CIN8, KAR3).
Protein Domain
Name: Coiled-coil domain-containing protein R3HC1/R3HCL
Type: Family
Description: The functions of R3HC1 and R3HCL are not known.
Protein Domain
Name: Coiled-coil domain-containing protein 33
Type: Family
Description: The function of coiled-coil domain-containing protein 33 (CCD33) is not known.
Protein Domain
Name: IQ domain-containing protein F
Type: Family
Description: This entry includes IQ domain-containing proteins F1, F2, F3, F5, and F6 from vertebrates. In mice, IQCF1 is an acrosomal protein involved in sperm capacitation and the acrosome reaction.
Protein Domain
Name: Coiled-coil domain-containing protein 78
Type: Family
Description: CCDC78 is a component of the deuterosome, a structure that promotes de novo centriole amplification in multiciliated cells that can generate more than 100 centrioles. CCDC78 does not have the kinesin-motor domain. Mutations in CCDC78 genes cause centronuclear myopathy 4 (CNM4), which is a congenital muscle disorder characterised by progressive muscular weakness and wasting involving mainly limb girdle, trunk, and neck muscles. It may also affect distal muscles.
Protein Domain
Name: F-actin-capping protein subunit alpha/beta
Type: Homologous_superfamily
Description: The F-actin capping protein binds in a calcium-independent manner to the fast-growing ends of actin filaments (barbed ends), thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin, this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins. This entry represents the alpha and beta subunits of the F-actin-capping protein. The alpha subunit (CAPZA) is a protein of about 268 to 286 amino acid residues and the beta subunit (CAPZB) is about 280 amino acid residues. Their sequences are well conserved in eukaryotic species. In Drosophila, mutations in the alpha and beta subunits cause actin accumulation and subsequent retinal degeneration. In humans, CAPZA and CAPZB are part of the WASH complex that controls the fission of endosomes.
Protein Domain
Name: Elongator complex protein 2
Type: Family
Description: Elongator complex protein 2 (Elp2) is a component of the RNA polymerase II elongator complex, which is a major histone acetyltransferase component of the RNA polymerase II (RNAPII) holoenzyme. The eukaryotic elongator complex has been associated with many cellular activities, including transcriptional elongation [ , ], but its main function is tRNA modification [, ]. It is required for the formation of 5-methoxy-carbonylmethyl (mcm5) and 5-carbamoylmethyl (ncm5) groups on uridine nucleosides present at the wobble position of many tRNAs [, ].
Protein Domain
Name: Coiled-coil domain-containing protein 28
Type: Family
Description: This entry includes CCDC28A and CCDC28B. CCDC28B modulates mTORC2 complex assembly and function, and possibly enhances AKT1 phosphorylation [ ].
Protein Domain
Name: Stress response protein NST1
Type: Family
Description: NST1 is a family of proteins that seem to be involved, directly or indirectly, in the salt sensitivity of some cellular functions in yeast. It does this without affecting sodium accumulation. It negatively affects salt-tolerance through an interaction with the splicing factor Msl1p. This interaction stresses the importance of efficient RNA processing under salt stress conditions [ ].
Protein Domain
Name: Recombination protein O, RecO
Type: Family
Description: The damage avoidance-tolerance pathway(s) requires functional recA, recF, recO, and recR genes, suggesting the mechanism to be daughter strand gap repair. The ruvABC genes or the recG gene is also required. The RecG pathway appears to be more active than the RuvABC pathway [ ]. RecO may contain a mononucleotide-binding fold [].
Protein Domain
Name: Cobalt transport protein CbiN
Type: Family
Description: The cobalt transport protein CbiN is part of the active cobalt transport system that takes up cobalt into the cell for cobalamin (vitamin B12) biosynthesis. It has been suggested that CbiN may function as the periplasmic binding protein component of this transport system [].
Protein Domain
Name: Macrophage-expressed gene 1 protein
Type: Family
Description: The function of macrophage-expressed gene 1 protein (MPEG1) is not known. MPEG1 is a single-pass type I membrane protein; it is expressed in macrophages and peripheral blood monocytes [ ].
Protein Domain
Name: Probable membrane protein MT1774/Rv1733c-like
Type: Family
Description: This entry represents a group of bacterial proteins, including MT1774 and Rv1733c from Mycobacterium tuberculosis. Their function is not known. Homologues are known only from Actinobacteria.
Protein Domain
Name: Protein kinase C mu-related
Type: Family
Description: Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity []: serine/threonine-protein kinases, tyrosine-protein kinases, and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human [ ]. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation []. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [], leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases []. The protein kinase D family of enzymes consists of three isoforms: PKD1 (PKCmu), PKD2, and PKD3 (PKCnu). They all share a similar architecture with regulatory sub-domains that play specific roles in the activation, translocation and function of the enzymes. The PKD enzymes have recently been implicated in very diverse cellular functions, including Golgi organisation and plasma membrane directed transport, metastasis, immune responses, apoptosis and cell proliferation []. Each isoform is differentially regulated through phosphorylation [].
Protein Domain
Name: Uncharacterised membrane protein YitT
Type: Family
Description: This entry includes proteins with transmembrane domains, such as YitT from Bacillus subtilis. The function of YitT is not clear.
Protein Domain
Name: Transmembrane emp24 domain-containing protein
Type: Family
Description: This group of proteins consists of TMP21 (also known as transmembrane emp24 domain-containing protein 10) and related proteins, which are members of the p24 family. The p24 family is a widely conserved family of transmembrane proteins that plays a functional role in the initiation of assembly of COPI (Coat protein I) coated vesicles. COPI coated vesicles are involved in protein transport within the early secretory pathway [ , ]. p24 proteins are major membrane components of COPI- and COPII-coated vesicles and are implicated in cargo selectivity of ER to Golgi transport [ , ]. Multiple members of the p24 family are found in all eukaryotes, from yeast to mammals. Members of the p24 family are type I membrane proteins with a signal peptide at the amino terminus, a lumenal coiled-coil (extracytosolic) domain, a single transmembrane domain with conserved amino acids, and a short cytoplasmic tail. They may be grouped into at least three subfamilies based on primary sequence []. One subfamily comprises yeast Emp24p and mammalian p24A. Another subfamily comprises yeast Erv25p and mammalian Tmp21, and the third subfamily comprises mammalian gp25L proteins.
Protein Domain
Name: Merozoite surface protein 2
Type: Family
Description: This entry represents a protein family specific to the genus Plasmodium. The merozoite surface antigen 2 (MSA-2) may play a role in the merozoite attachment to the erythrocyte. This protein was proposed to be a candidate for a protective vaccine against malaria [ , ].
Protein Domain
Name: Yeast membrane protein DUP/COS
Type: Family
Description: A number of uncharacterised integral membrane proteins from yeast contain an internal duplication due to duplicated genes. Duplicated copies of genes may be classified in two types of cluster organisation. The first type includes genes sharing a significant level of identity in the amino acid sequences of their predicted protein product. They are recovered on two different chromosomes, transcribed in the same orientation and the distance between them is conserved. The second type of cluster is based on one gene unit tandemly repeated. This duplication is itself repeated elsewhere in the genome. The basic gene unit is recovered many times in the genome and is a component of a multigene family of unknown function. These organisations in clusters of genes suggest a 'Lego organisation' of the yeast chromosomes []. The proteins belonging to this family are of unknown function.
Protein Domain
Name: Protein N-terminal glutamine amidohydrolase
Type: Family
Description: The N-end rule pathway is a ubiquitin (Ub)-dependent proteolytic system that mediates and regulates the degradation of intracellular proteins through the recognition of their N-terminal residues. NTAQ1 (also known as Nta1 in fungi) is an N-terminal amidohydrolase, which converts N-terminal Asn and Gln to the N-terminal residues Asp and Glu, the first step of the hierarchically organized N-end rule pathway []. The structure of Nta1 has been determined [].
Protein Domain
Name: Golgi apparatus protein 1
Type: Family
Description: The Golgi apparatus protein 1 (GLG1), which is located in Golgi cisterns of various cell types, can bind fibroblast growth factor and E-selectin. Sixteen cysteine-rich GLG1 repeats form the core of the protein and are located in the lumen. The C-terminal part of GLG1 is composed of a transmembrane region and a short cytoplasmic tail. The Cys-rich GLG1 repeat is a ~60 amino acid module that contains 4 Cys residues, which can form intrachain disulphide bridges [ ].
Protein Domain
Name: Leucine-rich repeat-containing protein 37
Type: Family
Description: This entry represents the leucine-rich repeat-containing protein 37. They are single-pass type I membrane proteins with unknown function.
Protein Domain
Name: WD repeat-containing protein 91
Type: Family
Description: This entry represents WDR91 and its homologues. WDR91 is part of the WDR81-WDR91 complex that functions as a negative regulator of the PI3 kinase/PI3K activity associated with endosomal membranes via BECN1, a core subunit of the PI3K complex. WDR91 has been shown to be recruited to endosomes by interacting with active guanosine triphosphate-Rab7 and to inhibit Rab7-associated phosphatidylinositol 3-kinase activity [ ]. This entry also includes Sorf-1, a WDR91 homologue from C. elegans. Sorf-1 and Sorf-2 form a complex with Beclin1 and inhibit the activity of the PI3K complex [ ].
Protein Domain
Name: F-box only protein 28
Type: Family
Description: F-box only protein 28 (FBXO28) is required for proper mitotic progression. It may regulate topoisomerase IIalpha decatenation activity and plays an important role in maintaining genomic stability [ ].
Protein Domain
Name: Sulfur carrier protein FdhD
Type: Family
Description: FdhD is a protein essential for the activity of formate dehydrogenases (FDHs) [ ], but it is not a component of membrane-bound formate dehydrogenase []. In Escherichia coli, it has been shown to function as a sulfurtransferase between IscS and the molybdenum cofactor prior to its insertion into formate dehydrogenase []. Sulfur transfer between IscS and FDH is an indispensable step for FDH activity [].
Protein Domain
Name: Flavivirus capsid protein C
Type: Domain
Description: Flaviviruses are small, enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Yellow fever virus, West Nile virus, Tick-borne encephalitis virus, Japanese encephalitis virus, and Dengue virus 2 [ ]. Flaviviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope glycoproteins M and E. The virion of these viruses is a nucleocapsid covered by a lipoprotein envelope, where the nucleocapsid is a complex of capsid protein C and mRNA. The capsid protein C is a dimeric α-helical protein, and its interaction with RNA is critical for the production of viable virus particles [ ].
Protein Domain
Name: DNA packaging protein FI
Type: Family
Description: This family includes the lambda phage DNA-packaging protein FI [ ].
Protein Domain
Name: Ribosomal protein L25, C-terminal
Type: Homologous_superfamily
Description: The bacterial ribosomal protein L25 is bound to 5S rRNA along with L5 and L18, forming a separate domain of the ribosome [ ]. The solution structure of protein L25 uncomplexed with RNA shows two significantly disordered loops and a closed β-barrel domain with a complex topology that has significant structural similarities to the N-terminal domain of the Thermus thermophilus ribosomal protein TL5, to the general stress protein CTC, and to the C-terminal anticodon-binding domain of Escherichia coli glutaminyl-tRNA synthetase (GlnRS) [, ]. GlnRS contains a duplication consisting of two L25-like β-barrel domains with the swapping of N-terminal strands. This superfamily represents the C-terminal domain, which has a mainly β-strand structure, found in ribosomal L25-like proteins.
Protein Domain
Name: Nucleolar protein 4 family
Type: Family
Description: This family consists of nucleolar protein 4 (NOL4) [ , ] and nucleolar protein 4-like (NOL4L). NOL4 has been identified as a methylation target for cervical [] and head and neck cancer [].
Protein Domain
Name: Testis-specific expressed protein 55
Type: Family
Description: This family includes testis-specific expressed protein 55 (also known as TSCPA in human) from animals, which is involved in normal spermatogenesis [ , ].
Protein Domain
Name: Ribosome biogenesis protein Rpf2
Type: Family
Description: Rpf2 is a conserved protein essential for the maturation of 25S rRNA and the 60S ribosomal subunit assembly [ ]. It is part of the complex (Rpf2/Rrs1/rpL5/rpL11) that functions in intermediate stages of 66S preribosome maturation and assembles into 90S preribosomes containing 35S pre-rRNA [].
Protein Domain
Name: Homeodomain containing protein PHTF1/2
Type: Family
Description: This entry represents a group of homeodomain-containing proteins from animals, including PHTF1/2 from mammals. In rat, PHTF1 is an integral membrane protein abundantly expressed in testis [ ]. It is localised in the endoplasmic reticulum saccules applied to the trans face of the Golgi system [].
Protein Domain
Name: Transcriptional regulatory protein Sin3-like
Type: Family
Description: Proteins in this entry contain N-terminal PAH (paired amphipathic helix) repeats, a histone deacetylase interacting domain, and a Sin3, C-terminal domain. Sin3 proteins have at least three PAH domains (PAH1, PAH2, and PAH3). They are components of a co-repressor complex that silences transcription, playing important roles in the transition between proliferation and differentiation. Sin3 proteins are recruited to the DNA by various DNA-binding transcription factors such as the Mad family of repressors, Mnt/Rox, PLZF, MeCP2, p53, REST/NRSF, MNFbeta, Sp1, TGIF and Ume6 [ ]. Sin3 acts as a scaffold protein that in turn recruits histone-binding proteins RbAp46/RbAp48 and histone deacetylases HDAC1/HDAC2, which deacetylate the core histones resulting in a repressed state of the chromatin []. The PAH domains are protein-protein interaction domains through which Sin3 fulfils its role as a scaffold. The PAH2 domain of Sin3 can interact with a wide range of unrelated and structurally diverse transcription factors that bind using different interaction motifs. For example, the Sin3 PAH2 domain can interact with the unrelated Mad and HBP1 factors using alternative interaction motifs that involve binding in opposite helical orientations []. The Sin3, C-terminal domain forms interactions with histone deacetylases [].
Protein Domain
Name: Flavivirus non-structural protein NS1
Type: Domain
Description: The Flavivirus genome polypeptide contains the capsid protein C (core protein), the matrix protein (envelope protein M), the major envelope protein E, a number of small non-structural proteins (NS1, NS2A, NS2B, NS4A and NS4B), helicase and RNA-directed polymerase (NS5) [].
Protein Domain
Name: Ribosome biogenesis protein Bms1/Tsr1
Type: Family
Description: Bms1p and Tsr1p represent a new family of factors required for ribosome biogenesis. They are each independently required for 40S ribosomal subunit biogenesis. Bms1p, a protein required for pre-rRNA processing, contains an evolutionarily conserved guanine nucleotide-binding (G) domain with five conserved polypeptide loops designated G1 through G5, which form contact sites with the guanine nucleotide or coordinate the Mg(2+) ion. Sequences resembling G1 (consensus [GA]-x(4)-G-K-[ST]; also known as a P-loop), G4 (consensus [NT]-K-x-D), and G5 (consensus S-[AG]) are present in all Bms1 proteins, and either fully conform with the consensus or contain, at most, single conservative substitutions. The G2 motif (consensus G-P-[IV]-T) contains a T residue involved in the coordination of the Mg(2+) required for GTP hydrolysis. The G3 motif diverges from the consensus found in G proteins, D-x(2)-G: the D residue is replaced with a conserved E residue. In contrast, Tsr1p lacks a P-loop and is not predicted to bind GTP. It functions at a later step of 40S ribosome production, possibly in assembly and/or export of 43S pre-ribosomal subunits to the cytosol [ , , ].
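The consensus motifs quoted in this description use PROSITE-style notation. As a rough illustration of how such patterns can be scanned against a sequence, the following Python sketch converts them to regular expressions. The helper names (`prosite_to_regex`, `find_motif`) and the test sequence are hypothetical and not part of this database; only the subset of the notation that appears in the description is handled.

```python
import re

# Consensus patterns transcribed from the description above (spacing normalised).
# G3 is the general G-protein consensus; per the text, Bms1 carries E in place of D.
MOTIFS = {
    "G1": "[GA]-x(4)-G-K-[ST]",
    "G2": "G-P-[IV]-T",
    "G3": "D-x(2)-G",
    "G4": "[NT]-K-x-D",
    "G5": "S-[AG]",
}

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Supported elements: a single residue (G), a residue set ([GA]),
    the wildcard x, and an optional repeat count in parentheses (x(4)).
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        residue, count = m.groups()
        token = "." if residue == "x" else residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)

def find_motif(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() for m in regex.finditer(sequence)]

if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the code; not a real Bms1 sequence.
    toy = "MAGPPKSGKSTLL"
    for name, pat in MOTIFS.items():
        print(name, prosite_to_regex(pat), find_motif(toy, pat))
```

A plain regex translation like this ignores position-specific scoring and conservative substitutions, so it only approximates the "at most single conservative substitutions" tolerance mentioned in the description.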
Protein Domain
Name: RNA binding protein HABP4/SERBP1
Type: Family
Description: This entry includes proteins with a hyaluronan/mRNA-binding protein domain, found in the HABP4 protein family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1 (also known as SERBP1). HABP4 has been observed to bind hyaluronan (a glycosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan [ ]. PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability []. However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function. Hyaluronan/mRNA-binding protein may be involved in nuclear functions such as the remodeling of chromatin and the regulation of transcription [ , ].
Protein Domain
Name: Ribosomal export protein Nmd3
Type: Family
Description: Nmd3 acts as an adapter for the XPO1/CRM1-mediated export of the 60S ribosomal subunit [ , ]. It may activate the circularly permuted GTPase Lsg1 during 60S ribosome biogenesis [].
Protein Domain
Name: Dynein regulatory complex protein
Type: Family
Description: DRC1 is a key component of the nexin-dynein regulatory complex (N-DRC), essential for N-DRC integrity. It is required for the assembly and regulation of specific classes of inner dynein arm motors. It may also function to restrict dynein-driven microtubule sliding, thus aiding in the generation of ciliary bending [ ]. Mutations of the DRC1 gene cause primary ciliary dyskinesia 21 (CILD21), a disorder characterised by abnormalities of motile cilia []. DRC2, also known as CCDC65, is an essential component of the nexin-dynein regulatory complex (N-DRC) [ ]. DRC2 is necessary for the co-assembly of DRC2 and DRC1 to form the base plate of N-DRC.
Protein Domain
Name: RNA polymerase-binding protein RbpA
Type: Family
Description: RbpA binds to RNA polymerase (RNAP), stimulating transcription from principal, but not alternative, sigma factor promoters [ ]. RbpA stimulates transcription from several principal sigma factor HrdB (SigA)-dependent promoters but not from a SigR-dependent promoter. Stimulation occurs in the presence of the transcription initiation inhibitor rifampicin [].
Protein Domain
Name: Ribosomal protein L28/L24 superfamily
Type: Homologous_superfamily
Description: This entry represents a domain superfamily found in the mitochondrial 39S ribosomal protein L28, mitochondrial 54S ribosomal protein L24 and 50S ribosomal protein L28. They belong to the ribosomal protein L28 family. They are components of the mitochondrial or non-mitochondrial large ribosomal subunits. Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively.
In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ].
Protein Domain
Name: Conserved hypothetical protein CHP02569
Type: Family
Description: This entry has so far only been found in Actinobacteria, including at least five species of Mycobacterium, three of Corynebacterium, and Nocardia farcinica - always in a single copy per genome. The function is unknown.
Protein Domain
Name: Parvovirus coat protein VP2
Type: Domain
Description: Parvoviruses are some of the smallest viruses containing linear, non-segmented single-stranded DNA genomes, with an average genome size of 5000 nucleotides. Parvoviruses have been described that infect a wide range of invertebrates and vertebrates and are well known for causing enteric disease in mammals. Genomes contain two large ORFs: NS1 and VP1; other ORFs are found in some sub-types and different gene products can arise from splice variants and the use of different start codons [ , ]. The Parvovirus coat protein VP1 together with VP2 forms a capsomer. Both of these proteins are formed from the same transcript using alternative translation start codons. As a result, VP1 and VP2 differ only in the N-terminal region. VP2 is involved in packaging the viral DNA [ ]. The mature virion contains three capsid proteins VP1, VP2, and VP3 and a noncapsid protein NS1. VP3 may arise from a third start codon with a favorable translation initiation context which is present at position 3067 in the ChPV genome and which has been described in the goose and Muscovy duck parvoviruses [ ].
Protein Domain
Name: Mobile mystery protein A
Type: Family
Description: Proteins in this entry are more often encoded within mobilisation-related contexts than not. This includes a CRISPR-associated gene region in Geobacter sulfurreducens PCA, and plasmids in Agrobacterium tumefaciens and Coxiella burnetii. They are found together with mobile mystery protein B, a member of the Fic protein family. Mobile mystery protein A is encoded by the upstream member of the gene pair and contains a helix-turn-helix domain.
Protein Domain
Name: Conserved hypothetical protein CHP02588
Type: Family
Description: The function of this protein is unknown. It is often found as part of a two-gene operon with a second protein that appears to span the membrane seven times. It has so far been found in the bacteria Anabaena sp. (strain PCC 7120), Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus.
Protein Domain
Name: CRISPR-associated protein Cas5, bacterial
Type: Family
Description: The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [ ]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [ , , ]. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [ ]. This entry represents a family of Cas5 proteins that includes DevS from Myxococcus xanthus, as well as related proteins from Leptospira interrogans and Gemmata obscuriglobus.
Cas5 is a key regulator of development that is encoded in a cluster of CRISPR-associated (cas) genes, and in the special case of M. xanthus has taken on a role in the control of fruiting body development.
Protein Domain
Name: Heat shock protein beta-8
Type: Family
Description: HSPB8, also known as HSP22, is a small heat shock protein that interacts with itself, cvHSP (HSPB7), MKBP (HSPB2), HSP27, alphaB-crystallin and HSP20 [ , ].
Protein Domain
Name: CRISPR-associated protein Csh1, C-terminal
Type: Domain
Description: The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes [ ]. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [ , , ]. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability [ ]. This entry is found in the C-terminal region of a family of CRISPR-associated proteins of the Hmari subtype [ ]. Except for some sequences from halophilic archaea, this domain contains a pair of CXXC motifs.
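The "pair of CXXC motifs" mentioned at the end of this description can be checked for in a candidate sequence with a short scan. The Python sketch below is only an illustration under stated assumptions: the function names and the example peptides are hypothetical, and the check looks solely for two or more occurrences of cysteine, any two residues, cysteine, with no constraint on the spacing between the two motifs.

```python
import re

# C, any two residues, C. The lookahead lets overlapping occurrences be reported.
CXXC = re.compile(r"(?=(C..C))")

def cxxc_positions(sequence: str) -> list[int]:
    """Return 0-based start positions of every CXXC occurrence."""
    return [m.start() for m in CXXC.finditer(sequence.upper())]

def has_cxxc_pair(sequence: str) -> bool:
    """True if the sequence contains at least two CXXC occurrences."""
    return len(cxxc_positions(sequence)) >= 2

if __name__ == "__main__":
    # Hypothetical peptides used only to exercise the code.
    for peptide in ("ACAACGGGCTTCA", "ACAACG"):
        print(peptide, cxxc_positions(peptide), has_cxxc_pair(peptide))
```

A match here is only a necessary condition; membership of the entry itself is determined by the database's own domain models, not by this motif scan.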
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom