Search our database by keyword

Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, pathways, authors, ontology terms, etc. (e.g. eve, embryo, zen, allele)
  • Use OR to search for either of two terms (e.g. fly OR drosophila) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported (e.g. dros* for partial matches, or fly AND NOT embryo to exclude a term).

Search results 9701 to 9800 out of 30763 for seed protein

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Maintenance of telomere capping protein 4
Type: Family
Description: This is a family of fungal proteins. Mutations in a member from Saccharomyces cerevisiae, YBR255W, affected growth rate. YBR255W has been identified among genes that affect telomere capping and has been renamed maintenance of telomere capping protein 4.
Protein Domain
Name: GINS complex protein Sld5, alpha-helical domain
Type: Domain
Description: Sld5 is a component of the tetrameric GINS protein complex. Within the complex, Sld5 interacts with Psf1 via its N-terminal A domain, and with Psf2 through a combination of the A and B domains. Sld5 in Drosophila is required for normal cell cycle progression and the maintenance of genomic integrity. GINS is a complex of four subunits (Sld5, Psf1, Psf2 and Psf3) that is involved in both the initiation and elongation stages of eukaryotic chromosome replication. Besides being essential for the maintenance of genomic integrity, GINS plays a central role in coordinating DNA replication with cell cycle checkpoints and is involved in cell growth. The eukaryotic GINS subunits are homologous to one another, and homologues are also found in the archaea; the complex is not found in bacteria. Each of the four subunits consists of two domains, termed the α-helical (A) and β-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to those of Psf2/Psf3.
Protein Domain
Name: Ran-specific GTPase-activating protein 1, Ran-binding domain
Type: Domain
Description: This entry represents the Ran-binding domain (RBD) found in RanBP1 from humans and Yrb1 from budding yeasts. RanBP1 and Yrb1 are involved in nuclear import and export. Both have been shown to shuttle between the nucleus and cytoplasm, and the conserved RBD is necessary and sufficient for the essential function and nucleocytoplasmic shuttling. RanBP1/Yrb1 acts as a negative regulator of Regulator of chromosome condensation 1 (RCC1) by inhibiting RCC1-stimulated guanine nucleotide release from Ran.
Protein Domain
Name: RsbS co-antagonist protein RsbRA N-terminal domain
Type: Domain
Description: The general stress response in Bacillus subtilis is governed by sigma(B), whose activity is controlled by a partner-switching mechanism in which key protein interactions are governed by serine phosphorylation. In the environmental stress pathway, the RsbS antagonist binds and inactivates the RsbT switch protein/kinase. Following stress, RsbT phosphorylates RsbS, releasing RsbT to bind and activate the RsbU phosphatase. RsbRA (also known as RsbR) was previously reported to be a positive regulator that enhances RsbT kinase activity. It has since been reported instead to function as a co-antagonist of RsbS. To function, RsbS requires one of a family of four co-antagonist proteins, named RsbRA (RsbR), RsbRB (YkoB), RsbRC (YojH) and RsbRD (YqhA). These RsbRA paralogs each have a C-terminal domain closely resembling the entire RsbS protein. The N-terminal domain of RsbRA has been reported to show a classic globin fold. However, structural analysis has not revealed the presence of any bound cofactor, such as heme.
Protein Domain
Name: Fungal mitogen-activated protein (MAP) kinase Sty1/Hog1
Type: Family
Description: This family is composed of the mitogen-activated protein kinases (MAPKs) Sty1 from Schizosaccharomyces pombe, Hog1 from Saccharomyces cerevisiae, and similar fungal proteins. Sty1 and Hog1 are stress-activated MAPKs that participate in transcriptional regulation in response to stress. Sty1 is activated in response to oxidative stress, osmotic stress, and UV radiation. It is regulated by the MAP2K Wis1, which is activated by the MAP3Ks Wis4 and Win1; these in turn receive signals of the stress condition from the membrane-spanning histidine kinases Mak1-3. Activated Sty1 stabilizes the Atf1 transcription factor and induces transcription of Atf1-dependent genes of the core environmental stress response. Hog1 is the key element in the high osmolarity glycerol (HOG) pathway and is activated upon hyperosmotic stress. Activated Hog1 accumulates in the nucleus and regulates stress-induced transcription. The HOG pathway is mediated by two transmembrane osmosensors, Sln1 and Sho1.
Protein Domain
Name: Protein LOW PSII ACCUMULATION 2 like
Type: Family
Description: This entry represents a group of proteins from plants, including protein LOW PSII ACCUMULATION 2 (LPA2) from Arabidopsis whose function is not clear.
Protein Domain
Name: Protein of unknown function DUF2339, transmembrane
Type: Family
Description: This entry, found in various hypothetical bacterial proteins, has no known function.
Protein Domain
Name: Adenovirus large t-antigen, E1B 55kDa protein
Type: Family
Description: This family consists of the adenovirus E1B 55kDa protein, or large t-antigen. E1B 55kDa binds the tumor suppressor protein p53, converting it from a transcriptional activator that responds to damaged DNA into an unregulated repressor of genes with a p53 binding site. This protects the virus against p53-induced host antiviral responses and prevents apoptosis induced by the adenovirus E1A protein. The E1B region of adenovirus encodes two proteins: E1B 55kDa, the large t-antigen found in this family, and E1B 19kDa, the small t-antigen. Both of these proteins inhibit E1A-induced apoptosis.
Protein Domain
Name: Inner layer core protein VP3, Orbivirus
Type: Family
Description: This entry represents the inner layer core protein VP3 from Orbiviruses, a genus of Reoviruses that have dsRNA genomes of 10-12 linear segments. Orbiviruses include Broadhaven virus (BRD), Epizootic hemorrhagic disease virus and Bluetongue virus (BTV). The Orbivirus VP3 protein is part of the virus core and forms a 'subcore' shell made up of 120 copies of the 100K protein. VP3 particles can also bind RNA and are fundamental in the early stages of viral core formation. The structural core protein VP2 from BRD is similar to VP3 from BTV.
Protein Domain
Name: Patatin-like phospholipase domain containing protein 8-like
Type: Domain
Description: Patatin-like phospholipase domain containing protein 8 (PNPLA8, also known as iPLA2-gamma) is a calcium-independent myocardial phospholipase that maintains mitochondrial integrity. In humans, it is predominantly expressed in heart tissue. iPLA2-gamma can catalyse both phospholipase A1 and A2 reactions (PLA1 and PLA2, respectively). This family includes PNPLA8 (iPLA2-gamma) from mammals and AtPLAI from Arabidopsis.
Protein Domain
Name: Transmembrane and coiled-coil domain-containing protein 2/5
Type: Family
Description: Proteins in this family are single-pass membrane proteins. Their function is unknown.
Protein Domain
Name: Uncharacterised conserved protein UCP036778, sugar epimerase-type
Type: Family
Description: This entry represents a related group of proteins found in some proteobacteria. They have not been experimentally characterised but are predicted to contain a TIM barrel xylose isomerase-like domain and may function as sugar epimerases or isomerases. From their genomic contexts, some may perform a similar function to IolI from Bacillus subtilis.
Protein Domain
Name: Uncharacterised conserved protein UCP036794, erythromycin esterase-type
Type: Family
Description: This group represents an uncharacterised conserved protein with an erythromycin esterase domain.
Protein Domain
Name: Uncharacterised conserved protein UCP006529, dinucleotide-utilising ThiF/HesA
Type: Family
Description: Members of this group contain a divergent form of a domain based on the common NAD/FAD-binding fold. Its various forms are found in a wide range of enzymes, including ubiquitin-activating enzymes and NAD/FAD-dependent dehydrogenases and oxidases. Members are distantly related to the HesA/MoeB/ThiF group of NAD/FAD-binding fold enzymes. Hmd co-occurring protein HcgE from Methanothermobacter marburgensis, an adenylyltransferase that forms AMP-guanylylpyridinol from ATP and guanylylpyridinol, also belongs to this protein family.
Protein Domain
Name: Conserved hypothetical protein CHP04064, radical SAM
Type: Family
Description: Members of this family are radical SAM enzymes that occur co-clustered with Nif11-related bacteriocin precursors, as described by TIGRFAMs model . They could be involved in Nif11-class bacteriocin maturation.
Protein Domain
Name: Synaptonemal complex central element protein 1
Type: Family
Description: Synaptonemal complex central element protein 1 is a major component of the transverse central element of synaptonemal complexes (SCs), which form between homologous chromosomes during meiotic prophase. It requires the transverse filament protein SYCP1 in order to be incorporated into the central element. It may have a role in synaptonemal complex assembly, stabilisation and recombination.
Protein Domain
Name: Conserved hypothetical protein CHP04083, radical SAM
Type: Family
Description: Members of this family are radical SAM enzymes, homologous to a variety of other peptide-modifying radical SAM enzymes, and are found primarily in methanogenic archaea.
Protein Domain
Name: Sporulation-specific cell division protein SsgB superfamily
Type: Homologous_superfamily
Description: SsgB is a conserved activator of developmental cell division in morphologically complex actinomycetes. It controls cell division and spore maturation in streptomycetes. Together with SsgA, SsgB activates sporulation-specific cell division by controlling the localisation of FtsZ. SsgB shows structural similarity to whirly ssDNA/guide RNA-binding proteins found in mitochondria or in plants, but the interaction of this protein with nucleic acids is unlikely. On the other hand, a conserved surface for protein-protein interaction has been identified, suggesting that SsgB might be a binding partner.
Protein Domain
Name: FeGP cofactor biosynthesis protein HcgB, guanylyltransferase
Type: Family
Description: This entry represents the guanylyltransferase component of the iron-guanylylpyridinol (FeGP) cofactor biosynthesis protein HcgB.
Protein Domain
Name: Peptidase C3, picornavirus core protein 2A
Type: Domain
Description: This domain defines cysteine peptidases belonging to MEROPS peptidase family C3 (picornain, clan PA(C)), subfamilies 3CA and 3CB. The protein fold of this peptidase domain resembles that of the serine peptidase chymotrypsin, the type example for clan PA. Picornaviral proteins are expressed as a single polyprotein, which is cleaved by the viral 3C cysteine protease. The poliovirus polyprotein is selectively cleaved at the Gln-|-Gly bond. In other picornavirus reactions, Glu may be substituted for Gln, and Ser or Thr for Gly. A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase), an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate.
Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides.
The fold shows a closed β-barrel decorated with helices, with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Name: Zinc finger CCCH-type antiviral protein 1-like
Type: Family
Description: Proteins in this family are named for C3H1-type (CCCH) zinc fingers. However, despite their name, they do not contain a canonical C3H1-type zinc finger. Their function is not clear.
Protein Domain
Name: Protein of unknown function DUF917, C-terminal
Type: Homologous_superfamily
Description: This beta barrel domain is found at the C terminus of the uncharacterised proteins belonging to the DUF917 family.
Protein Domain
Name: StAR-related lipid transfer protein 3, C-terminal
Type: Domain
Description: StAR-related lipid transfer protein 3 (STARD3, also known as MLN64) is an integral membrane protein localised to the late endosome and plasma membrane. It may function as a mediator of cholesterol transport from endosomal membranes to the plasma membrane and/or mitochondria. It contains an N-terminal membrane-spanning domain that shares homology with the MENTHO protein and a C-terminal steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domain that binds lipids, including sterols. Its transport to the late endosome is regulated by binding to 14-3-3.
Protein Domain
Name: Myotonic dystrophy protein kinase, coiled coil
Type: Domain
Description: This domain is found in the myotonic dystrophy protein kinase (DMPK) and adopts a coiled coil structure. It plays a role in dimerisation.
Protein Domain
Name: Adenovirus small t-antigen, E1B 19kDa protein
Type: Family
Description: This family consists of the adenovirus E1B 19kDa protein, or small t-antigen. The E1B 19kDa protein inhibits E1A-induced apoptosis and hence prolongs the viability of the host cell. It can also inhibit apoptosis mediated by tumour necrosis factor alpha and Fas antigen. E1B 19kDa blocks apoptosis by interacting with and inhibiting the p53-inducible and death-promoting Bax protein. The E1B region of adenovirus encodes two proteins: E1B 19kDa, the small t-antigen found in this family, and E1B 55kDa, the large t-antigen, which is not found in this family. Both of these proteins inhibit E1A-induced apoptosis.
Protein Domain
Name: Potato leaf roll virus readthrough protein
Type: Family
Description: This family consists mainly of the Potato leafroll virus (PLrV) readthrough protein, also known as the minor capsid protein. It is generated via readthrough of open reading frame 3, the coat protein, allowing translation of open reading frame 5 to give an extended coat protein with a large C-terminal addition, the readthrough domain. The readthrough protein is essential for the circulative aphid transmission of PLrV and Beet western yellows virus. The N-terminal region of the luteovirus readthrough domain determines virus binding to Buchnera GroEL and is essential for virus persistence in the aphid.
Protein Domain
Name: Glycine cleavage system protein H-related, Chlamydia
Type: Family
Description: The H protein (GcvH) of the glycine cleavage system shuttles the methylamine group of glycine from the P protein to the T protein. Most Chlamydia lack the P and T proteins, and have a single homologue of GcvH that appears deeply split from canonical GcvH in molecular phylogenetic trees. This protein family represents the Chlamydial GcvH homologue, which is always seen as part of a two-gene operon, downstream of a member of the uncharacterised protein family . The function of the proteins in this entry is unknown.
Protein Domain
Name: Extracellular matrix-binding protein ebh, GA module
Type: Domain
Description: Protein G-related albumin-binding (GA) modules occur on the surface of numerous Gram-positive bacterial pathogens. Protein G of group C and G Streptococci interacts with the constant region of IgG and with human serum albumin. The GA module is composed of a left-handed three-helix bundle and is found in a range of bacterial cell surface proteins. GA modules may promote bacterial growth and virulence in mammalian hosts by scavenging albumin-bound nutrients and camouflaging the bacteria. Variations in sequence give rise to differences in structure and function between GA modules in different proteins, which could alter pathogenesis and host specificity due to their varied affinities for different species of albumin. Proteins containing a GA module include PAB from Peptostreptococcus magnus. This entry represents the GA module from the extracellular matrix-binding protein ebhB.
Protein Domain
Name: Protein Casparian strip integrity factor 1/2
Type: Family
Description: This entry represents a group of plant proteins, including Protein Casparian strip integrity factor 1/2 (CIF1/2) from Arabidopsis. They are required for contiguous Casparian strip formation in Arabidopsis roots.
Protein Domain
Name: WD repeat-containing protein 18, C-terminal domain
Type: Domain
Description: WD repeats are short subdomains of about 40 amino acids that fold into 4 anti-parallel beta hairpins. The domain represented by this entry has been detected at the C terminus of WD repeat-containing protein 18 (Wdr18), which plays a role during development.
Protein Domain
Name: Casein kinase II subunit alpha'-interacting protein
Type: Family
Description: CSNK2A2IP, also known as CKT2, may play a role in chromatin regulation of male germ cells.
Protein Domain
Name: Non-structural protein NSP3, SUD-M domain, betacoronavirus
Type: Domain
Description: This entry represents the macrodomain referred to as SUD-M (middle SUD subdomain) of the SARS-unique domain (SUD), which binds G-quadruplexes (unusual nucleic-acid structures formed by consecutive guanosine nucleotides). It can be found in non-structural protein 3 (NSP3) of Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and related coronaviruses. NSP3 binds to viral RNA, nucleocapsid protein, as well as other viral proteins, and participates in polyprotein processing. It is a multifunctional protein comprising up to 16 different domains and regions. In SARS-CoV, the SUD-M domain (527-651) has been shown to bind single-stranded poly(A). Mapping of the contact area with this RNA on the protein surface, together with electrophoretic mobility shift assays, has shown that SUD-M has higher affinity for purine bases than for pyrimidine bases. SUD consists of three globular domains separated by short linker peptide segments: SUD-N, SUD-M, and SUD-C. Among these, SUD-N and SUD-M are macrodomains. The SUD-N domain is a related macrodomain which also binds G-quadruplexes. While SUD-N is specific to the NSP3 of SARS-CoV and betacoronaviruses of the sarbecovirus subgenus (B lineage), SUD-M is present in most NSP3 proteins except the NSP3 from betacoronaviruses of the embecovirus subgenus (A lineage). SUD-M, despite its name, is not specific to SARS. SUD-C adopts a frataxin-like fold, has structural similarity to DNA-binding domains of DNA-modifying enzymes, binds single-stranded RNA, and regulates the RNA binding behavior of the SUD-M macrodomain. SARS-CoV Nsp3 contains a third macrodomain (the X-domain). The X-domain may function as a module binding poly(ADP-ribose); however, SUD-N and SUD-M do not bind ADP-ribose, as the triple glycine sequence involved in its binding is not conserved in these. Nsp3c-N and Nsp3c-M each display a typical α/β/α Macro domain fold, in spite of the complete absence of sequence similarities.
The central β-sheet, with six β-strands in the order β1-β6-β5-β2-β4-β3, is flanked by two (or three) helices on either side. Only the last strand, β3, is antiparallel to the other strands. Currently, most known functions of Nsp3c-N/M are connected with RNA binding. None of the residues important for binding ADP-ribose and for de-MARylation/de-PARylation activity are conserved in Nsp3c-N/M; therefore Nsp3c-N/M cannot bind ADP-ribose. Both the Nsp3c-N and Nsp3c-M domains bind unusual nucleic acid structures formed by consecutive guanosine nucleotides, in which four strands of nucleic acid form a superhelix (so-called G-quadruplexes).
Protein Domain
Name: GTP-binding protein OBG, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Obg subfamily proteins (also known as ObgE, YhbZ and CgtA) are conserved P-loop GTPases that are involved in a wide range of cellular processes, including sporulation, cellular differentiation, ribosome assembly, DNA replication, chromosome segregation, and the stringent response in eubacteria and plant chloroplasts. Obg subfamily proteins have three domains: the Obg fold, the G domain, and the Obg C-terminal (OCT) domain. A potential role of the OCT domain in the regulation of the nucleotide-binding state has been suggested. The OCT domain structure contains a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. This entry represents the OCT domain.
Protein Domain
Name: RNA synthesis protein NSP10 superfamily, coronavirus
Type: Homologous_superfamily
Description: Non-structural protein NSP10 is involved in RNA synthesis. It is synthesised as part of a replicase polyprotein, whose cleavage generates many non-structural proteins. NSP10 has a mixed α/β fold comprising five α-helices, one 3(10)-helix, and three β-strands, and it is rich in cysteines, featuring two zinc fingers with C-x(2)-C-x(5)-H-x(6)-C and C-x(2)-C-x(7)-C-x-C motifs. Twelve identical subunits assemble to form a unique spherical dodecameric architecture, which is proposed to be a functional form of the ExoN/MTase coactivator domain. The small NSP10 protein is among the more conserved coronavirus proteins and is a critical cofactor for activation of multiple replicative enzymes. It interacts with NSP14 and NSP16, acting as a scaffolding protein and regulating their respective exonuclease (ExoN) and ribose-2'-O-MTase (2'-O-MTase) activities, and mediating the stabilization of the SAM binding pockets of NSP16 and NSP14. When binding to the N-terminal region of NSP14, NSP10 allows the ExoN active site to adopt a stably closed conformation, allowing efficient hydrolysis of dsRNA. Efficient catalytic activity of NSP16 depends on heterodimerization with NSP10. The structure of the SARS-CoV-2 NSP10/NSP16 heterodimer revealed that it is extremely similar to that of its SARS-CoV-1 homologue. One NSP10 residue, Tyr-96, is of particular interest. The aromatic nature of Tyr-96 plays a crucial role in the NSP10-NSP16 interaction and in the activation of the NSP16 2'-O-MTase activity, as well as in the NSP10-NSP14 interaction. This residue is specific for SARS-CoV NSP10, and is a phenylalanine in other coronavirus homologues.
Protein Domain
Name: L-A virus major coat protein superfamily
Type: Homologous_superfamily
Description: Members of this entry include the major coat protein of the Saccharomyces cerevisiae virus L-A (ScV-L-A). The major coat protein is a large polypeptide without apparent domain division.
Protein Domain
Name: Light-independent protochlorophyllide reductase, iron-sulphur ATP-binding protein
Type: Family
Description: Synonym: dark protochlorophyllide reductase. Protochlorophyllide reductase catalyzes the reductive formation of chlorophyllide from protochlorophyllide during biosynthesis of chlorophylls and bacteriochlorophylls. Three genes, bchL, bchN and bchB, are involved in light-independent protochlorophyllide reduction in bacteriochlorophyll biosynthesis. In cyanobacteria, algae, and gymnosperms, three similar genes, chlL, chlN and chlB, are involved in protochlorophyllide reduction during chlorophyll biosynthesis. BchL/chlL, bchN/chlN and bchB/chlB exhibit significant sequence similarity to the nifH, nifD and nifK subunits of nitrogenase, respectively. Nitrogenase catalyzes the reductive formation of ammonia from dinitrogen. The light-independent (dark) form of protochlorophyllide reductase plays a key role in the ability of gymnosperms, algae, and photosynthetic bacteria to form chlorophyll in the dark. Genetic and sequence analyses have indicated that dark protochlorophyllide reductase consists of three protein subunits that exhibit significant sequence similarity to the three subunits of nitrogenase. Dark protochlorophyllide reductase activity was shown to be dependent on the presence of all three subunits, ATP, and the reductant dithionite. The BchL peptide (ChlL in chloroplasts and cyanobacteria) is an ATP-binding iron-sulphur protein of the dark form of protochlorophyllide reductase, an enzyme similar to nitrogenase.
Protein Domain
Name: Chlorophyllide reductase iron protein subunit X
Type: Family
Description: This entry represents the X subunit of the three-subunit enzyme (bacterio)chlorophyllide reductase. This enzyme is responsible for the reduction of the chlorin B-ring and is closely related to the protochlorophyllide reductase complex, which reduces the D-ring. Both of these complexes in turn are homologous to nitrogenase. This subunit is homologous to the nitrogenase component II, or 'iron' protein.
Protein Domain
Name: Bacteriophage PRD1, P2, absorption protein superfamily
Type: Homologous_superfamily
Description: This entry represents absorption protein P2 (synonym: receptor-binding protein P2) from the bacteriophage PRD1. Absorption protein P2 is a multi-β-sheet protein whose complicated topology forms an elongated seahorse-shaped molecule with a distinct head, containing a pseudo-beta propeller structure with approximate six-fold symmetry, and a tail (β-sandwich). It is required for the attachment of the phage to the host conjugative DNA transfer complex. This is a poorly understood large transmembrane complex of unknown architecture, with at least 11 different proteins.
Protein Domain
Name: Baseplate tail-tube protein gp48, T4-like virus
Type: Family
Description: This viral tail tube protein is also referred to as Gp48. It is required for the assembly and length regulation of the tail tube of bacteriophage T4.
Protein Domain
Name: Protein of unknown function DUF112, transmembrane
Type: Family
Description: Members of this prokaryotic family have no known function. Members are predicted to be integral membrane proteins and are similar to a protein in a tartrate utilisation region (TAR) of Agrobacterium vitis, a common pathogen of grapevine. Most grapevine strains utilise tartrate, an abundant compound in grapevine.
Protein Domain
Name: Altered inheritance of mitochondria protein 11
Type: Family
Description: This family consists of uncharacterized proteins from fungi. They have been named altered inheritance of mitochondria protein 11 (AIM11). Saccharomyces cerevisiae has a second paralogue: YBL059W/IAI11.
Protein Domain
Name: Hepatitis C virus, Non-structural 5a protein
Type: Domain
Description: Although Hepatitis A virus, Hepatitis B virus, and Hepatitis C virus have similar names because they all cause liver inflammation, they are distinctly different viruses, both genetically and clinically. The Hepatitis C virus (HCV) is a small (50-80 nm in diameter), enveloped, single-stranded, positive-sense RNA virus. It is a member of the family Flaviviridae. There are seven genotypes and a number of subtypes with diverse geographic distributions. The genome of HCV consists of a single open reading frame. At the 5' and 3' ends of the RNA are untranslated regions (UTRs) that are not translated into proteins but are important for translation and replication of the viral RNA. The 5' UTR has a ribosome binding site (IRES, internal ribosome entry site) that starts the translation of a unique polyprotein, which is later cut by cellular and viral proteases into 10 active structural and non-structural smaller proteins. This entry represents the non-structural 5a viral protein (NS5A). This zinc-containing phosphoprotein plays a role in the regulation of HCV RNA replication. Biochemical characterization of the complex formed by the interaction of NS5A with RNA shows the importance of zinc in this interaction and confirms the binding of the protein to the viral genome. The NS5A protein is phosphorylated when expressed in mammalian cells. It is thought to interact with the dsRNA-dependent (interferon-inducible) kinase PKR. It also modulates the TNFRSF21/DR6 signaling pathway for viral propagation.
Protein Domain
Name: Ectoderm-neural cortex protein 1, BTB/POZ domain
Type: Domain
Description: Ectoderm-neural cortex protein 1 (ENC1), also known as kelch-like protein 37 (KLHL37), is expressed in the central nervous system (CNS), where it interacts with actin and contributes to the organisation of the cytoskeleton during the specification of neural fate. ENC1 functions as a negative regulator of transcription factor Nrf2 (a regulator of a cellular defense mechanism against environmental insults) by suppressing Nrf2 protein translation. It plays a pivotal role in neuronal and adipocyte differentiation. The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding, while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases. This entry represents the N-terminal BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain of ENC1.
Protein Domain
Name: Kelch-like ECH-associated protein 1, BTB/POZ domain
Type: Domain
Description: The KLHL (Kelch-like) proteins generally have a BTB/POZ domain, a BACK domain, and five to six Kelch motifs. They constitute a subgroup at the intersection between the BTB/POZ domain and Kelch domain superfamilies. The BTB/POZ domain facilitates protein binding [ ], while the Kelch domain (repeats) forms β-propellers. The Kelch superfamily of proteins can be subdivided into five groups: (1) N-propeller, C-dimer proteins, (2) N-propeller proteins, (3) propeller proteins, (4) N-dimer, C-propeller proteins, and (5) C-propeller proteins. KLHL family members belong to the N-dimer, C-propeller subclass of Kelch repeat proteins []. In addition to BTB/POZ and Kelch domains, the KLHL family members contain a BACK domain, first described as a 130-residue region of conservation observed amongst BTB-Kelch proteins []. Many of the Kelch-like proteins have been identified as adaptors for the recruitment of substrates to Cul3-based E3 ubiquitin ligases [, ]. Kelch-like ECH-associated protein 1 (KEAP1, also known as KLHL19) is a BTB-Kelch substrate adaptor protein for a Cul3-dependent ubiquitin ligase complex that functions as a sensor for thiol-reactive chemopreventive compounds and oxidative stress. It targets NFE2L2/NRF2 (a transcription factor) for ubiquitination and degradation by the proteasome, thus suppressing its transcriptional activity and repressing antioxidant response element-mediated detoxifying enzyme gene expression [ , ]. Another KEAP1 substrate, PGAM5, a Bcl-XL-interacting protein, has also been identified []. This entry represents the N-terminal BTB (Broad-Complex, Tramtrack and Bric a brac)/POZ (poxvirus and zinc finger) domain of KEAP1 and similar animal proteins.
Protein Domain
Name: Kelch-like protein 2, BTB/POZ domain
Type: Domain
Description: KLHL2, also called actin-binding protein Mayven, is an actin-binding protein predominantly expressed in the brain. It plays a role in the reorganisation of the actin cytoskeleton, and promotes growth of cell projections in oligodendrocyte precursors [ , ]. KLHL2 is a component of a cullin-RING-based BCR (BTB-CUL3-RBX1) E3 ubiquitin-protein ligase complex that mediates the ubiquitination of target proteins, such as NPTXR, leading most often to their proteasomal degradation [, ]. It contains a BTB domain and kelch repeat domains, characteristic of a kelch family protein. This entry represents the BTB/POZ domain, which is a common protein-protein interaction motif of about 100 amino acids.
Protein Domain
Name: La-related protein 7 homolog, xRRM domain
Type: Domain
Description: This is the atypical RRM (xRRM) domain found in a group of fungal proteins, including La-related protein 7 homolog (Lar7, also known as Pof8) from Schizosaccharomyces pombe, a member of the ancient superfamily of La and La-related proteins (LaRPs) [ , , ]. Lar7 is an RNA-binding protein required for assembly of the telomerase ribonucleoprotein (RNP) holoenzyme complex. It binds telomerase RNA (TER1) and promotes assembly of TER1 with the catalytic subunit of telomerase, telomerase reverse transcriptase (Trt1) [].
Protein Domain
Name: Insulin-like growth factor binding protein 2
Type: Family
Description: The insulin family of proteins groups together several evolutionarily related active peptides [ ]: these include insulin [, ], relaxin [, ], insect prothoracicotropic hormone (bombyxin) [], insulin-like growth factors (IGF1 and IGF2) [, ], mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP) (gene INSL4), locust insulin-related peptide (LIRP), molluscan insulin-related peptides (MIP), and Caenorhabditis elegans insulin-like peptides. The 3D structures of a number of family members have been determined [, , ]. The fold comprises two polypeptide chains (A and B) linked by two disulphide bonds: all share a conserved arrangement of 4 cysteines in their A chain, the first of which is linked by a disulphide bond to the third, while the second and fourth are linked by interchain disulphide bonds to cysteines in the B chain. Insulin is found in many animals, and is involved in the regulation of normal glucose homeostasis. It also has other specific physiological effects, such as increasing the permeability of cells to monosaccharides, amino acids and fatty acids, and accelerating glycolysis and glycogen synthesis in the liver [ ]. Insulin exerts its effects by interaction with a cell-surface receptor, which may also result in the promotion of cell growth []. Insulin is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. The sequence of proinsulin contains 2 well-conserved regions (designated A and B), separated by an intervening connecting region (C), which is variable between species [ ]. The connecting region is cleaved, liberating the active protein, which contains the A and B chains, held together by 2 disulphide bonds []. Insulin-like Growth Factor Binding Proteins (IGFBP) are a group of vertebrate secreted proteins, which bind to IGF-I and IGF-II with high affinity and modulate the biological actions of IGFs.
The IGFBP family has six distinct subgroups, IGFBP-1 through 6, based on conservation of gene (intron-exon) organisation, structural similarity, and binding affinity for IGFs. Across species, IGFBP-5 exhibits the most sequence conservation, while IGFBP-6 exhibits the least sequence conservation. The IGFBPs contain inhibitor domain homologues, which are related to MEROPS protease inhibitor family I31 (equistatin, clan IX). All IGFBPs share a common domain architecture. While the N-terminal (IGF binding protein domain) and the C-terminal (thyroglobulin type-1 repeat) domains are conserved across vertebrate species, the mid-region is highly variable with respect to protease cleavage sites and phosphorylation and glycosylation sites. IGFBPs contain 16-18 conserved cysteines located in the N-terminal and the C-terminal regions, which form 8-9 disulphide bonds [ ]. As demonstrated for human IGFBP-5, the N terminus is the primary binding site for IGF. This region, comprising Val49, Tyr50, Pro62 and Lys68-Leu75, forms a hydrophobic patch on the surface of the protein [ ]. The C terminus is also required for high affinity IGF binding, as well as for binding to the extracellular matrix [] and for nuclear translocation [, ] of IGFBP-3 and -5. IGFBPs are unusually pleiotropic molecules. Like other binding proteins, IGFBP can prolong the half-life of IGFs via high affinity binding of the ligands. In addition to functioning as simple carrier proteins, serum IGFBPs also serve to regulate the endocrine and paracrine/autocrine actions of IGF by modulating the IGF available to bind to signalling IGF-I receptors [ , ]. Furthermore, IGFBPs can function as growth modulators independent of IGFs. For example, IGFBP-5 stimulates markers of bone formation in osteoblasts lacking functional IGFs [].
The binding of IGFBP to its putative receptor on the cell membrane may stimulate the signalling pathway independent of an IGF receptor, to mediate the effects of IGFBPs in certain target cell types. IGFBP-1 and -2, but not other IGFBPs, contain a C-terminal Arg-Gly-Asp integrin-binding motif. Thus, IGFBP-1 can also stimulate cell migration of CHO and human trophoblast cells through an action mediated by alpha 5 beta 1 integrin []. Finally, IGFBPs transported into the nucleus (via the nuclear localisation signal) may also exert IGF-independent effects by transcriptional activation of genes. This family represents IGFBP-2, which in general appears to inhibit IGF actions, in particular those of IGF-II. IGFBP-2 plays a crucial role in regulating cell proliferation. It can stimulate cell proliferation in an IGF-independent manner in certain cell types, but can also inhibit cell proliferation by suppressing the activities of IGFs [ ]. For example, IGFBP-2 exerts an inhibitory effect on non-malignant prostate cells but stimulates proliferation of prostate cancer cells in a MAP-kinase and androgen-modulated process []. IGFBP-2 has been demonstrated to associate with the cell surface (e.g. cell membranes of the rat olfactory bulb []) via proteoglycans in an RGD motif-independent manner.
Protein Domain
Name: Insulin-like growth factor binding protein 5
Type: Family
Description: The insulin family of proteins groups together several evolutionarily related active peptides [ ]: these include insulin [, ], relaxin [, ], insect prothoracicotropic hormone (bombyxin) [], insulin-like growth factors (IGF1 and IGF2) [, ], mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP) (gene INSL4), locust insulin-related peptide (LIRP), molluscan insulin-related peptides (MIP), and Caenorhabditis elegans insulin-like peptides. The 3D structures of a number of family members have been determined [, , ]. The fold comprises two polypeptide chains (A and B) linked by two disulphide bonds: all share a conserved arrangement of 4 cysteines in their A chain, the first of which is linked by a disulphide bond to the third, while the second and fourth are linked by interchain disulphide bonds to cysteines in the B chain. Insulin is found in many animals, and is involved in the regulation of normal glucose homeostasis. It also has other specific physiological effects, such as increasing the permeability of cells to monosaccharides, amino acids and fatty acids, and accelerating glycolysis and glycogen synthesis in the liver [ ]. Insulin exerts its effects by interaction with a cell-surface receptor, which may also result in the promotion of cell growth []. Insulin is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. The sequence of proinsulin contains 2 well-conserved regions (designated A and B), separated by an intervening connecting region (C), which is variable between species []. The connecting region is cleaved, liberating the active protein, which contains the A and B chains, held together by 2 disulphide bonds []. Insulin-like Growth Factor Binding Proteins (IGFBP) are a group of vertebrate secreted proteins, which bind to IGF-I and IGF-II with high affinity and modulate the biological actions of IGFs.
The IGFBP family has six distinct subgroups, IGFBP-1 through 6, based on conservation of gene (intron-exon) organisation, structural similarity, and binding affinity for IGFs. Across species, IGFBP-5 exhibits the most sequence conservation, while IGFBP-6 exhibits the least sequence conservation. The IGFBPs contain inhibitor domain homologues, which are related to MEROPS protease inhibitor family I31 (equistatin, clan IX). All IGFBPs share a common domain architecture. While the N-terminal (IGF binding protein domain) and the C-terminal (thyroglobulin type-1 repeat) domains are conserved across vertebrate species, the mid-region is highly variable with respect to protease cleavage sites and phosphorylation and glycosylation sites. IGFBPs contain 16-18 conserved cysteines located in the N-terminal and the C-terminal regions, which form 8-9 disulphide bonds [ ]. As demonstrated for human IGFBP-5, the N terminus is the primary binding site for IGF. This region, comprising Val49, Tyr50, Pro62 and Lys68-Leu75, forms a hydrophobic patch on the surface of the protein [ ]. The C terminus is also required for high affinity IGF binding, as well as for binding to the extracellular matrix [] and for nuclear translocation [, ] of IGFBP-3 and -5. IGFBPs are unusually pleiotropic molecules. Like other binding proteins, IGFBP can prolong the half-life of IGFs via high affinity binding of the ligands. In addition to functioning as simple carrier proteins, serum IGFBPs also serve to regulate the endocrine and paracrine/autocrine actions of IGF by modulating the IGF available to bind to signalling IGF-I receptors [ , ]. Furthermore, IGFBPs can function as growth modulators independent of IGFs. For example, IGFBP-5 stimulates markers of bone formation in osteoblasts lacking functional IGFs [].
The binding of IGFBP to its putative receptor on the cell membrane may stimulate the signalling pathway independent of an IGF receptor, to mediate the effects of IGFBPs in certain target cell types. IGFBP-1 and -2, but not other IGFBPs, contain a C-terminal Arg-Gly-Asp integrin-binding motif. Thus, IGFBP-1 can also stimulate cell migration of CHO and human trophoblast cells through an action mediated by alpha 5 beta 1 integrin []. Finally, IGFBPs transported into the nucleus (via the nuclear localisation signal) may also exert IGF-independent effects by transcriptional activation of genes. This family represents IGFBP-5, which is the most conserved IGFBP and an essential regulator of physiological processes in bone, kidney and mammary gland [ ]. IGFBP-5 can function independently of the IGFs; for example, acting as a growth factor stimulating markers of bone formation in osteoblasts lacking functional IGFs. IGFBP-5 contains a putative nuclear localization signal in the C-terminal region [], consistent with its translocation to the nucleus of human breast cancer cells []. The three-dimensional structure of the N-terminal IGF-binding domain of IGFBP-5, complexed with IGF-I, has been determined [].
Protein Domain
Name: Insulin-like growth factor binding protein 3
Type: Family
Description: The insulin family of proteins groups together several evolutionarily related active peptides [ ]: these include insulin [, ], relaxin [, ], insect prothoracicotropic hormone (bombyxin) [], insulin-like growth factors (IGF1 and IGF2) [, ], mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP) (gene INSL4), locust insulin-related peptide (LIRP), molluscan insulin-related peptides (MIP), and Caenorhabditis elegans insulin-like peptides. The 3D structures of a number of family members have been determined [, , ]. The fold comprises two polypeptide chains (A and B) linked by two disulphide bonds: all share a conserved arrangement of 4 cysteines in their A chain, the first of which is linked by a disulphide bond to the third, while the second and fourth are linked by interchain disulphide bonds to cysteines in the B chain. Insulin is found in many animals, and is involved in the regulation of normal glucose homeostasis. It also has other specific physiological effects, such as increasing the permeability of cells to monosaccharides, amino acids and fatty acids, and accelerating glycolysis and glycogen synthesis in the liver [ ]. Insulin exerts its effects by interaction with a cell-surface receptor, which may also result in the promotion of cell growth []. Insulin is synthesised as a prepropeptide from which an endoplasmic reticulum-targeting sequence is cleaved to yield proinsulin. The sequence of proinsulin contains 2 well-conserved regions (designated A and B), separated by an intervening connecting region (C), which is variable between species [ ]. The connecting region is cleaved, liberating the active protein, which contains the A and B chains, held together by 2 disulphide bonds []. Insulin-like Growth Factor Binding Proteins (IGFBP) are a group of vertebrate secreted proteins, which bind to IGF-I and IGF-II with high affinity and modulate the biological actions of IGFs.
The IGFBP family has six distinct subgroups, IGFBP-1 through 6, based on conservation of gene (intron-exon) organisation, structural similarity, and binding affinity for IGFs. Across species, IGFBP-5 exhibits the most sequence conservation, while IGFBP-6 exhibits the least sequence conservation. The IGFBPs contain inhibitor domain homologues, which are related to MEROPS protease inhibitor family I31 (equistatin, clan IX). All IGFBPs share a common domain architecture. While the N-terminal (IGF binding protein domain) and the C-terminal (thyroglobulin type-1 repeat) domains are conserved across vertebrate species, the mid-region is highly variable with respect to protease cleavage sites and phosphorylation and glycosylation sites. IGFBPs contain 16-18 conserved cysteines located in the N-terminal and the C-terminal regions, which form 8-9 disulphide bonds []. As demonstrated for human IGFBP-5, the N terminus is the primary binding site for IGF. This region, comprising Val49, Tyr50, Pro62 and Lys68-Leu75, forms a hydrophobic patch on the surface of the protein [ ]. The C terminus is also required for high affinity IGF binding, as well as for binding to the extracellular matrix [] and for nuclear translocation [, ] of IGFBP-3 and -5. IGFBPs are unusually pleiotropic molecules. Like other binding proteins, IGFBP can prolong the half-life of IGFs via high affinity binding of the ligands. In addition to functioning as simple carrier proteins, serum IGFBPs also serve to regulate the endocrine and paracrine/autocrine actions of IGF by modulating the IGF available to bind to signalling IGF-I receptors [ , ]. Furthermore, IGFBPs can function as growth modulators independent of IGFs. For example, IGFBP-5 stimulates markers of bone formation in osteoblasts lacking functional IGFs [].
The binding of IGFBP to its putative receptor on the cell membrane may stimulate the signalling pathway independent of an IGF receptor, to mediate the effects of IGFBPs in certain target cell types. IGFBP-1 and -2, but not other IGFBPs, contain a C-terminal Arg-Gly-Asp integrin-binding motif. Thus, IGFBP-1 can also stimulate cell migration of CHO and human trophoblast cells through an action mediated by alpha 5 beta 1 integrin []. Finally, IGFBPs transported into the nucleus (via the nuclear localisation signal) may also exert IGF-independent effects by transcriptional activation of genes. This family represents IGFBP-3, which is the major IGF-binding protein in serum and is expressed in many tissues, including normal and malignant breast epithelium [ ]. IGFBP-3 has a well-characterised role in modulating the mitogenic and anti-apoptotic effects of IGFs by regulating their access to the IGF-I receptor []. There is now accumulating evidence to suggest that IGFBP-3 may have intrinsic anti-proliferative and pro-apoptotic effects on the growth of human cancer cells. Similar to IGFBP-5, IGFBP-3 can accumulate in the nucleus via its C-terminal nuclear translocation signal [], where it may exert its growth-modulating effects. IGFBP-3, via its C-terminal end, can bind to the acid-labile subunit (ALS), which together with IGF forms the 150 kDa ternary complex found in serum. In addition, IGFBP-3 can function via a TGF-beta-related pathway, probably by binding to a type V TGF-beta receptor [].
Protein Domain
Name: Ribosomal protein L7/L12, oligomerisation domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. Ribosomal protein L7/12 consists of two domains that are connected by a flexible region. The N-terminal domain is required for dimer formation and for anchoring the protein to the ribosome by binding to ribosomal protein L10, while the C-terminal domain is required for binding translation factors [ ]. This entry represents the oligomerisation domain superfamily of the ribosomal protein L7/12. It has a multihelical structure with an intertwined tetramer topology.
Protein Domain
Name: Golgi-associated PDZ and coiled-coil motif-containing protein
Type: Family
Description: GOPC, also known as PIST or CAL, is primarily localized to the Golgi apparatus. It binds the G protein-coupled receptor beta1AR and modulates beta1AR intracellular trafficking [ ]. GOPC also interacts with cystic fibrosis transmembrane regulator (CFTR), retains CFTR in the cell and targets it for degradation [].
Protein Domain
Name: Bifunctional molybdenum cofactor biosynthesis protein MoaC/MogA
Type: Family
Description: This entry includes a group of molybdenum cofactor biosynthesis bifunctional proteins. The proteins are composed of 2 domains: a molybdenum cofactor biosynthesis protein C-like domain and a molybdenum cofactor biosynthesis protein B-like domain. Together with MoaA, they are involved in the conversion of 5'-GTP to cyclic pyranopterin monophosphate (cPMP or molybdopterin precursor Z).
Protein Domain
Name: Molybdenum cofactor biosynthesis protein B, proteobacteria
Type: Family
Description: MoaB is thought to be involved in molybdopterin biosynthesis, though its exact role is not known. Structural studies of this polypeptide suggest that it may play a role in substrate-shuttling during biosynthesis [ ]. MoaB was capable of binding GTP, and it was suggested that the putative active site could also bind precursor Z and/or molybdenum. Potential protein interaction domains were also found, implying that MoaB may play a transport and/or storage role in molybdopterin biosynthesis.
Protein Domain
Name: Partitioning defective protein 6, PB1 domain
Type: Domain
Description: PAR-6 is essential for asymmetric division of the Caenorhabditis elegans zygote and for cell polarization in the Drosophila embryo [ , , ]. Mammalian Par6 binds to Par3 and aPKC, as well as to the small GTPase Cdc42, to form a complex which is a major cell polarity regulator []. This complex has been implicated in the formation of tight junctions (TJ) in mammalian epithelial cells. Whereas only one par-6 gene exists in Drosophila and C. elegans, a family of four is present in mammals, designated as Par6A-D [], although no full-length clone of Par6D has yet been identified. Par6 isoforms A-C localize differently when expressed in MDCK epithelial cells and have distinct effects on TJ formation []. Par6 contains an N-terminal PB1 domain, a C-terminal PDZ domain and a semi-CRIB motif immediately preceding the PDZ domain. The PB1 domain of Par6 forms a heterodimer with the PB1 domain of aPKC [ ].
Protein Domain
Name: Defect at low temperature protein 1
Type: Family
Description: This family of proteins from fungi includes Dlt1 from Saccharomyces cerevisiae. It is a protein of unknown function that is required for growth under high-pressure and low-temperature conditions [ ].
Protein Domain
Name: Autophagy-related protein 2, middle RBG module
Type: Domain
Description: The Atg2 protein, an integral membrane protein, is required for a range of functions including the regulation of autophagy in conjunction with the Atg1-Atg13 complex [, , ]. It is a lipid transfer protein required for autophagosome completion and peroxisome degradation. It is a member of the RBG (repeating β-grooves) superfamily, together with VPS13, SHIP164, Csf1, and the Hob proteins, which all share the same structure consisting of long hydrophobic grooves made of multiple repeating modules of five β-sheets followed by a loop [, ]. This entry represents central RBG modules.
Protein Domain
Name: Attachment protein G3P, N-terminal domain superfamily
Type: Homologous_superfamily
Description: The G3P protein (also known as attachment protein or coat protein A) of filamentous phage such as M13, phage fd and phage f1, is an essential coat protein for the infection of Escherichia coli. The G3P protein consists of three domains: two N-terminal domains (N1 and N2) with a similar β-barrel fold, and a C-terminal domain [ ]. The N-terminal domains protrude from the phage surface, while the C-terminal domain acts as an anchor embedded in the phage coat, together forming a horseshoe-like structure []. The G3P protein exists as 3-5 copies at the tip of the phage particle. Infection by filamentous phage involves two distinct cellular receptors, the F' pilus and the periplasmic protein TolA, which are bound sequentially [ ]. The N2 domain binds the F' pilus, causing a conformational change which allows the N1 domain to bind the C-terminal domain of TolA as a co-receptor. This entry represents the two N-terminal domains, N1 and N2, of G3P.
Protein Domain
Name: Poliovirus core protein 3a, soluble domain
Type: Homologous_superfamily
Description: The 3A protein is found in positive-strand RNA viruses. It is a critical component of the poliovirus replication complex, and is also an inhibitor of host cell ER to Golgi transport. This superfamily represents the soluble domain of poliovirus core protein 3a.
Protein Domain
Name: WW domain-containing adapter protein with coiled-coil
Type: Family
Description: WAC interacts with RNF20/40 through its C-terminal coiled-coil region and promotes the E3 ligase activity of RNF20/40 for H2B ubiquitination [ ]. Drosophila WAC, also known as Wacky, promotes the interaction between TTT and Pontin/Reptin, thereby promoting mTORC1 activity by facilitating mTORC1 dimerization and mTORC1-Rag interaction [ ].
Protein Domain
Name: RNA-binding protein 27, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of human RNA-binding protein 27 (RBM27). Although the specific function of the RRM in RBM27 remains unclear, it shows high sequence similarity with RRM1 of RBM26, which functions as a cutaneous lymphoma (CL)-associated antigen [ ].
Protein Domain
Name: Protein elav, RNA recognition motif 1
Type: Domain
Description: This entry represents the RNA recognition motif 1 (RRM1) of Drosophila embryonic lethal abnormal vision (ELAV) protein.
Protein Domain
Name: Protein couch potato, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of Cpo, an RNA-binding protein encoded by the Drosophila couch potato (cpo) gene. The gene is so named because several partial loss-of-function alleles cause hypoactive behavior in adults. Cpo contains a well-conserved RRM. It may control the processing of RNA molecules required for the proper functioning of the peripheral nervous system (PNS) [ ].
Protein Domain
Name: La-related protein 6, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of LARP6. LARP6 (also known as acheron) is a member of the lupus antigen (La) family. LARP6 has been found to regulate the stability and translation of collagen mRNAs by binding to a stem-loop structure in the 5'UTR of type I and type III collagen mRNAs [ , ]. It can also bind to the developmental transcription factor CASK-C and form a complex with Id (inhibitor of differentiation) transcription factors []. It shuttles between the nucleus and cytoplasm and participates in the nuclear export of collagen mRNAs []. LARP6 is structurally related to the La autoantigen and contains a La motif (LAM), nuclear localization and export (NLS and NES) signals, and an RNA recognition motif (RRM) [ ].
Protein Domain
Name: RNA-binding protein 18, RNA recognition motif
Type: Domain
Description: This entry represents the RRM (RNA recognition motif) of RBM18, a putative RNA-binding protein containing a well-conserved RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain). The biological role of RBM18 remains unclear [ , ].
Protein Domain
Name: Nucleolar protein 8, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif of NOP8 (also known as NOP132). In humans, NOP132 associates with proteins involved in ribosome biogenesis and RNA metabolism, including the DEAD-box RNA helicase protein, DDX47 [ ]. Budding yeast NOP8 is required for 60S ribosomal subunit synthesis [].
Protein Domain
Name: Uncharacterised membrane protein Ta0354, soluble domain
Type: Domain
Description: This domain is found at the C-terminal of a family of archaeal proteins annotated as membrane proteins.
Protein Domain
Name: DNA-dependent protein kinase catalytic subunit, CC3
Type: Domain
Description: This domain represents a region of the Circular Cradle segment (CC) from DNA-PKcs that covers the complete CC3 and part of CC4. This domain contains the Ku-binding site A and the highly conserved region (HCR) II [ ]. DNA-dependent protein kinase catalytic subunit (DNA-PKcs) is involved in DNA nonhomologous end joining (NHEJ); it is recruited by the Ku70/80 heterodimer to DNA ends and is required for double-strand break (DSB) repair. DNA-PKcs phosphorylates a number of protein substrates, including the heat shock protein 90 (HSP90), the transcription factors p53, specificity protein 1 (Sp1) and MYC23, and a majority of NHEJ factors. It folds into three well-defined large structural units, consisting of an N-terminal region (arranged in four supersecondary α-helical structures, N1 to N4), the Circular Cradle (consisting of five supersecondary α-helical structures, CC1 to CC5), and the C-terminal Head (comprising FAT, FRB, kinase, and FATC). The N-terminal and CC regions resemble HEAT repeats, and thus they are also referred to as N-HEAT and M-HEAT ('middle'), respectively. The N-terminal region likely mediates DNA binding and, together with the CCs, forms a ring through which Ku70/80 may present DNA for repair. The CCs form a curved elliptical ring that serves as a scaffold to maintain the integrity of the whole complex. It has been suggested that the binding of Ku or DNA activates the allosteric mechanism required for communication between the N terminus and the CC with the kinase in the Head [, , , ].
Protein Domain
Name: Nitrogenase molybdenum-iron protein beta chain, N-terminal
Type: Domain
Description: The enzyme responsible for nitrogen fixation, the nitrogenase, shows a high degree of conservation of structure, function, and amino acid sequence across wide phylogenetic ranges. All known Mo-nitrogenases consist of two components: component I (also called dinitrogenase, or Fe-Mo protein), an alpha2beta2 tetramer encoded by the nifD and nifK genes, and component II (dinitrogenase reductase, or Fe protein), a homodimer encoded by the nifH gene [ , ], which has an Fe4S4 cluster bound between the subunits and two ATP-binding domains. The Fe protein supplies energy by ATP hydrolysis, and transfers electrons from reduced ferredoxin or flavodoxin to component I for the reduction of molecular nitrogen to ammonia [ , ]. Nitrogenase contains two unusual rare metal clusters; one of them is the iron molybdenum cofactor (FeMo-co), which is considered to be the site of dinitrogen reduction and whose biosynthesis requires the products of the nifNE operon and of some other nif genes []. It has been proposed that nifNE might serve as a scaffold upon which FeMo-co is built and then inserted into component I []. This entry represents the uncharacterised N-terminal domain of the molybdenum-iron protein beta chain, which is part of the nitrogenase complex that catalyses the key enzymatic reactions in nitrogen fixation.
Protein Domain
Name: Adaptor protein Cbl, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Cbl (Casitas B-lineage lymphoma) is an adaptor protein that functions as a negative regulator of many signalling pathways that start from receptors at the cell surface. The N-terminal region of Cbl contains a Cbl-type phosphotyrosine-binding (Cbl-PTB) domain, which is composed of three evolutionarily conserved domains: an N-terminal four-helix bundle (4H) domain, an EF hand-like calcium-binding domain, and a divergent SH2-like domain. The calcium-bound EF-hand wedges between the 4H and SH2 domains, and roughly determines their relative orientation. The Cbl-PTB domain has also been named Cbl N-terminal (Cbl-N) or tyrosine kinase binding (TKB) domain [ , ]. The N-terminal 4H domain contains four long α-helices. The C and D helices in this domain pack against the adjacent EF-hand-like domain, and a highly conserved loop connecting the A and B helices contacts the SH2-like domain. The EF-hand motif is similar to classical EF-hand proteins. The SH2-like domain retains the general helix-sheet-helix architecture of the SH2 fold, but lacks the secondary β-sheet, comprising β-strands D', E and F, and also a prominent BG loop []. This entry represents the N-terminal four-helical bundle domain superfamily. This domain can also be found in the N terminus of mammalian MLKL, a pseudokinase essential for TNF-alpha-induced necroptosis []. The four-helical bundle (NB) domain of MLKL is involved in oligomerization to facilitate plasma membrane targeting through the low-affinity binding of NB to phosphorylated inositol polar head groups of phosphatidylinositol phosphate (PIP) phospholipids [, ].
Protein Domain
Name: Integral membrane protein GPR155, DEP domain
Type: Domain
Description: GPR155-like proteins, also known as PGR22, contain an N-terminal permease domain, a central transmembrane region and a C-terminal DEP domain. They are orphan receptors of the class B G protein-coupled receptors [ ]. Their function is unknown.
Protein Domain
Name: Daunorubicin resistance ABC transporter membrane protein
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and are therefore constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain [ ]. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop, Walker A, and a magnesium binding site, Walker B.
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site [, , ]. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the α-helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis [ , , , , , ]. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions [ ]. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette [, ]. More than 50 subfamilies have been described based on a phylogenetic and functional classification [, , ]. This entry represents the Daunorubicin resistance ABC transporter membrane protein, which is associated with the efflux of the drug daunorubicin and is found in bacteria and archaea. It functions as an ATP-dependent antiporter.
In eukaryotes, proteins of similar function include P-glycoproteins and a multidrug resistance protein.
Protein Domain
Name: Polycystic kidney disease type 1 protein
Type: Family
Description: Polycystin-1 (PC1) plays a critical role in renal tubule diameter control. Mutations in the polycystin-1 gene cause cyst formation in human autosomal dominant polycystic kidney disease [ , ]. It may serve as a cell surface signalling receptor at cell-cell/cell-matrix junctions and as a mechano-sensor in renal primary cilia that activates signalling pathways involved in renal tubular differentiation []. Polycystin-1 contains an REJ (receptor for egg jelly) domain and a GPS (G protein-coupled receptor proteolytic site) domain in its N-terminal extracellular region (ectodomain). It can be cleaved into an N-terminal fragment (NTF) and a C-terminal fragment (CTF) at the GPS domain [ ]. The GPS cleavage may play an important role in the biological function of PC1 [].
Protein Domain
Name: Mitochondrial adapter protein MCP1, transmembrane domain
Type: Domain
Description: This entry represents the transmembrane domains found in Mitochondrial adapter protein MCP1 from yeast, a mitochondrial protein involved in mitochondrial lipid homeostasis [ ]. MCP1 recruits the lipid transfer protein Vps13 to mitochondria promoting vacuole-mitochondria contacts [, , ].
Protein Domain
Name: Tobacco mosaic virus-like, coat protein superfamily
Type: Homologous_superfamily
Description: This superfamily contains virus coat proteins. Examples include those from Tobacco mosaic virus (TMV), Cucumber green mottle mosaic virus and Ribgrass mosaic virus (RMV). In order to establish infections, viruses must be delivered to the cells of potential hosts and must then engage in activities that enable their genomes to be expressed and replicated. With most viruses, the events that precede the onset of production of progeny virus particles are referred to as the early events and, in the case of positive-strand RNA viruses, they include the initial interaction with and entry of host cells and the release (uncoating) of the genome from the virus particles. The uncoating process in TMV may involve the bidirectional release of coat protein subunits from the viral RNA, which may be mediated by cotranslational and coreplicational disassembly mechanisms [ ]. The TMV particle is assembled from its constituent coat protein and RNA by a complex process. The protein forms an obligatory intermediate (a cylindrical disk composed of two layers of protein units), which recognises a specific RNA hairpin sequence. This mechanism simultaneously fulfils the physical requirement for nucleating the growth of the helical particle and the biological requirement for specific recognition of the viral RNA [ ]. The TMV coat protein has a four-helical up-and-down bundle fold.
Protein Domain
Name: Ribosomal protein S3, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast [ ]. This entry is the C-terminal domain. This domain has two layers (alpha/beta) with antiparallel β-sheets.
Protein Domain
Name: Protein of unknown function DUF3865, CADD-like
Type: Family
Description: This entry represents a family of proteins of unknown function. The Nostoc punctiforme protein has been structurally characterised and adopts a heme oxygenase-like fold similar to that of the Chlamydia trachomatis death domain-binding CADD protein. The proposed active sites of the Nostoc and Chlamydia sequences are identical, suggesting similar functions.
Protein Domain
Name: Conserved hypothetical protein CHP02679, N terminus
Type: Domain
Description: This domain is found at the N terminus of bacterial conserved hypothetical proteins that are encoded within a conserved four-gene neighbourhood found sporadically in a phylogenetically broad range of bacteria, including Nocardia farcinica, Symbiobacterium thermophilum, and Streptomyces avermitilis (Actinobacteria), Geobacillus kaustophilus (Firmicutes), and Azoarcus sp. EbN1 and Ralstonia solanacearum (Beta-proteobacteria).
Protein Domain
Name: Thoeris protein ThsB, TIR-like domain superfamily
Type: Homologous_superfamily
Description: This is the TIR-like domain of ThsB proteins, which adopts a Rossmann-like fold [ ]. ThsB is responsible for recognizing phage infection []. Thoeris is a bacterial antiphage defense system that consists of two genes, thsA and thsB, and acts via NAD+ degradation [ , , ]. ThsA has robust NAD+ cleavage activity and a two-domain architecture containing an N-terminal NAD-binding domain (denoted as sirtuin-like or Macro) and a C-terminal SLOG-like domain. In some instances, such as in B. amyloliquefaciens, ThsA has an N-terminal transmembrane domain []. ThsB (also referred to as TIR1 and TIR2) is structurally similar to TIR domain proteins but lacks enzymatic activity.
Protein Domain
Name: Scaffold protein Nfu/NifU, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S] [ ]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems. In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N terminus, assembling an FeS cluster that is transferred to nitrogenase apoproteins [ ]. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen []. This domain is found at the N terminus of NifU (from the NIF system) and NifU-related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters, functioning as scaffolds [, ].
Protein Domain
Name: Hut operon regulatory protein HutP superfamily
Type: Homologous_superfamily
Description: The HutP protein regulates the expression of Bacillus 'hut' structural genes by an anti-termination complex, which recognises three UAG triplet units, separated by four non-conserved nucleotides on the RNA terminator region. L-Histidine and Mg2+ ions are also required. These proteins exhibit the structural elements of α/β proteins, arranged in the order α-α-β-α-α-β-β-β in the primary structure, and the four antiparallel β-strands form a β-sheet in the order β1-β2-β3-β4, with two α-helices each on the front (α1 and α2) and at the back (α3 and α4) of the β-sheet [ ].
Protein Domain
Name: Retroviral nucleocapsid Gag protein p24, N-terminal
Type: Domain
Description: The Gag protein from retroviruses, also known as p24, forms the inner protein layer of the nucleocapsid. It is composed of two domains, the N-terminal domain (NTD), which contributes to viral core formation, and the C-terminal domain (CTD), which is required for capsid dimerisation, Gag oligomerization and viral formation [ , ]. This protein performs highly complex orchestrated tasks during the assembly, budding, maturation and infection stages of the viral replication cycle. During viral assembly, the proteins form membrane associations and self-associations that ultimately result in budding of an immature virion from the infected cell. Gag precursors also function during viral assembly to selectively bind and package two plus strands of genomic RNA. ELISA testing for p24 is the most commonly used method to demonstrate virus replication both in vivo and in vitro [ , ]. This is the N-terminal domain of capsid protein p24 from retroviruses.
Protein Domain
Name: Equine herpesvirus protein of unknown function
Type: Family
Description: The IR5 open reading frame (ORF) of the Equid herpesvirus 1 (EHV-1) genome maps within the inverted repeat segments. Sequence analyses of the gene region revealed an ORF of 236 amino acids that showed a high degree of similarity to ORF64 of Human herpesvirus 3 (HHV-3) and ORF3 of Equid herpesvirus 4 (EHV-4), both of which map within the inverted repeats, and to the US10 ORF of Human herpesvirus 1 (HHV-1), which maps within the unique short segment. The IR5 ORF houses a sequence of 13 residues (CAYWCCLGHAFAC) that matches perfectly the consensus zinc finger motif (C-X2-4-C-X2-15-C/H-X2-4-C/H) []. Putative cis-acting elements flanking the IR5 ORF include a TATA box, a CAAT box, and a polyadenylation signal. Coupled with various experimental data, the IR5 gene of EHV-1 thus exhibits characteristics representative of a late gene of the gamma-1 class. The DNA sequence covering ~70% of the short unique region (Us) and part of the short inverted repeat of the Meleagrid herpesvirus 1 (MeHV-1) GA strain has been determined. Sequence analysis showed the presence of nine potential ORFs in the Us region, four of which were found to be similar to US10 (minor virion protein) [].
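The motif match described in the entry above can be checked mechanically. The following is a minimal sketch, assuming the consensus C-X2-4-C-X2-15-C/H-X2-4-C/H is read as a regular expression in which X is any residue and the numbers are allowed spacer lengths; the name ZINC_FINGER is introduced here for illustration only.

```python
import re

# Consensus from the description: C-X2-4-C-X2-15-C/H-X2-4-C/H
# X is taken to mean "any residue"; the ranges are spacer lengths.
# ZINC_FINGER is an illustrative name, not part of any database API.
ZINC_FINGER = re.compile(r"C.{2,4}C.{2,15}[CH].{2,4}[CH]")

# The 13-residue sequence quoted in the description for the IR5 ORF.
ir5_motif = "CAYWCCLGHAFAC"

# fullmatch requires the whole 13-residue string to fit the consensus.
match = ZINC_FINGER.fullmatch(ir5_motif)
print("matches consensus:", match is not None)
```

One way to read the match is C-AYW-C-CLG-H-AFA-C, that is, three spacers of three residues each.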
Protein Domain
Name: Death-associated protein kinase 1, catalytic domain
Type: Domain
Description: This entry represents the catalytic domain of Death-associated protein kinase 1 (DAPK1), which acts as a positive regulator of apoptosis [ , , , , ]. Loss of DAPK1 expression, usually because of DNA methylation, is implicated in many tumour types [ , , ]. DAPK1 is highly abundant in the brain and has also been associated with neurodegeneration [ ]. Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity []: serine/threonine-protein kinases; tyrosine-protein kinases; and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human [ ]. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation []. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [], leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases [].
Protein Domain
Name: Protein kinase Mps1 family, catalytic domain
Type: Domain
Description: The Mps1 family of protein kinases is a critical regulator of genetic stability [, ]. In yeast, Mps1 is required for the spindle checkpoint and SPB duplication [ ]. Human Mps1, also known as TTK, is required for proper chromosome alignment on the metaphase plate and for the fidelity of chromosome segregation, and might also have a role at centrosomes []. It is associated with cell proliferation and involved in mitotic cell cycle checkpoint control [, ]. In late mitosis, Mps1 is ubiquitinated by the APC/C and subsequently degraded []. This entry represents the catalytic domain of Mps1, which is in the C-terminal region [ ].
Protein Domain
Name: Rho-associated protein kinase 2, catalytic domain
Type: Domain
Description: Rho-associated protein kinases (ROCKs) were originally identified as small GTPase Rho effectors. Later, ROCKs were found to actively phosphorylate many actin-binding proteins and intermediate filament proteins to modulate their functions [ ]. Two ROCK isoforms have been identified: ROCK1 (ROKb, p160ROCK) and ROCK2. As major downstream effectors of the small GTPase RhoA, they regulate cellular contraction, motility, morphology, polarity, cell division, and gene expression [ , ]. This entry represents the catalytic domain of ROCK2. ROCK2 has a predominant role in vascular smooth muscle cell contractility [ ].
Protein Domain
Name: Nuclear receptor-binding protein 2, pseudokinase domain
Type: Domain
Description: This entry represents the pseudokinase domain found in nuclear receptor binding protein 2 (NRBP2). This domain shows similarity to protein kinases but lacks crucial residues for catalytic activity and/or ATP binding. NRBP2 may be involved in neural progenitor cell survival [ ].
Protein Domain
Name: Protein tyrosine phosphatase, receptor type, N-terminal
Type: Domain
Description: Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPase) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation [ , ]. The PTP superfamily can be divided into four subfamilies []: (1) pTyr-specific phosphatases; (2) dual specificity phosphatases (dTyr and dSer/dThr); (3) Cdc25 phosphatases (dTyr and/or dThr); (4) LMW (low molecular weight) phosphatases. Based on their cellular localisation, PTPases are also classified as receptor-like, which are transmembrane receptors that contain PTPase domains [ ], or non-receptor (intracellular) PTPases [ ]. All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel β-sheet with flanking α-helices containing a β-loop-α-loop that encompasses the PTP signature motif [ ]. Functional diversity between PTPases is endowed by regulatory domains and subunits. This entry represents a domain found in various protein tyrosine phosphatase haematopoietic receptors, e.g. CD45, which dephosphorylate growth stimulating proteins. The domain is found in eukaryotes, and is approximately 30 amino acids in length. There is a single completely conserved residue L that may be functionally important.
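The C(X)5R signature mentioned in the entry above can be located in a sequence with a short pattern scan. This is a minimal sketch, assuming X stands for any residue; the function name find_ptp_signature and the toy sequence are made up for illustration and are not taken from the database.

```python
import re

# PTP signature motif from the description: C(X)5R,
# i.e. a Cys, any five residues, then an Arg.
PTP_SIGNATURE = re.compile(r"C.{5}R")

def find_ptp_signature(seq):
    """Return (start index, matched residues) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in PTP_SIGNATURE.finditer(seq)]

# Toy sequence invented for this example (not a database record).
toy_seq = "MKVHCSAGIGRSGTF"
print(find_ptp_signature(toy_seq))
```

A hit only shows that the residue spacing is compatible with the signature; it does not by itself establish phosphatase activity.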
Protein Domain
Name: Molybdenum cofactor biosynthesis protein F, N-terminal
Type: Domain
Description: This entry represents the N-terminal domain of MoaF. Molybdenum cofactor biosynthesis protein F (MoaF) is essential for the production of the monoamine-inducible 30 kDa protein in Klebsiella [ ]. It is necessary for reconstituting organoautotrophic growth in Ralstonia eutropha []. MoaF is conserved in proteobacteria and some lower eukaryotes. The operon regulating the Moa genes is responsible for molybdenum cofactor biosynthesis.
Protein Domain
Name: Hepatitis C virus, Non-structural protein NS4a
Type: Domain
Description: NS4a (non-structural protein) forms an integral part of the NS3 serine protease in Hepatitis C virus, as it is required in a number of cases as a cofactor of cleavage [, ]. It has also been reported that NS4a interacts with NS4b and NS3 to form a multi-subunit replicase complex [].
Protein Domain
Name: Stage V sporulation protein G superfamily
Type: Homologous_superfamily
Description: This entry represents the stage V sporulation protein G (SpoVG) superfamily. It is essential for sporulation and specific to stage V sporulation in Bacillus megaterium and Bacillus subtilis [ ]. In B. subtilis, expression decreases after 30-60 minutes of cold shock []. Structurally, SpoVG comprises a coiled antiparallel β-sheet packed against the C-terminal helix. In addition, the extensions of strands 4 and 5 form an isolated β-hairpin.
Protein Domain
Name: Holliday junction regulator protein family C-terminal
Type: Domain
Description: Although this family is conserved in the Holliday junction regulator (HJURP) proteins in higher eukaryotes, alongside an Scm3 family, its exact function is not known. The C-terminal region of Scm3 proteins has been evolving rapidly, and this short repeat at the C-terminal end can be present in up to two copies in the higher eukaryotes.
Protein Domain
Name: GTP cyclohydrolase I, feedback regulatory protein
Type: Family
Description: GTP cyclohydrolase I feedback regulatory protein (GFRP) in mammals helps regulate the biosynthesis of tetrahydrobiopterin through the feedback inhibition of the rate-limiting enzyme GTP cyclohydrolase I (GTPCHI). Tetrahydrobiopterin is the cofactor required for the hydroxylation of aromatic amino acids. The crystal structure of GFRP reveals that the protein forms a homopentamer [ ]. In the presence of phenylalanine, the stimulatory complex consists of a GTPCHI decamer sandwiched by two GFRP pentamers, which is thought to enhance GTPCHI activity by locking the enzyme in the active state []. The structure of GFRP consists of two α/β layers arranged β(2)-α-β(2)-α-β(2), with antiparallel β-sheets in the order 342165.
Protein Domain
Name: GTP binding protein 1-like, GTP-binding domain
Type: Domain
Description: This GTP-binding domain is found in mammalian GTP binding protein 1 (GTPBP1), GTPBP2, and nematode homologues AGP-1 and CGP-1 [ ]. Even though GTPBP1 has a GTP-binding domain, it has not been reported to have GTPase activity. GTPBP1 is thought to modulate mRNA decay by destabilizing the interactions between the RBPs and the cytoplasmic exosome using a GTP-binding-dependent mechanism [ ].
Protein Domain
Name: Ribosomal protein L9, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [, ]. Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities [ ]. The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker [ ]. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an α-helix and a three-stranded mixed parallel, antiparallel β-sheet packed against the central α-helix. The long central α-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.
Protein Domain
Name: CRISPR locus-related putative DNA-binding protein Csa3
Type: Family
Description: Most but not all examples of this family are associated with CRISPR loci, a combination of DNA repeats and characteristic proteins encoded near the repeat cluster. The C-terminal region of this protein is homologous to DNA-binding helix-turn-helix domains with predicted transcriptional regulatory activity [ ].
Protein Domain
Name: Ribosomal protein L11/L12, N-terminal domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA and plays a significant role during initiation, elongation, and termination of protein synthesis. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [ ], groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. L11 consists of a 23S rRNA binding C-terminal domain and an N-terminal domain that directly contacts protein synthesis factors. These two domains are joined by a flexible linker that allows inter-domain movement during protein synthesis. While the C-terminal domain of L11 binds RNA tightly, the N-terminal domain makes only limited contacts with RNA and is proposed to function as a switch that reversibly associates with an adjacent region of RNA [, , , ]. In E. coli, the C-terminal half of L11 has been shown [] to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure. Structurally, the ribosomal protein L11/L12 N-terminal domain has a β-α(2)-β(2) fold arranged into two layers (α/β) with an antiparallel β-sheet.
Protein Domain
Name: Ribosomal protein L6, alpha-beta domain superfamily
Type: Homologous_superfamily
Description: Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites [ , ]. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits. Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis.
In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome [ , ]. L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 contains two domains with almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N terminus being involved in protein-protein interactions and the C terminus containing possible RNA-binding sites [ ]. This α-β domain, found duplicated in ribosomal L6 proteins, consists of two β-sheets and one α-helix packed around a single core [ ].
Protein Domain
Name: Intraflagellar transport protein 27 homologue, eukaryotes
Type: Family
Description: IFT27, also known as RabL4 (Rab-like4), is a small GTPase-like component of the intraflagellar transport (IFT) complex B that promotes the exit of the BBSome complex from cilia via its interaction with ARL6 [ ]. It is capable of forming a heterodimer with IFT25 (HSPB11) [, ] and is involved in the hedgehog (Hh) signaling pathway [].
Protein Domain
Name: Ribosomal protein L11, C-terminal domain superfamily
Type: Homologous_superfamily
Description: Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA and plays a significant role during initiation, elongation, and termination of protein synthesis. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities [ ], groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. L11 consists of a 23S rRNA binding C-terminal domain and an N-terminal domain that directly contacts protein synthesis factors. These two domains are joined by a flexible linker that allows inter-domain movement during protein synthesis. While the C-terminal domain of L11 binds RNA tightly, the N-terminal domain makes only limited contacts with RNA and is proposed to function as a switch that reversibly associates with an adjacent region of RNA [, , , ]. In E. coli, the C-terminal half of L11 has been shown [] to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure. This entry represents the C-terminal domain of L11/L12. The domain consists of a three-helical bundle and a short parallel two-stranded β-ribbon, with an overall α3-β4-α4-α5-β5 topology. All five secondary structure elements contribute to a conserved hydrophobic core. The domain is characterised by two extended loops that are disordered in the absence of the RNA but have defined structures in the complex [].
Protein Domain
Name: PR domain zinc finger protein 2
Type: Family
Description: PRDM2 (also known as RIZ) is a transcriptional regulator and tumour suppressor that catalyses methylation of lysine 9 of histone H3. Its PR domain is responsible for its catalytic activity []. PRDM2 belongs to the PRDM family, whose members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. PRDM2 interacts with another tumour suppressor, the retinoblastoma protein (Rb). A short motif, IRCDE, in the acidic region (AR) of RIZ is important for this interaction [].
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom