Search results 38301 to 38400 out of 38750 for *

Category restricted to ProteinDomain

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Name: Protein of unknown function DUF6295
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001922) is described by MIBiG as an example of the following biosynthetic class, polyketide, in particular the isoindolinomycin biosynthetic gene cluster from Streptomyces sp. SoC090715LN-16. This family appears to be predominantly found in bacteria.
Protein Domain
Name: Laccase, second cupredoxin domain
Type: Domain
Description: Laccase is a blue multi-copper enzyme that catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances, coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. In cotton (Gossypium spp.), laccases may be involved in fibre development. Like other related multicopper oxidases (MCOs), laccase is composed of three cupredoxin domains that include one mononuclear and one trinuclear copper centre. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.
Protein Domain
Name: Domain of unknown function DUF6294
Type: Domain
Description: This BGC (BGC0000081) is described by MIBiG as an example of the following biosynthetic class, polyketide, in particular the kedarcidin biosynthetic gene cluster from Streptoalloteichus sp. ATCC 53650CC. This domain family appears to be predominantly found in bacteria.
Protein Domain
Name: Cytosolic carboxypeptidase-like protein 5 catalytic domain
Type: Domain
Description: This entry contains the M14 carboxypeptidase-like domain of cytosolic carboxypeptidase-like protein 5 (CCP5, ATP/GTP binding protein-like 5 or AGBL-5; MEROPS identifier M14.025), and related proteins. CCP5 is part of the cytosolic carboxypeptidase (CCP) family, which also includes enzymes CCP1/Nna1, CCP4, and CCP6, and belongs to subfamily M14B of peptidase family M14. CCP5 removes alpha- and gamma-linked glutamates from tubulin.
Protein Domain
Name: Domain of unknown function DUF6293
Type: Domain
Description: This domain is predominantly found in archaeal proteins and is functionally uncharacterised. It has a conserved sequence motif, HxxPxxG, and a conserved asparagine residue.
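A motif written as HxxPxxG can be read as a simple pattern: histidine, any two residues, proline, any two residues, glycine. The following is a minimal sketch of scanning a protein sequence for that pattern, assuming one-letter amino acid codes; the function name and the example sequence are hypothetical and are not taken from the DUF6293 entry.

```python
import re

# HxxPxxG: H, two arbitrary residues, P, two arbitrary residues, G.
# The lookahead lets overlapping occurrences be reported as well.
HXXPXXG = re.compile(r"(?=(H..P..G))")

def find_hxxpxxg(protein):
    """Return (0-based start, matched residues) for each HxxPxxG occurrence."""
    protein = protein.upper()
    return [(m.start(), m.group(1)) for m in HXXPXXG.finditer(protein)]

# Hypothetical sequence used only to illustrate the scan.
print(find_hxxpxxg("MAHKLPQRGT"))
```

A pattern match of this kind only flags candidate positions; it does not by itself indicate that a protein belongs to the domain family.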
Protein Domain
Name: Ephrin type-A receptor 8, ligand binding domain
Type: Domain
Description: This entry represents the ligand-binding domain found in ephrin type-A receptor 8 (EphA8), also known as EEK. EphA8 has been suggested to play a role in axonal pathfinding during nervous system development in mammals, as well as being implicated in cancer. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).
Protein Domain
Name: Laccase, first cupredoxin domain
Type: Domain
Description: Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances, coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. In cotton (Gossypium spp.), laccases may be involved in fibre development. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.
Protein Domain
Name: Laccase, third cupredoxin domain
Type: Domain
Description: Laccase is a blue multicopper oxidase (MCO) which catalyzes the oxidation of a variety of aromatic (notably phenolic) and inorganic substances, coupled to the reduction of molecular oxygen to water. Laccase has been implicated in a wide spectrum of biological activities and, in particular, plays a key role in morphogenesis, development and lignin metabolism. Plants usually express multiple laccase genes, but their precise physiological/biochemical roles remain largely unclear. In cotton (Gossypium spp.), laccases may be involved in fibre development. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.
Protein Domain
Name: Bacteriocin biosynthesis, cyclodehydratase domain
Type: Domain
Description: A subset of microcins has recently been described in which the side chains of cysteine, serine and threonine from a ribosomally produced precursor undergo heterocyclization to generate a product with thiazole or (methyl)oxazole moieties. These small bacteriocins are called TOMMs (thiazole/oxazole-modified microcins). A trimeric complex is responsible for the formation of these heterocycle-containing metabolites, consisting of a zinc-tetrathiolate containing cyclodehydratase, a flavin mononucleotide-dependent dehydrogenase and a docking scaffold protein. This entry represents a ThiF-like domain of a fusion protein found in clusters associated with the production of TOMMs. This domain is thought to act as a cyclodehydratase, as do members of the SagC family.
Protein Domain
Name: Conserved hypothetical protein CHP03858, luciferase-like monooxygenase, putative
Type: Family
Description: This entry represents a related group of proteins of unknown function within the luciferase-like monooxygenase (LLM) superfamily. As most proteins in this entry are from species incapable of synthesising coenzyme F420, they are likely to use FMN as a cofactor.
Protein Domain
Name: Integrating conjugative element protein, PFL4693
Type: Family
Description: Members of this protein family, such as PFL_4693 from Pseudomonas fluorescens Pf-5, belong to extended genomic regions that appear to be spread by conjugative transfer. Most members have a predicted N-terminal signal sequence. The function is unknown.
Protein Domain
Name: Conserved hypothetical protein CHP03843
Type: Family
Description: This HMM represents a protein family largely restricted to the Actinobacteria (high-GC Gram-positives), although it is also found in the Chloroflexi. Some members show distant similarity to phosphatidylinositol 3- and 4-kinases.
Protein Domain
Name: Ribosomal protein L12, archaea
Type: Family
Description: This entry represents the L12 protein of the large (50S) subunit of the archaeal ribosome. Archaeal L12 is functionally equivalent to L7/L12 in bacteria and the P1 and P2 proteins in eukaryotes. L12 is homologous to P1 and P2 but is not homologous to bacterial L7/L12. It is located in the L12 stalk, with proteins L10, L11, and 23S rRNA. In several mesophilic and thermophilic archaeal species, the binding of 23S rRNA to protein L11 and to the L10/L12p pentameric complex was found to be temperature-dependent and cooperative.
Protein Domain
Name: Lantibiotic protection ABC transporter permease subunit, MutG family
Type: Family
Description: This entry includes the lantibiotic ABC transporter permease subunit MutG, a highly hydrophobic integral membrane protein that is part of the bacitracin ABC transport system. This system confers resistance, specifically to the lantibiotic mutacin, on the Gram-positive bacteria in which it operates. The protein transports mutacin to the surface and expels it from the membrane. Lantibiotics are small peptides, produced by Gram-positive bacteria, which are ribosomally synthesized as pre-peptides and act by disrupting membrane integrity. Genes encoding the lantibiotic ABC transporter subunits are highly organized in operons containing all the genes required for maturation, transport, immunity, and synthesis. This protein family is largely restricted to gallidermin-family lantibiotic cassettes, but also includes orphan transporter cassettes in species that lack candidate lantibiotic precursor and synthetase genes. Members of this group of proteins are predominantly found in Firmicutes and some species of Actinobacteria.
Protein Domain
Name: CRISPR-associated protein, CsaX
Type: Family
Description: The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major classes. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in system III the Cas6 protein acts alone, in some class I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry comprises a minor CRISPR-associated protein.
So far, members are only found in the context of the (strictly archaeal) Apern subtype of CRISPR/Cas system, and are further restricted to the Sulfolobales, including Metallosphaera sedula DSM 5348 and multiple species of the genus Sulfolobus.
Protein Domain
Name: Proteasome, alpha subunit, bacterial
Type: Family
Description: The proteasome (or macropain) is a multicatalytic proteinase complex in eukaryotes and archaea, and in some bacteria, that is involved in an ATP/ubiquitin-dependent non-lysosomal proteolytic pathway. In eukaryotes the 20S proteasome is composed of 28 distinct subunits which form a highly ordered ring-shaped structure (20S ring) of about 700 kDa. Proteasome subunits can be classified on the basis of sequence similarities into two groups, alpha (A) and beta (B). The proteasome consists of four stacked rings composed of alpha/beta/beta/alpha subunits. There are seven different alpha subunits and seven different beta subunits. Three of the seven beta subunits are peptidases, each with a different specificity. Subunit beta1c (MEROPS identifier T01.010) has a preference for cleaving glutaminyl bonds ("peptidyl-glutamyl-like" or "caspase-like"), subunit beta2c (MEROPS identifier T01.011) has a preference for cleaving arginyl and lysyl bonds ("trypsin-like"), and subunit beta5c (MEROPS identifier T01.012) cleaves after hydrophobic amino acids ("chymotrypsin-like"). The proteasome subunits are related to N-terminal nucleophile hydrolases, and the catalytic subunits have an N-terminal threonine nucleophile. Members of this entry are the alpha subunit of the 20S proteasome as found in Actinobacteria such as Mycobacterium, Rhodococcus, and Streptomyces. In most Actinobacteria (an exception is Propionibacterium acnes), the proteasome is accompanied by a system of tagging proteins for degradation with Pup.
Protein Domain
Name: Transcription regulator, NtcA
Type: Family
Description: Proteins of this entry, found in the cyanobacteria, are the global nitrogen regulator NtcA. This DNA-binding transcriptional regulator is required for expressing many different ammonia-repressible genes. The consensus NtcA-binding site is GTA(N8)TAC.
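The consensus GTA(N8)TAC reads as GTA, eight unspecified bases, then TAC. As a minimal sketch, assuming plain A/C/G/T DNA strings, such a consensus can be turned into a regular expression and used to list candidate sites; the function name and the example sequence are hypothetical. Because the reverse complement of GTA is TAC, the consensus is its own reverse complement, so scanning one strand also covers the other.

```python
import re

# GTA(N8)TAC: GTA, exactly eight arbitrary bases, then TAC.
# The lookahead allows overlapping candidate sites to be reported.
NTCA_SITE = re.compile(r"(?=(GTA[ACGT]{8}TAC))")

def find_ntca_sites(dna):
    """Return (0-based start, matched site) for each consensus match."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in NTCA_SITE.finditer(dna)]

# Hypothetical sequence used only to illustrate the scan.
print(find_ntca_sites("ccGTAacgtacgtTACgg"))
```

Matches to a short consensus like this are expected to occur by chance in any genome, so they are best treated as candidates rather than confirmed binding sites.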
Protein Domain
Name: Conjugative transposon, TraN
Type: Family
Description: This entry represents the TraN protein, which is encoded by transfer region genes of conjugative transposons of Bacteroides. This protein is related to conjugative transfer proteins VirB9 and TrbG of Agrobacterium Ti plasmids.
Protein Domain
Name: Domain of unknown function DUF6292
Type: Domain
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001203) is described by MIBiG as an example of the following biosynthetic classes, NRP (non-ribosomal peptide) and polyketide, in particular the clarepoxcin biosynthetic gene cluster from uncultured bacterium AR_456. This domain family appears to be predominantly found in Actinobacteria.
Protein Domain
Name: Domain of unknown function DUF6291
Type: Domain
Description: This domain, predominantly found in bacterial and viral proteins, is functionally uncharacterised. It has two conserved residues, a leucine and a tyrosine.
Protein Domain
Name: Protein of unknown function DUF6290
Type: Family
Description: This family of proteins is functionally uncharacterised and is predominantly found in bacteria. Proteins in this family presumably contain a ribbon-helix-helix DNA-binding motif.
Protein Domain
Name: Protein of unknown function DUF6289
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0000385) is described by MIBiG as an example of the following biosynthetic class, NRP (non-ribosomal peptide), in particular the lysobactin biosynthetic gene cluster from Lysobacter sp. ATCC 53042.
Protein Domain
Name: Protein of unknown function DUF6288
Type: Family
Description: This family of bacterial proteins is functionally uncharacterised. Proteins in this family are approximately 800 amino acids in length and they presumably contain PDZ domains.
Protein Domain
Name: Ephrin type-A receptor 4, ligand binding domain
Type: Domain
Description: This entry represents the ligand-binding domain found in ephrin type-A receptor 4 (EphA4). A loss of EphA4, as well as EphB2, precedes memory decline in a murine model of Alzheimer's disease. EphA4 has been shown to have a negative effect on axon regeneration and functional restoration in corticospinal lesions and has been implicated in circadian sleep regulation. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).
Protein Domain
Name: Domain of unknown function DUF6287
Type: Domain
Description: This presumed domain, found in bacteria, is functionally uncharacterised. It is around 50 amino acids in length and contains a conserved GTW sequence motif.
Protein Domain
Name: Ascorbate oxidase homologue, second cupredoxin domain
Type: Domain
Description: The proteins in this subfamily share homology to ascorbate oxidase and other members of the blue copper oxidase family. Expression of protein NTP303 from Nicotiana tabacum is detected during germination and pollen tube growth. Ascorbate oxidase is a member of the multicopper oxidase (MCO) family that couples oxidation of substrates with reduction of dioxygen to water. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper centre. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 2 of 3-domain MCOs has lost the ability to bind copper.
Protein Domain
Name: Domain of unknown function DUF6286
Type: Domain
Description: This domain family is found in bacterial proteins and is functionally uncharacterised. It has a conserved GV motif.
Protein Domain
Name: Domain of unknown function DUF6285
Type: Domain
Description: This domain family is predominantly found in bacterial proteins and is functionally uncharacterised. In some members of this family, thought to be aminoglycoside phosphotransferases, it is located at the C terminus.
Protein Domain
Name: Ascorbate oxidase homologue, first cupredoxin domain
Type: Domain
Description: The proteins in this subfamily share homology to ascorbate oxidase and other members of the blue copper oxidase family. Expression of protein NTP303 from Nicotiana tabacum is detected during germination and pollen tube growth. Ascorbate oxidase is a member of the multicopper oxidase (MCO) family that couples oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 1 of 3-domain MCOs contains part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. This subfamily does not harbour trinuclear copper binding histidines.
Protein Domain
Name: Protein of unknown function DUF6284
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0000700) is described by MIBiG as an example of the following biosynthetic class, saccharide, in particular the istamycin biosynthetic gene cluster from Streptomyces tenjimariensis. This family appears to be predominantly found in Actinobacteria.
Protein Domain
Name: Protein of unknown function DUF6283
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001735) is described by MIBiG as an example of the following biosynthetic class, other (unspecified), in particular the pentostatine biosynthetic gene cluster from Streptomyces antibioticus. This family appears to be predominantly found in bacteria.
Protein Domain
Name: Ascorbate oxidase homologue, third cupredoxin domain
Type: Domain
Description: The proteins in this subfamily share homology to ascorbate oxidase and other members of the blue copper oxidase family. Expression of protein NTP303 from Nicotiana tabacum is detected during germination and pollen tube growth. Ascorbate oxidase is a member of the multicopper oxidase (MCO) family that couples oxidation of substrates with reduction of dioxygen to water. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3. This subfamily does not harbour T1 copper or trinuclear copper binding sites.
Protein Domain
Name: Gar2, RNA recognition motif 2
Type: Domain
Description: This entry represents the RNA recognition motif 2 (RRM2) of yeast protein Gar2, a novel nucleolar protein required for 18S rRNA and 40S ribosomal subunit accumulation. It shares a similar domain architecture with nucleolin from vertebrates and Nsr1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of Gar2 is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of Gar2 contains two closely adjacent RNA recognition motifs (RRMs). The C-terminal RGG (or GAR) domain of Gar2 is rich in glycine, arginine and phenylalanine residues.
Protein Domain
Name: Ephrin type-A receptor 5, ligand binding domain
Type: Domain
Description: This entry represents the ligand-binding domain found in ephrin type-A receptor 5 (EphA5), also known as brain-specific kinase (Bsk). EphA5 is almost exclusively expressed in the nervous system, and is thought to play a role in synaptogenesis, and in dorsoventral organization of the spinal cord. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).
Protein Domain
Name: Copper resistance protein, third cupredoxin domain
Type: Domain
Description: CopA is a multicopper oxidase (MCO) related to laccase and L-ascorbate oxidase, both copper-containing enzymes. Mutation of CopA causes a loss of function, including copper tolerance and oxidase activity, and copA transcription is inducible in the presence of copper. Although MCOs have diverse functions, the majority of them have three cupredoxin domain repeats that include one mononuclear and one trinuclear copper centre. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.
Protein Domain
Name: ParB-related, ThiF-related cassette, protein B
Type: Family
Description: This entry contains a novel genetic system characterised by six major proteins, including a ParB homologue and a ThiF homologue. It is designated PRTRC, or ParB-Related, ThiF-Related Cassette. This protein family is designated protein B.
Protein Domain
Name: Prokaryotic E2 family D
Type: Family
Description: This family is part of the E2/UBC superfamily of proteins found in several bacteria. Members of this family lack the conserved histidine of the classical E2-fold. However, they have an absolutely conserved histidine carboxyl-terminal to the conserved cysteine. Members of this family are usually present in a conserved gene neighbourhood with genes encoding members of the Ub modification pathway such as the E1, Ub and JAB proteins. These neighbourhoods also contain a gene encoding a rapidly diverging α-helical protein.
Protein Domain
Name: ATP-dependent protease, HslV subunit
Type: Family
Description: ATP-dependent protease complexes are present in all three kingdoms of life, where they rid the cell of misfolded or damaged proteins and control the level of certain regulatory proteins. They include the proteasome in Eukaryotes, Archaea, and Actinomycetales and the HslVU (ClpQY, clpXP) complex in other eubacteria. Genes homologous to eubacterial HslU (ClpY, clpX) have also been demonstrated to be present in the genome of trypanosomatid protozoa. The prokaryotic ATP-dependent proteasome is coded for by the heat-shock locus VU (HslVU). It consists of HslV, a peptidase, and HslU, the ATPase and chaperone belonging to the AAA/Clp/Hsp100 family. The crystal structure of Thermotoga maritima HslV has been determined to 2.1 Å resolution. The structure of the dodecameric enzyme is well conserved compared to those from Escherichia coli and Haemophilus influenzae. This entry represents the HslV peptidase subunit of the ATP-dependent HslVU protease complex found in bacteria and some lower eukaryotes.
Protein Domain
Name: ParB-related, ThiF-related cassette, protein F
Type: Family
Description: This entry contains a novel genetic system characterised by seven (usually) major proteins including a ParB homologue and a ThiF homologue. It is commonly found on plasmids or in bacterial chromosomal regions near phage, plasmid or transposon markers. It is most common among the beta Proteobacteria. The system has been named PRTRC, or ParB-Related, ThiF-Related Cassette. This family is designated protein F, and it is the most divergent of the families.
Protein Domain
Name: ABC transporter, F420-0 import, ATP-binding protein, predicted
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B.
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attacking water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This entry represents a small clade of ABC-type transporter ATP-binding protein components encoded as part of a three gene cassette along with a periplasmic substrate-binding protein and a permease.
The organisms containing this cassette are all Actinobacteria and contain numerous proteins requiring the coenzyme F420. The model in this entry was defined based on five such organisms, four of which are lacking all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE: ). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with a F420-dependent glucose-6-phosphate dehydrogenase ( ). Based on these observations this ATP-binding protein is predicted to be a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.
Protein Domain
Name: Conserved hypothetical protein CHP03879, regulatory domain, putative
Type: Domain
Description: This entry represents a domain shared by two different protein families of unknown function. These proteins are regularly encoded next to their corresponding putative partner family, a probable regulatory protein with homology to KaiC. By implication, therefore, proteins in this entry may also be involved in sensory transduction and/or regulation.
Protein Domain
Name: VPDSG-CTERM protein sorting domain
Type: Domain
Description: The PEP-CTERM/exosortase system has been previously identified through in silico analysis. This entry describes a PEP-CTERM-like variant C-terminal protein sorting signal, as found at the C terminus of twenty otherwise unrelated proteins in Verrucomicrobiae bacterium DG1235. The variant motif, VPDSG, seems to be an intermediate between the VPEP motif of typical exosortase systems and the classical LPXTG of sortase in Gram-positive bacteria.
Protein Domain
Name: ABC transporter, F420-0 import, periplasmic substrate-binding protein, predicted
Type: Family
Description: ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP-binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore consist of four domains: two ABC modules and two TMDs. ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain. The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop (Walker A) and a magnesium-binding site (Walker B). 
Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region, which contains a histidine loop postulated to polarise the attacking water molecule for hydrolysis; the signature conserved motif (LSGGQ) specific to the ABC transporter; and the Q-motif, or Q-loop (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site. The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly β-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel β-sheet of armI by a two-fold axis. The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification. This entry represents a small clade of ABC-type transporter periplasmic substrate-binding proteins encoded as part of a three-gene cassette along with a permease and an ATPase. 
The organisms containing this cassette are all Actinobacteria and contain numerous proteins requiring the coenzyme F420. The model in this entry was defined based on five such organisms, four of which lack all F420 biosynthetic capability save the final side-chain polyglutamate attachment step (via the gene cofE). In Jonesia denitrificans DSM 20603 and marine actinobacterium PHSC20C1 this cassette is in an apparent operon with the cofE gene and, in PHSC20C1, also with an F420-dependent glucose-6-phosphate dehydrogenase. Based on these observations, this periplasmic substrate-binding protein is predicted to be a component of an F420-0 (that is, F420 lacking only the polyglutamate tail) transporter.
Protein Domain
Name: ParB-related, ThiF-related cassette, protein C
Type: Domain
Description: This entry represents a novel genetic system, designated PRTRC (ParB-Related, ThiF-Related Cassette), which is characterised by six major proteins that include ParB and ThiF homologues. The PRTRC system protein C is often found on plasmids.
Protein Domain
Name: Protein of unknown function DUF6281
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0000703) is described by MIBiG as an example of the following biosynthetic class, saccharide, in particular the kanamycin biosynthetic gene cluster from Streptomyces kanamyceticus.
Protein Domain
Name: Protein of unknown function DUF6280
Type: Family
Description: This family of proteins is functionally uncharacterised and found in Alphaproteobacteria.
Protein Domain
Name: Protein of unknown function DUF6278
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001590) is described by MIBiG as an example of the following biosynthetic class, polyketide, in particular the formicamycins A-M biosynthetic gene cluster from Streptomyces sp. KY5. This family appears to be predominantly found in Actinobacteria.
Protein Domain
Name: Protein of unknown function DUF6277
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001096) is described by MIBiG as an example of the following biosynthetic classes, NRP (non-ribosomal peptide) and polyketide, in particular the FR901464 biosynthetic gene cluster from Pseudomonas sp. 2663.
Protein Domain
Name: Protein of unknown function DUF6276
Type: Family
Description: This family of proteins found in archaea is functionally uncharacterised. Proteins in this family contain an N-terminal zinc binding domain.
Protein Domain
Name: Yme2, RNA recognition motif
Type: Domain
Description: This entry represents the RNA recognition motif (RRM) of Yme2 (Mitochondrial escape protein 2), which is an inner mitochondrial membrane protein that plays a critical role in mitochondrial DNA transactions. In Saccharomyces cerevisiae, it may serve as a mediator of nucleoid structure and number in mitochondria. Yme2 contains an exonuclease domain and an RNA recognition motif (RRM).
Protein Domain
Name: Protein of unknown function DUF6275
Type: Family
Description: This family of proteins is functionally uncharacterised. This family of proteins is found in Firmicute bacteria and their phage.
Protein Domain
Name: Protein of unknown function DUF6274
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0000119) is described by MIBiG as an example of the following biosynthetic classes, polyketide and saccharide, in particular the 7-deoxypactamycin biosynthetic gene cluster from Streptomyces pactum. This family appears to be predominantly found in Actinobacteria.
Protein Domain
Name: Domain of unknown function DUF6273
Type: Domain
Description: This domain is predominantly found in Firmicute bacterial proteins and is functionally uncharacterised. It is found in proteins that are likely to be surface exposed.
Protein Domain
Name: Ephrin type-A receptor 2, ligand binding domain
Type: Domain
Description: This entry represents the ligand-binding domain found in ephrin type-A receptor 2 (EphA2). EphA2 negatively regulates cell differentiation and has been shown to be overexpressed in tumor cells and tumor blood vessels in a variety of cancers including breast, prostate, lung, and colon. As a result, it is an attractive target for drug design since its inhibition could affect several aspects of tumor progression. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).
Protein Domain
Name: Ephrin type-A receptor 3, ligand binding domain
Type: Domain
Description: This entry represents the ligand-binding domain found in ephrin type-A receptor 3 (EphA3). EphA3 has been implicated in lung tumors. Class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling).
Protein Domain
Name: Ascorbate oxidase, third cupredoxin domain
Type: Domain
Description: Ascorbate oxidase catalyzes the oxidation of ascorbic acid to dehydroascorbic acid. This multicopper oxidase (MCO) is found in cucurbitaceous plants such as pumpkin, cucumber, and melon. It can detect levels of ascorbic acid and eliminate it. The biological function of ascorbate oxidase is still not clear; some studies suggest that it may play a crucial role in cell elongation and enlargement. In pumpkin, its expression is increased during callus growth, fruit development and seedling elongation. MCOs are capable of oxidizing a vast range of substrates, varying from aromatic compounds to inorganic compounds such as metals. Although the members of this family have diverse functions, the majority of them have three cupredoxin domain repeats. The copper ions are bound in several sites: Type 1, Type 2, and/or Type 3. The ensemble of types 2 and 3 copper is called a trinuclear cluster. MCOs oxidize their substrate by accepting electrons at a mononuclear copper centre and transferring them to the active site trinuclear copper centre. The cupredoxin domain 3 of 3-domain MCOs contains the Type 1 (T1) copper binding site and part of the trinuclear copper binding site, which is located at the interface of domains 1 and 3.
Protein Domain
Name: Protein of unknown function DUF6336
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001593) is described by MIBiG as an example of the following biosynthetic class, NRP (non-ribosomal peptide), in particular the ficellomycin biosynthetic gene cluster from Streptomyces ficellus.
Protein Domain
Name: Protein of unknown function DUF6335
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001615) is described by MIBiG as an example of the following biosynthetic class, NRP (non-ribosomal peptide), in particular the hexose-palythine-serine biosynthetic gene cluster from Heteroscytonema crispum UCFS10.
Protein Domain
Name: Protein of unknown function DUF6334
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001543) is described by MIBiG as an example of the following biosynthetic class, polyketide, in particular the chejuenolide A biosynthetic gene cluster from Hahella chejuensis.
Protein Domain
Name: Protein of unknown function DUF6332
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001598) is described by MIBiG as an example of the following biosynthetic classes, NRP (non-ribosomal peptide) and polyketide, in particular the foxicins A-D biosynthetic gene cluster from Streptomyces diastatochromogenes. This family appears to be predominantly found in Actinobacteria.
Protein Domain
Name: Protein of unknown function DUF6331
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001060) is described by MIBiG as an example of the following biosynthetic class, polyketide, in particular the dynemicin A biosynthetic gene cluster from Micromonospora chersina.
Protein Domain
Name: Protein of unknown function DUF6330
Type: Family
Description: This family of bacterial proteins is functionally uncharacterised. Proteins in this family are approximately 70 amino acids in length.
Protein Domain
Name: Domain of unknown function DUF6329
Type: Domain
Description: This domain is predominantly found in uncharacterised bacterial proteins and its function is unknown. This domain is approximately 60 amino acids in length and it contains three conserved residues, a cysteine, a leucine and a tryptophan.
Protein Domain
Name: Protein of unknown function DUF6328
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0000219) is described by MIBiG as an example of the following biosynthetic classes, polyketide and saccharide, in particular the elloramycin biosynthetic gene cluster from Streptomyces olivaceus. This family appears to be predominantly found in Actinobacteria.
Protein Domain
Name: Protein of unknown function DUF6327
Type: Family
Description: This family of proteins is functionally uncharacterised and found in bacteria. Proteins in this family have two conserved sequence motifs: KKY and YQK.
Protein Domain
Name: Protein of unknown function DUF6326
Type: Family
Description: This family of proteins is functionally uncharacterised. This family of proteins is found in bacteria and archaea.
Protein Domain
Name: Protein of unknown function DUF6325
Type: Family
Description: This family of proteins is functionally uncharacterised and is predominantly found in bacteria.
Protein Domain
Name: Protein of unknown function DUF6324
Type: Family
Description: This family of proteins is functionally uncharacterised and is found in bacteria. Proteins in this family have two conserved sequence motifs: QIGPT and GMVR.
Protein Domain
Name: Protein of unknown function DUF6323
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0001520) is described by MIBiG as an example of the following biosynthetic class, polyketide, in particular the aurantinin B biosynthetic gene cluster from Bacillus subtilis. This family appears to be predominantly present in Firmicutes.
Protein Domain
Name: Protein of unknown function DUF6322
Type: Family
Description: This family of proteins, predominantly found in Caudovirales, is functionally uncharacterised. Proteins in this family are approximately 150 amino acids in length. There are conserved sequences: ETINSMDYNTGY at the N-terminal, WTG in the centre and TDTTS at the C-terminal.
Protein Domain
Name: Domain of unknown function DUF6321
Type: Domain
Description: This presumed domain is functionally uncharacterised. This domain family is predominantly found in bacteria and viruses, and is approximately 80 amino acids in length. It has a conserved sequence GGLxxxGRxxY and a conserved tryptophan residue at the C-terminal.
Protein Domain
Name: Protein of unknown function DUF6320
Type: Family
Description: This family of proteins is functionally uncharacterised. This family of proteins is mainly found in Firmicute bacteria.
Protein Domain
Name: Protein of unknown function DUF6319
Type: Family
Description: This family of proteins found in bacteria is functionally uncharacterised.
Protein Domain
Name: Domain of unknown function DUF6318
Type: Domain
Description: This domain, found in Actinobacteria proteins, is functionally uncharacterised.
Protein Domain
Name: Protein of unknown function DUF6313
Type: Family
Description: This entry represents a member of a biosynthetic gene cluster (BGC). This BGC (BGC0000703) is described by MIBiG as an example of the following biosynthetic class, saccharide, in particular the kanamycin biosynthetic gene cluster from Streptomyces kanamyceticus.
Protein Domain
Name: Nif11 domain
Type: Domain
Description: This domain is found mainly in the Cyanobacteria and in Proteobacteria such as the nitrogen-fixing bacterium Azotobacter vinelandii. It is found in Nif11, a protein described in Azotobacter as linked to nitrogen fixation. It also constitutes a leader peptide in Nif11-derived peptides (N11P), which are thought to be post-translationally modified microcins derived from a putative nitrogen-fixing protein. N11P sequences have a classic leader peptide cleavage motif, usually Gly-Gly, which marks the end of the family-wide similarity region and the beginning of a low-complexity region rich in Cys, Gly and Ser.
Protein Domain
Name: Nif11-like leader peptide
Type: Domain
Description: This entry describes a conserved, fairly long (about 65 residue) leader peptide region for a family of putative ribosomal natural products (RNP) of small size. Members of the seed alignment (most sequences scoring better than 54 bits to the HMMER 2 model) tend to have the Gly-Gly motif as the last two residues of the matched region. This is a cleavage site for a combination processing/export ABC transporter with a peptidase domain. Members include the prochlorosins, lantipeptides from Prochlorococcus.
Protein Domain
Name: Prokaryotic N-terminal methylation site
Type: PTM
Description: This short motif directs methylation of the conserved phenylalanine residue. It is most often found at the N terminus of pilins and other proteins involved in secretion. There is a cleavage site G^FxxxE followed by a hydrophobic stretch. The new N-terminal residue produced after cleavage, usually Phe, is methylated. Separate domains of the prepilin peptidase appear to be responsible for cleavage and methylation. Proteins with this N-terminal region include type IV pilins and other components of pilus biogenesis, competence proteins, and type II secretion proteins. Typically several proteins in a single operon have this region.
Protein Domain
Name: GPI inositol-deacylase
Type: Family
Description: PGAP1 (Bst1 in yeast) functions as a GPI inositol-deacylase; this deacylation is important for the efficient transport of GPI-anchored proteins from the endoplasmic reticulum to the Golgi body. Mutations of the PGAP1 gene cause mental retardation, autosomal recessive 42 (MRT42).
Protein Domain
Name: Peptidase S11, D-Ala-D-Ala carboxypeptidase A, C-terminal
Type: Domain
Description: Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC). This entry contains proteins that are annotated as penicillin-binding protein 5 and 6. These belong to MEROPS peptidase family S11 (D-Ala-D-Ala carboxypeptidase A family, clan SE). Penicillin-binding protein 5 expressed by Escherichia coli functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain is the catalytic domain. The C-terminal domain, this entry, is organised into a sandwich of two anti-parallel β-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. 
Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides.
Protein Domain
Name: PaaX-like, N-terminal
Type: Domain
Description: This entry describes the N-terminal region of proteins that are similar to, and include, the product of the paaX gene of Escherichia coli. PaaX is a transcriptional regulator that is always found in association with operons believed to be involved in the degradation of phenylacetic acid. The gene product has been shown to bind to the promoter sites and repress their transcription.
Protein Domain
Name: PA-IL-like
Type: Family
Description: The members of this family are similar to the galactophilic lectin-1 expressed by Pseudomonas aeruginosa (PA-IL). Lectins recognising specific carbohydrates found on the surface of host cells are known to be involved in the initiation of infections by this organism. The protein is thought to be organised into an extensive network of β-sheets, as is the case with many other lectins.
Protein Domain
Name: PHA accumulation regulator DNA-binding, N-terminal
Type: Domain
Description: This domain is found at the N terminus of the polyhydroxyalkanoate (PHA) synthesis regulators. These regulators have been shown to directly bind DNA and PHA. The invariant nature of this domain compared to the C-terminal domain(s) suggests that it contains the DNA-binding function.
Protein Domain
Name: Halocyanin, copper-binding domain
Type: Domain
Description: Halocyanins are blue (type I) copper redox proteins found in halophilic archaea such as Natronomonas pharaonis (Natronobacterium pharaonis). Halocyanin from N. pharaonis has been characterised and shown to be a small blue copper protein with a molecular mass of about 15.5 kDa. This protein, which was named halocyanin, contains one Cu2+, with a copper-binding site containing two His, one Met, and one Cys as probable ligands. It is probable that halocyanin is a peripheral membrane protein, which serves as a mobile electron carrier. This entry represents the copper-binding domain of halocyanins. This domain is present only once in some halocyanins and is duplicated in others. It is not found in plastocyanins or certain divergent paralogs of halocyanin.
Protein Domain
Name: Major surface glycoprotein G
Type: Family
Description: Respiratory syncytial virus (RSV) has two major virion envelope proteins, the fusion F and major attachment G glycoproteins, which are the two viral neutralisation antigens. This entry represents the major surface glycoprotein G from RSV. G glycoprotein interacts with host CX3CR1, the receptor for the CX3C chemokine fractalkine, to modulate the immune response and facilitate infection. There are two versions of the G protein: the full-length G protein (mG), which is anchored by a transmembrane domain near the N terminus, and the secreted version (sG), which lacks the transmembrane domain due to an alternative initiation of translation. The secreted version (sG) helps the virus evade antibody-mediated restriction of replication by acting as an antigen decoy.
Protein Domain
Name: Myelin P0 protein-related
Type: Family
Description: This entry represents a group of transmembrane proteins, including myelin protein P0, myelin protein zero-like protein 1/2/3, sodium channel subunit beta-2 and v-set and immunoglobulin domain-containing protein 1.
Protein Domain
Name: V-set and immunoglobulin domain-containing protein 1
Type: Family
Description: V-set and immunoglobulin domain-containing protein 1 (VSIG1, also known as A34) belongs to the immunoglobulin superfamily (IgSF), whose members have one or more Ig-like domains in the extracellular region that is implicated in cell-cell adhesion, a transmembrane domain, and one cytoplasmic C-terminal region. VSIG1 is required for the proper differentiation of glandular gastric epithelia.
Protein Domain
Name: SNAP-25 domain
Type: Domain
Description: This entry represents a domain found in the SNAP-25 family members. SNAP-25 (synaptosome-associated protein 25kDa) proteins are components of SNARE complexes, which are proposed to account for the specificity of membrane fusion and to directly execute fusion by forming a tight complex (the SNARE or core complex) that brings the synaptic vesicle and plasma membranes together. The SNAREs constitute a large family of proteins that are characterised by 60-residue sequences known as SNARE motifs, which have a high propensity to form coiled coils and often precede carboxy-terminal transmembrane regions. The synaptic core complex is formed by four SNARE motifs (two from SNAP25 and one each from synaptobrevin and syntaxin 1) that are unstructured in isolation but form a parallel four-helix bundle on assembly. The crystal structure of the core complex revealed that the helix bundle is highly twisted and contains several salt bridges on the surface, as well as layers of interior hydrophobic residues. However, a polar layer in the centre of the complex is formed by three glutamines (two from SNAP25 and one from syntaxin 1) and one arginine (from synaptobrevin). Members of the SNAP-25 family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment.
Protein Domain
Name: Dopamine D3 receptor
Type: Family
Description: Dopamine receptors are members of the rhodopsin-like G-protein coupled receptor family and are prominent in the vertebrate central nervous system (CNS). Dysfunction of dopaminergic neurotransmission in the CNS has been implicated in a variety of neuropsychiatric disorders, including social phobia, Tourette's syndrome, Parkinson's disease, schizophrenia, neuroleptic malignant syndrome, attention-deficit hyperactivity disorder (ADHD) and drug and alcohol dependence. As a result, dopamine receptors are common drug targets; antipsychotics are often dopamine receptor antagonists while psychostimulants are typically indirect agonists of dopamine receptors. There are at least five different known subtypes of dopamine receptors designated D1, D2, D3, D4 and D5. They are distinguished by their G-protein coupling, ligand specificity, anatomical distribution and physiological effects. Dopamine receptors are divided into two subfamilies. The D1-like family consists of D1 and D5 receptors, which couple to Gs and mediate excitatory neurotransmission. The D2-like family, meanwhile, consists of D2, D3 and D4 receptors, which couple to Gi/Go and mediate inhibitory neurotransmission. Although dopamine receptors are widely distributed in the brain, they are found in different locations that have different receptor type densities, presumably reflecting different functional roles. D1 and D2 receptor subtypes are found at 10-100 times the levels of the D3, D4 and D5 subtypes. This entry represents the dopamine D3 receptors, which have a similar pharmacological profile to D2 receptors. They are expressed predominantly in the limbic area (including the olfactory tubercle, nucleus accumbens, islands of Calleja and hypothalamus), and they are present in lower levels in the caudate-putamen and cerebral cortex. The receptors are also found in dopamine cell bodies in the substantia nigra. 
The distribution of dopamine D3 receptors is consistent with a role in cognition and emotional functions and they may be a target of antipsychotic therapy involving dopamine antagonists. The receptors have been implicated in modulation of cocaine self-administration.
Protein Domain
Name: Dopamine D2 receptor
Type: Family
Description: Dopamine receptors are members of the rhodopsin-like G-protein coupled receptor family and are prominent in the vertebrate central nervous system (CNS). Dysfunction of dopaminergic neurotransmission in the CNS has been implicated in a variety of neuropsychiatric disorders, including social phobia, Tourette's syndrome, Parkinson's disease, schizophrenia, neuroleptic malignant syndrome, attention-deficit hyperactivity disorder (ADHD) and drug and alcohol dependence. As a result, dopamine receptors are common drug targets; antipsychotics are often dopamine receptor antagonists while psychostimulants are typically indirect agonists of dopamine receptors. There are at least five different known subtypes of dopamine receptors designated D1, D2, D3, D4 and D5. They are distinguished by their G-protein coupling, ligand specificity, anatomical distribution and physiological effects. Dopamine receptors are divided into two subfamilies. The D1-like family consists of D1 and D5 receptors, which couple to Gs and mediate excitatory neurotransmission. The D2-like family, meanwhile, consists of D2, D3 and D4 receptors, which couple to Gi/Go and mediate inhibitory neurotransmission. Although dopamine receptors are widely distributed in the brain, they are found in different locations that have different receptor type densities, presumably reflecting different functional roles. D1 and D2 receptor subtypes are found at 10-100 times the levels of the D3, D4 and D5 subtypes. This entry represents the dopamine D2 receptors. They have a similar pharmacological profile to D3 and D4 receptors. 
D2 receptors are present at high levels in the principal dopamine projection areas (including the caudate-putamen, nucleus accumbens and olfactory tubercle); they are found in cell bodies of dopaminergic neurons in the substantia nigra and ventral tegmental area, and in the periphery they are found in the pituitary, heart and blood vessels. In humans, the pulmonary artery expresses the D1, D2, D4 and D5 receptor subtypes, which may account for vasodilatory effects of dopamine in the blood. Dopamine D2 receptors have been shown to be important in the reward effects of morphine. D2 receptor knockout mice have been shown to exhibit abnormal synaptic plasticity and to display reduced levels of aggression.
Protein Domain
Name: Dopamine D1 receptor
Type: Family
Description: Dopamine receptors are members of the rhodopsin-like G-protein coupled receptor family and are prominent in the vertebrate central nervous system (CNS). Dysfunction of dopaminergic neurotransmission in the CNS has been implicated in a variety of neuropsychiatric disorders, including social phobia, Tourette's syndrome, Parkinson's disease, schizophrenia, neuroleptic malignant syndrome, attention-deficit hyperactivity disorder (ADHD) and drug and alcohol dependence. As a result, dopamine receptors are common drug targets; antipsychotics are often dopamine receptor antagonists while psychostimulants are typically indirect agonists of dopamine receptors. There are at least five different known subtypes of dopamine receptors designated D1, D2, D3, D4 and D5. They are distinguished by their G-protein coupling, ligand specificity, anatomical distribution and physiological effects. Dopamine receptors are divided into two further subfamilies. The D1-like family consists of D1 and D5 receptors, which couple to Gs and mediate excitatory neurotransmission. The D2-like family, meanwhile, consists of D2, D3 and D4 receptors, which couple to Gi/Go and mediate inhibitory neurotransmission. Although dopamine receptors are widely distributed in the brain, they are found in different locations that have different receptor type densities, presumably reflecting different functional roles. D1 and D2 receptor subtypes are found at 10-100 times the levels of the D3, D4 and D5 subtypes. This entry represents the dopamine D1 receptor, also known as D(1A) dopamine receptor. The receptors are found in greatest abundance in the caudate-putamen, nucleus accumbens and olfactory tubercle, with lower levels in the frontal cortex, habenula, amygdala, hypothalamus and thalamus. In the periphery, binding sites are found in the kidney, heart, liver and parathyroid gland.
In humans, the pulmonary artery expresses D1, D2, D4 and D5 receptor subtypes, which may account for vasodilatory effects of dopamine in the blood. In rats, dopamine D1 receptors are present on the smooth muscle of the blood vessels in most major organs and have been shown to have vasodilatory effects. They are also present on the juxtaglomerular apparatus and on renal tubules. Dopamine D1 receptor knockout mice have been shown to have reduced motivation for alcohol consumption. A single nucleotide polymorphism in the receptor has been associated with bipolar disorder.
Protein Domain
Name: Protein serine/threonine phosphatase 2C, C-terminal
Type: Domain
Description: Protein phosphatase 2C (PP2C, also known as Protein phosphatase 1) is involved in regulating cellular responses to stress in various eukaryotes. It consists of two domains: an N-terminal catalytic domain and a C-terminal domain characteristic of mammalian PP2Cs. This domain consists of three antiparallel α-helices, one of which packs against two corresponding α-helices of the N-terminal domain. The C-terminal domain does not seem to play a role in catalysis, but it may provide protein substrate specificity due to the cleft that is created between it and the catalytic domain.
Protein Domain
Name: Reovirus RNA-dependent RNA polymerase lambda 3
Type: Family
Description: The sequences in this family are similar to the reoviral minor core protein lambda 3, which functions as an RNA-dependent RNA polymerase within the protein capsid. It is organised into three domains. The N- and C-terminal domains create a "cage" which encloses a conserved central catalytic domain within a hollow centre. This catalytic domain is arranged to form finger, palm and thumb subdomains. Unlike other RNA polymerases, such as HIV reverse transcriptase and T7 RNA polymerase, the lambda 3 protein binds template and substrate with only localised rearrangements, and catalytic activity can occur with little structural change. However, the structure of the catalytic complex is similar to that of other polymerase catalytic complexes with known structure.
Protein Domain
Name: Purine catabolism PurC-like domain
Type: Domain
Description: This domain is found in the purine catabolism regulatory protein expressed by Bacillus subtilis (PucR). PucR is thought to be a transcriptional regulator of genes involved in the purine degradation pathway, and may contain a LysR-like DNA-binding domain. It is similar to LysR-type regulators in that it represses its own expression. The other members of this family are also putative regulatory proteins.
Protein Domain
Name: Plasmid pRiA4b, Orf3
Type: Family
Description: Members of this family are similar to the protein product of ORF-3 found on plasmid pRiA4 in the bacterium Agrobacterium rhizogenes. This plasmid is responsible for tumourigenesis at wound sites of plants infected by this bacterium, but the ORF-3 product does not seem to be involved in the pathogenetic process. Other proteins found in this family are annotated as putative TnpR resolvases, but no further evidence was found to support this. Moreover, another member of this family is described as a probable LexA repressor and in fact carries a LexA DNA-binding domain, but no references were found to expand on this.
Protein Domain
Name: RTP801-like
Type: Family
Description: RTP801, also known as REDD1, is the protein product of a hypoxia-inducible factor 1 (HIF-1)-responsive gene and is thought to be involved in various cellular processes. Both RTP801 and RTP801-like (REDD2) work downstream of AKT and upstream of TSC2 to inhibit mTOR, a serine/threonine kinase that plays an essential role in cell growth control. Two members of this family expressed by Drosophila melanogaster, Scylla and Charybde, are designated as Hox targets.
Protein Domain
Name: Protein of unknown function DUF3294
Type: Family
Description: This family was annotated as mitochondrial ribosomal protein Mrp8, based on the presumed similarity of the S. cerevisiae protein to an E. coli mitochondrial ribosomal protein. However, this similarity is spurious, and the function is not known.
Protein Domain
Name: Thrombin receptor
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices. Thrombin is a serine protease with a central role in blood clotting. It cleaves various substrates involved in coagulation, and activates cell surface receptors via a novel proteolytic action.
Thrombin stimulates aggregation and secretion in blood platelets at the site of vascular injury, and also has inflammatory and reparative actions, stimulating chemotaxis in monocytes, proliferation of fibroblasts and lymphocytes, and inducing endothelium-dependent relaxation of blood vessels. The protein activates a number of substrates involved in coagulation: it cleaves fibrinogen to fibrin and activates coagulation factor XIII; it also activates factors V and VIII. When bound to thrombomodulin, it activates plasma protein C, which, in concert with protein S, inactivates factors Va and VIIIa, leading to a decrease in thrombin formation. The thrombin receptor is expressed at high levels in platelets, vascular endothelial cells, and various cell lines. The receptor activates phosphoinositide metabolism via a pertussis-toxin-insensitive G-protein, and inhibits adenylyl cyclase via a pertussis-toxin-sensitive G-protein.
Protein Domain
Name: Alphavirus E2 glycoprotein
Type: Family
Description: Alphaviruses are enveloped RNA viruses that use arthropods such as mosquitoes for transmission to their vertebrate hosts, and include Semliki Forest and Sindbis viruses. Alphaviruses consist of three structural proteins: the core nucleocapsid protein C, and the envelope proteins P62 and E1 that associate as a heterodimer. The viral membrane-anchored surface glycoproteins are responsible for receptor recognition and entry into target cells through membrane fusion. The proteolytic maturation of P62 into E2 and E3 causes a change in the viral surface. Together the E1, E2, and sometimes E3 glycoprotein "spikes" form an E1/E2 dimer or an E1/E2/E3 trimer, where E2 extends from the centre to the vertices, E1 fills the space between the vertices, and E3, if present, is at the distal end of the spike. Upon exposure of the virus to the acidity of the endosome, E1 dissociates from E2 to form an E1 homotrimer, which is necessary for the fusion step to drive the cellular and viral membranes together. This entry represents the alphaviral E2 glycoprotein. The E2 glycoprotein functions to interact with the nucleocapsid through its cytoplasmic domain, while its ectodomain is responsible for binding a cellular receptor.
Protein Domain
Name: Peptidase S3, togavirin
Type: Domain
Description: Togavirin, also known as Sindbis virus core endopeptidase, is a serine protease resident at the N terminus of the p130 polyprotein of togaviruses. The endopeptidase signature identifies the peptidase as belonging to the MEROPS peptidase family S3 (togavirin family, clan PA(S)). The polyprotein also includes structural proteins for the nucleocapsid core and for the glycoprotein spikes. Togavirin is only active while part of the polyprotein; cleavage at a Trp-Ser bond results in a total lack of activity. Mutagenesis studies have identified the location of the His-Asp-Ser catalytic triad, and X-ray studies have revealed the protein fold to be similar to that of chymotrypsin.
The Legume Information System (LIS) is a research project of the USDA-ARS:Corn Insects and Crop Genetics Research in Ames, IA.
InterMine © 2002 - 2022 Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, United Kingdom