An intein is a segment of a protein that is able to excise itself and rejoin the remaining portions (the exteins) with a peptide bond.
Most reported inteins also contain an endonuclease domain that plays a role in intein propagation. In fact, many genes have unrelated intein-coding segments inserted at different positions. For these and other reasons, inteins (or more properly, the gene segments coding for inteins) are sometimes called selfish genetic elements, although it may be more accurate to call them parasitic.
Inteins have also been called 'protein introns'.
Intein-mediated protein splicing occurs after mRNA has been translated into a protein. This precursor protein contains three segments: an N-extein, followed by the intein, followed by a C-extein.
After splicing has taken place, the resulting protein, made up of the joined N-extein and C-extein, is also called an extein.
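The arrangement of the precursor and the outcome of splicing can be sketched as a simple string operation. The Python sketch below is illustrative only: the function name `splice` and the short placeholder sequences are hypothetical, and it models only the rearrangement of the three segments, not the chemistry of the reaction.

```python
from typing import Tuple


def splice(n_extein: str, intein: str, c_extein: str) -> Tuple[str, str]:
    """Model protein splicing as a rearrangement of sequence segments.

    Returns the ligated exteins (the spliced protein) and the excised intein.
    """
    # The precursor is translated as one chain: N-extein, intein, C-extein.
    precursor = n_extein + intein + c_extein

    # The intein occupies the middle of the precursor.
    start = len(n_extein)
    end = start + len(intein)
    excised = precursor[start:end]

    # Joining the flanking segments stands in for the new peptide bond.
    spliced = precursor[:start] + precursor[end:]
    return spliced, excised


# Placeholder one-letter sequences, not taken from any real protein.
spliced, excised = splice("MKT", "CLAGDHN", "SGE")
print(spliced)  # MKTSGE
print(excised)  # CLAGDHN
```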
The first intein was discovered in 1987. Since then, inteins have been found in all three domains of life (eukaryotes, bacteria, and archaea).
Inteins have been engineered for particular applications such as protein synthesis and the selective labeling of protein segments, which is useful for NMR studies of large proteins.
Intein naming conventions
The first part of an intein name is based on the scientific name of the organism in which it is found, and the second part is based on the name of the corresponding gene or extein.
For example, the intein found in Thermoplasma acidophilum and associated with 'Vacuolar ATPase subunit A' (VMA) is called 'Tac VMA'.
Normally, as in this example, just three letters suffice to specify the organism, but there are variations. For example, additional letters may be added to indicate a strain.
If more than one intein is encoded in the corresponding gene, the inteins are given a numerical suffix, numbered either from 5' to 3' or in order of their identification, for example 'Msm dnaB-1'.
The segment of the gene that encodes the intein is usually given the same name as the intein, but to avoid confusion, the name of the intein proper is usually capitalized (e.g. Pfu RIR1-1), whereas the name of the corresponding gene segment is italicized.
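The naming rules above can be expressed as a small helper. This is a minimal sketch, not an official tool: the function name `intein_name` is hypothetical, and the way the three-letter prefix is built (genus initial plus the first two letters of the species name) is an assumption inferred from the 'Tac' example, so it does not cover the variations mentioned above, such as strain letters. The expansion of 'Msm' as Mycobacterium smegmatis is likewise supplied here and is not stated in the text.

```python
from typing import Optional


def intein_name(organism: str, gene: str, index: Optional[int] = None) -> str:
    """Build an intein name such as 'Tac VMA' or 'Msm dnaB-1'."""
    genus, species = organism.split()[:2]

    # Assumed abbreviation rule: genus initial + first two letters of the species.
    prefix = genus[0].upper() + species[:2].lower()

    name = f"{prefix} {gene}"

    # A numerical suffix distinguishes several inteins in the same gene.
    if index is not None:
        name += f"-{index}"
    return name


print(intein_name("Thermoplasma acidophilum", "VMA"))          # Tac VMA
print(intein_name("Mycobacterium smegmatis", "dnaB", index=1))  # Msm dnaB-1
```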
Split inteins
Sometimes, the intein of the precursor protein is encoded by two separate genes. In this case, the intein is said to be a split intein.
For example, in Cyanobacteria, DnaE, the catalytic subunit alpha of DNA polymerase III, is encoded by two separate genes, dnaE-n and dnaE-c. The dnaE-n product consists of an N-extein sequence followed by a 123-aa (amino acid) intein sequence, whereas the dnaE-c product consists of a 36-aa intein sequence followed by a C-extein sequence.
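The DnaE example can be sketched in the same spirit. The code below is a hedged illustration: the class `SplitPrecursor` and the function `trans_splice` are hypothetical names, and the sequences are placeholders chosen only to match the 123-aa and 36-aa intein lengths given above. The two intein fragments are kept as separate fields because they come from separate gene products.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SplitPrecursor:
    """One gene product of a split intein: an extein plus an intein fragment."""
    extein: str
    intein_fragment: str


def trans_splice(n_product: SplitPrecursor, c_product: SplitPrecursor) -> Tuple[str, int]:
    """Join the two exteins and report the combined intein length."""
    # The N-extein and C-extein end up in a single chain.
    spliced = n_product.extein + c_product.extein
    # The intein is the sum of the fragments carried by the two products.
    intein_length = len(n_product.intein_fragment) + len(c_product.intein_fragment)
    return spliced, intein_length


# Placeholder sequences; only the fragment lengths (123 and 36) follow the text.
dnae_n = SplitPrecursor(extein="NNNN", intein_fragment="x" * 123)
dnae_c = SplitPrecursor(extein="CCCC", intein_fragment="y" * 36)

spliced, intein_length = trans_splice(dnae_n, dnae_c)
print(spliced)        # NNNNCCCC
print(intein_length)  # 159
```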