The 21st and 22nd amino acids are selenocysteine (Sec) and pyrrolysine (Pyl). They sit alongside the familiar 20 canonical amino acids but are added to growing proteins through stop codons (UGA for Sec, UAG for Pyl) rather than ordinary sense codons. Selenocysteine appears in 25 human selenoproteins and in roughly a quarter of sequenced bacteria; pyrrolysine is restricted to a handful of methanogenic archaea and a small number of anaerobic bacteria.
Most people are well aware that there are 20 amino acids that build our proteins, but what if I told you that was actually false, and in fact, there are a few additional amino acids that we seldom talk about?

A total of 22 amino acids are known to be encoded directly in the DNA of living organisms. Selenocysteine (Sec) and pyrrolysine (Pyl) are the 21st and 22nd amino acids, respectively. They are referred to as rare amino acids because they are far less prevalent in nature than the other 20.
Why Was This Finding So Shocking?
Four nucleotide bases, symbolized by the letters A, G, T and C, make up the entirety of our DNA. When the genetic code was deciphered in the 1960s, scientists found that it was read three letters at a time. Each three-letter group is known as a codon, and codons are read from messenger RNA (mRNA), in which uracil (U) takes the place of thymine (T).
Each codon was considered to have a single function: either specifying one of the 20 amino acids, or marking where translation begins (the start codon) or ends (a stop codon).
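The triplet reading described above can be sketched in a few lines of Python. This is a simplified illustration rather than a model of the real ribosome: it assumes translation starts at the first AUG in the sequence, uses the standard codon table written in the conventional U, C, A, G order, and the function name `translate` is made up for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read an mRNA string three letters at a time, from the first AUG
    up to (but not including) the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # a stop codon ends translation
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCUUGUUGAAAA"))  # MAC
```

Here the reading frame is AUG, GCU, UGU, UGA: methionine (M), alanine (A), cysteine (C), and then the stop codon UGA, which adds nothing to the chain. Of the 64 possible codons, 61 specify an amino acid and 3 signal a stop.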

Though these 20 amino acids are the backbone of every protein, some proteins contain different, non-traditional amino acids. Most of these, scientists found, were derived from the original 20, with their structure altered after the polypeptide chain had formed at the end of translation. These changes are known as post-translational modifications, and they are needed for the protein to carry out its function.
Even though these amino acids are uncommon, they still have important functions. For example, 5-hydroxylysine and 4-hydroxyproline are derivatives of lysine and proline, respectively, and are found in collagen (a protein found in connective tissue).

Therefore, when selenocysteine (Sec) and pyrrolysine (Pyl) were first discovered in proteins, they were thought to result from such post-translational modifications to cysteine and lysine, respectively. However, in 1986, August Böck and colleagues at the University of Munich showed that selenocysteine is co-translationally inserted at an in-frame UGA codon in the bacterium E. coli (the finding is now usually cited via their 1991 review). Sixteen years later, in 2002, Joseph Krzycki's group at Ohio State University showed that pyrrolysine is decoded at the stop codon UAG in a methanogenic enzyme.
By the mid-1960s, it was well known that the start codon (AUG) codes for the amino acid methionine, whereas the stop codons (UAG, UAA, and UGA) were not believed to code for any amino acid, but simply to terminate translation. In fact, they have even been referred to as nonsense codons, since they do not specify amino acids.
These discoveries were therefore groundbreaking, as they attributed a new role to stop codons. Additionally, the two discoveries were made separately, in two very different organisms (the bacterium E. coli and the archaeon Methanosarcina barkeri), indicating that such unusual amino acids are not confined to a single branch of life.
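The idea that a stop codon can sometimes be read as an amino acid can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the `recoded_codons` argument is a hypothetical stand-in for whatever cellular signals mark a particular stop codon for recoding (the real mechanisms are not modeled here), the input is assumed to begin at the start codon, and the one-letter symbols U and O are the standard abbreviations for selenocysteine and pyrrolysine.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Stop codons that can be reinterpreted: UGA as Sec (U), UAG as Pyl (O).
RECODED = {"UGA": "U", "UAG": "O"}


def translate_recoded(orf, recoded_codons=frozenset()):
    """Translate an open reading frame that starts at its first letter.

    recoded_codons holds the 0-based codon positions at which a stop codon
    should be read as Sec or Pyl instead of ending translation
    (a hypothetical simplification for this example).
    """
    protein = []
    for index in range(len(orf) // 3):
        codon = orf[3 * index:3 * index + 3]
        amino_acid = STANDARD_CODE[codon]
        if amino_acid == "*":
            if index in recoded_codons and codon in RECODED:
                amino_acid = RECODED[codon]  # stop codon read as Sec or Pyl
            else:
                break  # an ordinary stop codon ends translation
        protein.append(amino_acid)
    return "".join(protein)


print(translate_recoded("AUGGCUUGAGGUUAA"))                      # MA
print(translate_recoded("AUGGCUUGAGGUUAA", recoded_codons={2}))  # MAUG
print(translate_recoded("AUGAAAUAGGGUUAA", recoded_codons={2}))  # MKOG
```

In the first call, UGA acts as a normal stop and the chain ends after two residues. In the second, the same UGA is read as selenocysteine and translation continues until the final UAA. UAA is never recoded in this sketch, which matches the article: only UGA and UAG carry the extra meaning.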

Why Are They Rare?
Even though selenocysteine (Sec) and pyrrolysine (Pyl) are encoded in the DNA, they differ from the standard amino acids in requiring a special mechanism to be incorporated into a protein. In fact, two mechanisms are required: although both rare amino acids are encoded by stop codons, each is inserted through a completely different mechanism.
The presence of Pyrrolysine is restricted to a small fraction of organisms. When the original 2009 survey was published, only 11 of the ~1,000 fully sequenced genomes carried the Pyl machinery. With hundreds of thousands of genomes now sequenced, the list has grown substantially, but pyrrolysine remains far rarer than selenocysteine, concentrated in methanogenic archaea (notably Methanosarcinaceae and all Methanomassiliicoccales), some Asgard archaea, and a few anaerobic bacterial phyla.
On the other hand, selenocysteine is present in organisms across all three domains of life (Archaea, Bacteria, Eukarya). Nearly a quarter of sequenced bacteria are believed to synthesize it. In humans, the selenoproteome currently stands at 25 selenoproteins, including five glutathione peroxidases (GPx1–4, 6), three iodothyronine deiodinases (DIO1–3), three thioredoxin reductases (TXNRD1–3) and selenoprotein P (SELENOP).
Are They Important?
Scientists already knew that even though many non-standard amino acids may not be incorporated into proteins, they are still crucial intermediates in numerous metabolic processes. Since selenocysteine and pyrrolysine are actually coded by the DNA, they were even more likely to be essential. However, due to their rarity, the importance of these two amino acids has been overlooked until recently.
Selenocysteine is structurally similar to cysteine and contains an essential micronutrient, selenium, in place of the sulfur atom found in cysteine. It is a crucial amino acid found in selenoproteins and is associated with a number of metabolic and cellular processes. A deficit of selenium in the brain has been found to cause neurological abnormalities like seizures. Selenium deficiency has also been linked to a number of other diseases, in addition to neurodegenerative disorders. However, researchers have yet to identify its exact role in the disease mechanisms.

Pyrrolysine was identified in Methanosarcina barkeri, a strictly anaerobic methanogen that lives in sediments, sewage digesters, and the rumen of cattle. Pyl sits in the active site of the methylamine methyltransferases (MtmB, MtbB, MttB) that catalyze the first step of methanogenesis from methylamines, so for years it was assumed to be a methanogen-only oddity. Since then, pyrrolysine-encoding machinery has been spotted across the entire 7th order of methanogens (Methanomassiliicoccales), in some Asgard archaea, and in a handful of anaerobic bacteria. A 2024 Science paper even reported an archaeal lineage with an entirely new genetic code (Code 34) in which every UAG codon is read as pyrrolysine, the most dramatic expansion of the Pyl story since its discovery.

Conclusion
Why and when during the evolution of life these two amino acids were added to the genomes of a few organisms remains a mystery. While they are currently categorized as rare, further research might prove us wrong. After all, in the words of Albert Einstein, ‘We still do not know one-thousandth of one percent of what nature has revealed to us’.
References
- Atkins, J. F., & Gesteland, R. F. (2000, September). The twenty-first amino acid. Nature. Springer Science and Business Media LLC.
- Borrel, G., Gaci, N., Peyret, P., O'Toole, P. W., Gribaldo, S., & Brugère, J.-F. (2014). Unique Characteristics of the Pyrrolysine System in the 7th Order of Methanogens: Implications for the Evolution of a Genetic Code Expansion Cassette. Archaea. Hindawi Limited.
- Wirth, E. K., Conrad, M., Winterer, J., Wozny, C., Carlson, B. A., Roth, S., … Schweizer, U. (2009, November 4). Neuronal selenoprotein expression is required for interneuron development and prevents seizures and neurodegeneration. The FASEB Journal. Wiley.
- Korotkov, K. V., Novoselov, S. V., Hatfield, D. L., & Gladyshev, V. N. (2002, March). Mammalian Selenoprotein in Which Selenocysteine (Sec) Incorporation Is Supported by a New Form of Sec Insertion Sequence Element. Molecular and Cellular Biology. Informa UK Limited.
- New Amino Acid Debuts. Science. AAAS
- Rother, M., & Krzycki, J. A. (2010). Selenocysteine, Pyrrolysine, and the Unique Energy Metabolism of Methanogenic Archaea. Archaea. Hindawi Limited.
- Yuan, J., O'Donoghue, P., Ambrogelly, A., Gundllapalli, S., Sherrer, R. L., Palioura, S., … Söll, D. (2009, November 10). Distinct genetic code expansion strategies for selenocysteine and pyrrolysine are reflected in different aminoacyl-tRNA formation systems. FEBS Letters. Wiley.
- Böck, A., Forchhammer, K., Heider, J., Leinfelder, W., Sawers, G., Veprek, B., & Zinoni, F. (1991, March). Selenocysteine: the 21st amino acid. Molecular Microbiology. Wiley.
- Krzycki, J. A. (2004, October). Function of genetically encoded pyrrolysine in corrinoid-dependent methylamine methyltransferases. Current Opinion in Chemical Biology. Elsevier BV.












