When the Human Genome Project officially finished in 2003, it had pieced together only about 92% of the genome. The remaining 8% sat in highly repetitive regions (centromeres, telomeres, and the short arms of acrocentric chromosomes) that the short-read sequencing of the time could not unambiguously resolve. The Telomere-to-Telomere (T2T) consortium finally closed the gaps using long-read sequencing and announced the first gapless human genome on April 1, 2022.
It’s amazing how just four molecular letters can combine to build living organisms as small as a microbe and as enormous as a blue whale. These four molecules, A (adenine), T (thymine), G (guanine), and C (cytosine), make up our DNA, along with sugars and phosphates.
The complete set of these A, T, C, and G sequences in an organism is called its genome. The genome contains all of the information necessary to construct that organism and allow it to grow and develop over time. The size and complexity of a genome vary from species to species, but in each case the instructions are written in DNA.
Think of the genome as a multi-story building constructed from repeating building blocks, in which the different stories store the information necessary for the proper functioning, signaling, and survival of the organism.
To understand the problems (genetic disorders) that may arise in any part of this building, we need to analyze it in detail, starting from the foundation.

Sequencing is the process of determining the exact order of the nucleotides in a genome, and it is what gives researchers this detailed view.
The Human Genome Project
The Human Genome Project began in October 1990 with the goal of sequencing the entire human genome. By 2003, however, the project had managed to sequence only about 92% of it. The remaining 8% was finally decoded after almost two more decades of work and announced by the Telomere-to-Telomere (T2T) consortium in April 2022. But why did it take so long to sequence the last 8%?
The answer lies in the framework of the human genome.
Each letter is a nucleobase that, attached to a sugar and a phosphate, makes one nucleotide. On the double helix, nucleotides on the two opposite strands pair up (A always with T, and C always with G) to form what are called base pairs. A long string of these base pairs makes up our entire genome. A string of nucleotides that codes for particular information (usually a protein) is a gene. The genes, together with all the stretches of DNA that are not genes, collectively form the genome.
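To make the pairing rule concrete, here is a minimal Python sketch that works out the partner strand for a short stretch of DNA. The example sequence and the function name `complement` are made up for illustration; only the A-T and C-G rule comes from the text above.

```python
# Pairing rule from the text: A always pairs with T, and C always pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return, position by position, the base that pairs with each letter."""
    return "".join(PAIRS[base] for base in strand.upper())


example = "ATGCCGTA"  # made-up sequence, not taken from any real genome
partner = complement(example)

# Print the two strands one above the other, so each column is one base pair.
print(example)
print(partner)  # TACGGCAT
```

Because the rule is symmetric, taking the complement of the partner strand gives back the original, which is why knowing one strand is enough to know both.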

Scientists now know that the human genome contains about 3.055 billion base pairs (T2T-CHM13, 2022), of which only 1-2% directly encodes proteins. Much of the remaining 98-99% was long considered useless "junk." A large share of it consists of long repeats of nucleotides, which posed a significant problem for sequencing.
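To get a feel for those percentages, the short Python sketch below turns them into base-pair counts. It uses only the 3.055 billion and 1-2% figures quoted above; the helper name `share_in_bp` is made up for illustration.

```python
GENOME_BP = 3_055_000_000  # genome size quoted in the text (T2T-CHM13)


def share_in_bp(percent: int, total_bp: int = GENOME_BP) -> int:
    """Convert a whole-number percentage of the genome into base pairs."""
    return total_bp * percent // 100


low, high = share_in_bp(1), share_in_bp(2)
print(f"Protein-coding: {low:,} to {high:,} base pairs")
print(f"Everything else: {GENOME_BP - high:,} to {GENOME_BP - low:,} base pairs")
```

In other words, the protein-coding part amounts to roughly 30 to 61 million base pairs, while about 3 billion base pairs fall outside it.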
It is like laying identical-looking bricks in a building: the bricks look the same, yet each one must go in the right place and order. Because these repeating sequences are so similar in structure and content, the short-read technology of the time could not tell them apart.
In other words, say we have a cast that can hold only 100 bricks at a time; how can we be certain where in the building a particular set of 100 identical-looking bricks belongs? We needed a much larger cast, one that could hold an entire stretch of repeated bricks at once.
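The brick problem can be sketched in a few lines of Python. The toy "genome" below is entirely made up (two copies of a repeat surrounded by unique letters), and real sequencing software is far more sophisticated than this exact-match search; the sketch only illustrates why a short read taken from inside a repeat fits in several places, while a read long enough to reach the unique sequence on both sides fits in just one.

```python
def find_all(genome: str, read: str) -> list:
    """Return every start position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)  # step by 1 to catch overlapping matches
    return positions


# Made-up toy genome: unique flanking letters around two copies of a repeat.
REPEAT = "GATTACA" * 3
genome = "CCGTA" + REPEAT + "TTGCAGG" + REPEAT + "AACGT"

short_read = "GATTACAGATT"  # lies entirely inside the repeat (the small cast)
long_read = genome[2:30]    # unique flank + whole repeat + unique flank (the big cast)

print("short read fits at:", find_all(genome, short_read))  # several positions
print("long read fits at: ", find_all(genome, long_read))   # a single position
```

The short read matches at four different places, so there is no way to tell which one it came from. The long read carries a little unique sequence on each side of the repeat, which pins it to a single location.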
How Did They Solve The Problem?
This difficulty was overcome by a group of scientists working together as the Telomere-to-Telomere (T2T) Consortium. The consortium managed to piece together the long repeats thanks to advances in sequencing technology and computational methods. Long-read sequencing produces reads long enough to span a repetitive stretch and anchor it in the unique sequence on either side. The new computational methods also required less memory than previous techniques, which allowed researchers to process the lengthy repeats, and the new technology helped bring down the cost of processing the data.
Conclusion
A complete genome is the entire genetic sequence of an individual. The complete human genome will therefore serve as a reference for comparing different people’s genomes and identifying the genetic differences that make us unique. It will also aid in comparing genomes within a family and tracing the source of genetic differences, in order to identify the genes (active or inactive) that cause various inheritable diseases. This represents a significant step forward in understanding and treating numerous genetic illnesses and mutations in people.












