Scientists say they have completed the first full and seamless catalogue of human genetic instructions.
Until now, about 8% of the human genome code was missing from the blueprint, experts reported in the journal Science.
The quest to decipher and complete a gapless DNA sequence has taken decades.
The resource helps explain how to create every human cell in the body, which could shed light on how to treat, prevent and cure disease.
The first draft of the human genome, deciphered 20 years ago, included most of the regions that code for proteins but left out about 200 million bases – the rungs that make up the ladder of the famous double helix DNA structure.
Now just 10 million of the roughly 3 billion bases, or letters, of human DNA remain unknown.
Tour de force
Most of the newly added DNA code sits in the dense middle sections of each chromosome – the neat bundles of tightly coiled DNA found in the nucleus of almost every cell in our body – as well as the protective caps or ends, called telomeres.
The breakthrough was being able to interpret the very repetitive stretches of DNA sequences that occur in these regions.
In the early 90s, when the first attempt at cracking the human genome began, the science of reading DNA was still somewhat limited. The repeats could too easily be skipped or their bases linked together incorrectly.
Since then, technology has moved on, making it possible to accurately read all of the DNA – including the long parts containing reams of repeating code.
Dr Adam Phillippy from the National Human Genome Research Institute in the US, one of the many scientists behind the work, said: “The biggest change within the last decade has been the emergence of so-called long-read sequencing technology.”
It means scientists can study longer portions of the code in one go.
“Our puzzle pieces just got way bigger, and that allowed us to put these regions of the genome back together again,” Dr Phillippy explained.
That might be a gamechanger for understanding human traits and evolution.
Backbone for research
Dr Phillippy said: “Because these regions are so dynamic and changing so rapidly, there’s some hope that any of the clues to what makes humans uniquely human in terms of cognition, and an increased brain size, etcetera, might have some connection to these regions that we’ve uncovered.”
Prof Colin Johnson, an expert in Medical and Molecular Genetics at the University of Leeds, called the work ‘truly ground-breaking’.
He told me: “They’ve uncovered the missing 8% – the dark matter of the genome that we knew was there but couldn’t get to.
“It gives us the full description of the book of life, with all of the paragraphs and pages assembled.”
The deciphered sample, from an anonymous donor, contained an X chromosome but no ‘male’ Y.
To get round that, the researchers also studied some DNA from a 51-year-old Harvard University biologist, Leonid Peshkin, who says he was excited to be part of the project.
Now the sequencing is done, the researchers want to decode full sets of DNA from other individuals, to capture all of the variation that exists in people.
Dr Phillippy said: “Because we spent all of the hard work at the outset, getting this one complete and correct, we can now start to layer on these additional genomes on top of it, and do a so-called pan-genome representation that will have this as a basis but then have all of the variation kind of branching off of it.”
The Human Pangenome team has made a start by deciphering 70 more genomes, with a goal of 350 from people of diverse ancestries.