Researchers have constructed a rough draft of a human pangenome, a collection of DNA that represents the genetic information from 47 individuals. The development — a landmark in genomics — magnifies and expands scientists’ view of the code that makes us both uniquely human and different from one another.
The new pangenome, which was described in articles published Wednesday in the scientific journal Nature, could help scientists uncover small genetic differences that contribute to the development of conditions like heart disease or schizophrenia, among others, researchers said Tuesday at a news briefing.
“We are finding remarkable patterns of genetic variation,” said Evan Eichler, a genome sciences professor at the University of Washington School of Medicine who was involved in the project. “There were hints of this before, but we didn’t actually have the right microscope to see this.”
The pangenome dramatically expands upon the original human genome reference, which scientists have been using for about two decades. That original reference came largely from a single person, which limits its effectiveness.
Because the DNA of 47 people is represented, the new pangenome better reflects the genetic information of people across the globe and from different backgrounds. Scientists say it represents much of the genetic diversity among humans, including the most common variations.
“We now understand that having one map of a single human genome cannot adequately represent all of humanity,” said Karen Miga, an author of the research who is the associate director for human pangenomics at the University of California, Santa Cruz, Genomics Institute. Miga added that the new effort was a more inclusive and “equitable” approach.
The new pangenome could help scientists identify influential genetic differences that have flown under the radar because of limitations of the old reference sequence. Researchers behind the project say it has the potential to contribute to personalized, precision medicine — the development of medical care that’s tailored to an individual’s genetic makeup.
Such a powerful tool also raises important ethical questions about genetic privacy and about whose data is included and how. The research group said it’s sensitive to those concerns.
More than 100 scientists contributed as authors to the research papers describing the work. The scientists, who are calling themselves the Human Pangenome Reference Consortium, plan to ultimately include genetic samples from about 350 people in the reference. The data will be publicly available and shared openly.
To build the pangenome, scientists used data from the 1000 Genomes Project, which included participants from across ethnic groups. The data is anonymized, meaning that some steps have been taken to protect the identities of those whose DNA is on display. The scientists said the 47 people included in this project represent a diverse geographic distribution across different human populations.
Researchers and clinicians typically use the human genome reference as a baseline against which to compare a subject’s DNA and scrutinize it for key differences that could indicate things like a genetic disease trigger. The original reference genome has limitations: It introduces bias, and its performance isn’t equal in all groups of people, the researchers said.
The scientists hope the pangenome reference will improve that performance and be more inclusive.
“By sampling broadly across the genetic tree of humanity, it benefits everybody,” said Ira Hall, a professor who is part of the research group and is the director of the Yale Center for Genomic Health at Yale University.
The new pangenome takes up about 3 gigabytes of space on a computer. The researchers have developed algorithms that reference, map and search within the pangenome’s structure.
In the immediate future, the consortium expects that the expanded scope of the reference will help scientists better study particular genes.
This research follows another recent landmark moment in genomics. Last year, scientists announced that they had finally sequenced a human genome in its entirety, including the extremely repetitive and complex pieces of DNA that had left them in the dark for decades.
Both of these breakthroughs were made possible by advances in new technology that can perform much longer reads of DNA, which eliminates much of the puzzle work involved in piecing together short fragments of genetic code.
The pangenome research does raise concerns about privacy, representation and who controls genomic information.
Scientists scrutinizing previous genome projects found that it was possible to identify subjects by name from data that was published openly after the subjects consented to share it with researchers.
Genetic research has a history that raises concern over mistreatment, exploitation and a lack of engagement with indigenous and tribal communities.
Members of the research group are committed to respecting tribes and groups that have formal policies against contributing genomic data to the project and “not to work around issues of data sovereignty,” said Eimear Kenny, director of the institute for genomic health at the Icahn School of Medicine at Mount Sinai, who is an author on the research.