This ambitious project aims to sequence the DNA of all complex life on Earth

“We are just beginning to understand the full majesty of life on Earth,” the founding members of the Earth BioGenome Project wrote in 2018. The ambitious project raised eyebrows when it was first announced: it aims to genetically profile more than a million plants, animals and fungi. Documenting these genomes is the first step toward building an atlas of complex life on Earth.

Many living species remain mysterious to science. The database resulting from the project would be a valuable resource for biodiversity monitoring. It could also shed light on the genetic “dark matter” of complex life and inspire new biomaterials, drugs or ideas for synthetic biology. Additional insights could help adapt agricultural practices to increase food production and feed a growing global population.

In other words, digging into the genetic data of living creatures is supposed to reveal “unimaginable biological secrets,” the team wrote.

The problem? A hefty price tag. With an estimated cost of $4.7 billion, even the founders of the project called it a moonshot. However, against all expectations, the project has progressed, with 3,000 genomes already sequenced and another 10,000 species expected by 2026.

While the project is falling short of its original goal of sequencing roughly 1.7 million genomes in a decade, it still hopes to reach that goal by 2032. That is later than the original timeline, but at a much lower cost thanks to more efficient DNA sequencing technologies.

Meanwhile, the international team has also built an infrastructure to share gene sequencing data, and machine learning methods are helping the consortium analyze thousands of data sets, characterize new species and monitor DNA data for species at risk.

Expanding the range

Genetic material is everywhere. It is a rich resource for understanding life on Earth. As genetic sequencing becomes faster, cheaper and more reliable, recent studies have begun digging into the information represented by DNA from species around the world.

One method, called metagenomics, captures and analyzes all the microbial DNA in samples collected from a variety of environments, from urban sewers to boiling hot springs, to create a broad genetic picture of the bacteria in each environment. Rather than bacteria, the Earth BioGenome Project, or EBP, focuses on sequencing the genomes of individual eukaryotic creatures: essentially those that store most of their DNA in a nut-like structure, the nucleus, inside each cell.

This group includes humans, plants, fungi and other animals. According to one estimate, there are roughly 10 to 15 million eukaryotic species on our planet. However, only a little over two million have been documented.

Sequencing DNA from eukaryotic cells could greatly expand our knowledge of Earth’s genetic diversity. Such a database could also be a treasure trove for synthetic biology. Scientists have already studied the genetic blueprints of bacteria and yeast cells. Deciphering and then reprogramming their genes has led to advances such as coaxing bacterial cells to pump out biofuels, degradable materials and drugs like insulin.

Mapping the genomes of eukaryotes could further inspire new materials or drugs. For example, cytarabine, a chemotherapy drug, was originally isolated from a sea sponge and approved by the FDA to treat blood cancer that has spread to the brain. Plant-derived medicines are already used to treat viral infections or to control pain. From almost 400,000 different plant species, hundreds of drugs have already been approved and are on the market. Similarly, deciphering plant genetics has sparked ideas for new biodegradable materials and biofuels.

Genetic sequences from complex organisms can “provide raw materials for genome engineering and synthetic biology to produce valuable bioproducts on an industrial scale,” the team wrote.

In addition to medical and industrial uses, this effort also documents biodiversity. Creating a digital DNA library of all known eukaryotic life can determine which species are most at risk—including species that are not yet fully characterized—and provide data for earlier intervention.

“For the first time in history, it is possible to efficiently sequence the genomes of all known species and use genomics to help discover the remaining 80 to 90 percent of species that are currently hidden from science,” the team wrote.

Soldiering on

The project has three phases.

The first phase lays the groundwork. It determines the species to be sequenced, builds a digital infrastructure for data sharing and develops a toolkit for analysis. The most important goal is to create a reference DNA sequence for each group of species with a similar genetic make-up, that is, those in a “family.”

Reference genomes are incredibly important for genetic studies. True to their name, they serve as a baseline against which scientists compare genetic variants, for example when tracking genes associated with inherited diseases in humans or with the sugar content of different crop variants.

The second phase of the project will begin to analyze the sequencing data and create strategies for biodiversity conservation. The final phase integrates all previous work to potentially revise how different species fit into our evolutionary tree. Scientists will also incorporate climate data into this phase and explain the impacts of climate change on biodiversity.

The international project began in 2018 and initially involved the US, UK, Denmark and China, with the majority of DNA samples sequenced at facilities in China and the UK. Today, 28 countries on six continents have signed up. Most of the DNA material isolated from individual species is now sequenced on site, which reduces transport costs while increasing fidelity.

Not all participants have easy access to DNA sequencing equipment. One institution, the Wellcome Sanger Institute, has developed a portable DNA sequencing lab that could help scientists working in rural areas capture the genetic blueprints of exotic plants and animals. The lab has sequenced the DNA of a sunflower species in Africa with potential medicinal properties, among other samples from exotic locations.

EBP follows in the footsteps of other global projects that aim to sequence terrestrial microbes, such as the National Microbiome Initiative or the Earth Microbiome Project. Also once considered moonshots, they secured funding from government agencies and private investment.

Despite the enthusiasm of its participants, the EBP is still billions of dollars short of what it needs to reach full completion. But the cost of the project, initially estimated in the billions of dollars, may turn out to be much lower.

With more efficient and cheaper methods of genetic sequencing, the current cost of the first phase is expected to be half of the original estimate – about $265 million.

It’s still a hefty sum, but for the participants, the resulting database and methods are worth it. “We now have a common forum to learn together how to produce genomes of the highest possible quality,” Alexandre Aleixo of the Vale Institute of Technology, who participated in the project, told Science.

Given the impact that bacterial genetics has already had on biomedicine and biofuels, deciphering eukaryotic DNA is likely to spark further inspiration. Ultimately, the project relies on global cooperation for the benefit of humanity.

“The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can only be realized through a coordinated international effort,” the team wrote.

Image Credit: M. Richter on Pixabay
