Three decades ago, scientists from across the world undertook a hugely ambitious endeavour. Their goal was to read the entire sequence of base pairs (molecular letters G, A, C and T) that make up human DNA, and to identify all of the genes within the human genome.
This project was a monumental undertaking that took over a decade and cost roughly $3 billion, but it was successful. Since then, genome sequencing techniques have advanced so far that it is now possible to fully sequence the human genome for less than $1,000. Now that the ability to know one's own genetic makeup is available to many of us, a question arises: what should we actually do with this information?
Firstly, let’s not understate the impact that genome sequencing has had. It has fundamentally changed multiple fields: it has allowed us to link genetic variants to diseases like cancer, advanced our understanding of human evolution, improved drug design and forensic science, and made it possible to rapidly genotype viruses to understand their pathology and develop treatments.
However, while we may now have the technology to rapidly and accurately sequence the genome, we are still missing a lot of information when it comes to actually interpreting what that sequence means. There are regions of DNA for which the function is unknown and genes with roles that are not well understood. There are even parts of the genome that were never actually fully sequenced.
Getting your genome sequenced might tell you if you have a predisposition to a given disease and help you adjust your lifestyle accordingly. However, genetic susceptibility to some of the most common and deadly diseases (like heart disease and diabetes) is the product of thousands of minute genetic variations, and we are still in the process of working out how to add these up to produce a useful risk score. Thus, for the majority of healthy individuals, there’s really not much you can actually do with your genetic sequence today: the data is available, but our lack of understanding of what that data means holds us back.
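To make the idea of "adding these up" concrete, here is a minimal sketch of one common simplification: a risk score computed as a weighted sum, where each variant contributes its weight multiplied by the number of risk alleles a person carries (0, 1 or 2). The function name, variant names and weights below are all invented for illustration; real scores draw on far more variants and on carefully estimated weights, which is exactly the part still being worked out.

```python
from typing import Mapping


def polygenic_risk_score(dosages: Mapping[str, int],
                         weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele counts over all variants that have a weight.

    Variants missing from `dosages` are treated as carrying zero risk alleles.
    """
    return sum(weights[v] * dosages.get(v, 0) for v in weights)


# Hypothetical variant IDs and effect sizes, invented for illustration only.
example_weights = {"variant_a": 0.02, "variant_b": -0.01, "variant_c": 0.05}

# Number of risk alleles this (imaginary) person carries at each variant.
example_dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(example_dosages, example_weights)
print(f"Illustrative risk score: {score:.3f}")
```

The arithmetic itself is trivial. The difficulty described above lies in choosing which variants to include and what weight each deserves.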
So, how do we move forward? Identifying the biological functions of human genes is an ongoing process, and one to which Crispr gene editing has given a significant boost. Crispr lets scientists make small edits to the DNA sequence, adding, removing or altering genetic variants to study their effects in cell lines in the lab. Once we have generated this data, the next step will be to train machine learning algorithms to look for patterns and link genetic variants to their biological functions. In theory, this could remove the need to conduct Crispr experiments at all, as a machine learning algorithm should be able to predict the effects of a mutation from masses of existing data.
Many of the obstacles currently limiting the application of genome sequencing involve data analysis and computation. In a way, this is a good sign – a symptom of success. Sequencing technology has progressed far enough that we now have more data than we know what to do with. That means that as other genetic technologies continue to be developed, we will already have the foundational tools we need to apply them to human disease.
30 Years Since the Human Genome Project Began, What’s Next?: https://www.wired.com/story/30-years-since-the-human-genome-project-began-whats-next/