An international consortium of researchers has completed the pilot phase of the 1000 Genomes Project, a public-private effort to systematically investigate and map genetic differences among individuals and populations, work that could have huge medical implications.
The aim of the pilot phase of the project was to design, develop and compare different strategies for genome-wide sequencing with high-throughput platforms.
As well as coming up with new methods of analysing genomic data, the researchers used several techniques to sequence the whole genomes of 179 individuals, generating a catalogue of 8m previously unknown variants affecting single nucleotides and around 1m structural variants due to small insertions or deletions of DNA, according to Rasmus Nielsen of the Department of Integrative Biology and Statistics at the University of California, Berkeley (Nature, 2010, 467, 1050).
The pilot involved three projects whose results were combined: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations (Nature, doi: 10.1038/nature09534).
‘The amount of information delivered by this first stage of the project is remarkable,’ said co-author Richard Durbin of the Sanger Institute in the UK. ‘In less than two years, we identified 15m single-letter changes, 1m small deletions or insertions and 20,000 larger variants. The majority of these variants – around 8m – had never been seen before. This is the largest catalogue of its kind, and having it in the public domain will help maximise the efficiency of human genetics research.’ The methods and data resulting from the pilot programme will support the next phase of human genetic research, the researchers say.
The 1000 Genomes Project, which was launched in 2008 and uses next-generation sequencing technology to compare the genomes of people from different parts of the world, builds upon the International HapMap Project (C&I 2005, 21). It is hoped that the comprehensive sets of genetic variants discovered will shed light on the relationship between genotype and phenotype, enabling researchers to associate physical characteristics and traits, such as disease susceptibility and response to drugs, with particular regions on the genome.
‘For most of the many diseases known to have a genetic component, currently we don’t understand the basis of that genetic component,’ says Durbin. ‘Getting at that is a major current goal.’
Improved understanding of the genetic differences between people also lays the groundwork for personalised medicine. ‘Finding comprehensive sets of variants will lead to better localisation of chromosome regions that affect disease risk. Fairly quickly – the next few years – this information will become clinically useful to predict a person’s risk of getting various diseases, based on their variants,’ explains co-author Lisa Brooks, from the US National Human Genome Research Institute. ‘Longer term, localising disease-associated regions and then figuring out which genes and variants cause the disease risk will allow better prediction and the tailoring of therapies to particular variants in an individual.’
While the results of the project will not directly throw new light onto what causes certain diseases, they establish an information base that will make disease genetics research easier and more powerful, says Durbin. ‘This is a case of [DNA sequencing] technology enabling science, enabling applied research, enabling improvements in therapy.’
The 1000 Genomes Project, working in nine different centres, is expected to sequence 2500 genomes from five large population groups by the end of 2012.
Meanwhile, around 30,000 genomes are expected to be sequenced as part of large academic projects by 2011 (Nature, 2010, 467, 1026). ‘The technology is improving quickly, allowing many more genomes to be sequenced for a fixed cost,’ explains Brooks. ‘As the value of this sequencing becomes clear, more resources are being allocated to it, increasing the numbers even more.’