Interestingly, the approach to generating and analyzing the Kalahari hunter-gatherer whole-genome sequence presented in the paper differs from that used for other recently published Asian, Yoruban, and European individual genomes. "We recognized that the genomes of the southern African participants in this study would diverge more from the human reference genome than other publicly available genome sequences," explained Stephan C. Schuster, lead author and Professor of Biochemistry and Molecular Biology at Penn State University. "As a result, the goal was to generate data of sufficient quality for de novo genome assembly rather than simply mapping against the human reference."
To generate the massive amounts of high-quality data required for de novo assembly of a human genome, the researchers turned to the GS FLX System with long-read GS FLX Titanium Series chemistry. "The long reads were critical to identifying the full range of genetic variation in this unique population," explained Schuster. "In the end, we were able to generate the complete sequence of one Kalahari Bushman genome at 10-fold coverage, using both shotgun and 17 Kb span paired-end reads, as well as the protein-coding regions of all five participants' genomes at 16-fold coverage using target enrichment with NimbleGen Sequence Capture arrays."
The study results were consistent with the belief that southern Africans are among the most divergent of all human populations. The researchers identified more SNPs in their genomes than in other individual human genomes sequenced to date, as well as thousands of novel SNPs. "These results will be a rich resource for future work, providing many new candidate functional sites that have not been included in whole-genome association studies," said Vanessa Hayes, a project co-leader and Group Leader of Cancer Genetics at the Children's Cancer Institute Australia for Medical Research at the University of New South Wales. "Ultimately, we hope that these sequences will serve as an important cultural and genetic archive of this indigenous population, one of the last remaining hunter-gatherer societies."
"This study exemplifies the power of the 454 Sequencing System to comprehensively analyze whole genomes or targeted regions to identify both known and novel variants. We applaud the research team for recognizing the importance of de novo assembly, particularly for this genetically distinct human population," said Michael Egholm, Vice President of R&D and Chief Technology Officer of 454 Life Sciences. "In this study, the combination of long GS FLX Titanium reads and NimbleGen Sequence Capture Exome arrays also allowed the researchers to obtain a high-resolution picture of the protein-coding regions of all five study participants, offering an economical alternative whole genome sequencing method."
For more information on 454 Sequencing Systems and NimbleGen Sequence Capture arrays, visit www.454.com or www.nimblegen.com.