Constructing a Genome-Wide LD Map of Wild A. gambiae Using Next-Generation Sequencing
Anopheles gambiae is the major malaria vector in Africa. Examining the molecular basis of A. gambiae traits requires knowledge of both genetic variation and a genome-wide linkage disequilibrium (LD) map of wild A. gambiae populations from malaria-endemic areas. We sequenced the genomes of nine wild A. gambiae mosquitoes individually using next-generation sequencing technologies and detected 2,219,815 common single nucleotide polymorphisms (SNPs), 88% of which are novel. SNPs are not evenly distributed across A. gambiae chromosomes. The regions of low SNP frequency overlap heterochromatin and chromosome inversion domains, consistent with the lower recombination rates in these regions. From these high-throughput sequencing data, we extracted nearly one million SNPs that were genotyped correctly in all individual mosquitoes with 99.6% confidence. Based on these SNP genotypes, we constructed a genome-wide LD map for wild A. gambiae from malaria-endemic areas in Kenya and made it available through a public website. The average size of LD blocks is less than 40 bp. Several large LD blocks were also discovered clustered around the para gene, which is consistent with the effect of insecticide selective sweeps. The SNPs and the LD map will be valuable resources for scientific communities to dissect the A. gambiae genome.
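As a rough illustration of how pairwise LD can be estimated from unphased SNP genotypes, the sketch below computes r² as the squared Pearson correlation between genotype dosages (0, 1, or 2 copies of the alternate allele). This is a common genotype-based estimator and is not necessarily the method used in this study; the function name `genotype_r2` and the nine-individual genotype vectors are hypothetical and chosen only for illustration.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two SNPs' genotype dosages (0, 1, 2).

    A simple genotype-based LD estimate for unphased data (an assumption of
    this sketch, not the study's stated method).
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    n = len(g1)
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel in r^2).
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return float("nan")  # a monomorphic SNP carries no LD information
    return (cov * cov) / (v1 * v2)


# Hypothetical genotypes for nine individuals (illustrative only).
snp_a = [0, 1, 2, 0, 1, 2, 0, 1, 2]
snp_b = [0, 1, 2, 0, 1, 2, 0, 1, 2]  # identical to snp_a: complete LD
snp_d = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # uncorrelated with snp_a

r2_linked = genotype_r2(snp_a, snp_b)
r2_unlinked = genotype_r2(snp_a, snp_d)
```

In an LD map, runs of adjacent SNPs whose pairwise values stay above a chosen threshold would be grouped into blocks; with only nine individuals, any such estimate carries considerable sampling noise.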