Library preparation and MiSeq sequencing for the genotyping-by-sequencing of the Huntington disease HTT exon one trinucleotide repeat and the quantification of somatic mosaicism
Abstract

Huntington disease (HD) is an autosomal dominant neurodegenerative disorder caused by the expansion of a CAG repeat in the first exon of the _HTT_ gene. Affected individuals inherit more than 40 repeats, and the CAG repeat is genetically unstable in both the germline and the soma. Molecular diagnosis and genotyping of the CAG repeat are traditionally performed by estimating PCR fragment size. However, this approach is complicated by the presence of an adjacent polymorphic CCG repeat, and it provides no information on variant repeats, flanking sequence variants or the degree of somatic mosaicism. To overcome these limitations, we have developed an amplicon-sequencing protocol that allows hundreds of samples to be sequenced in a single MiSeq run. The composition of the _HTT_ exon one trinucleotide repeat locus can be determined from the resulting MiSeq sequencing reads. With sufficient sequencing depth, the same data can also be used to quantify the degree of somatic mosaicism of the _HTT_ CAG repeat in the tissue analysed.
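As a rough illustration of how repeat length and mosaicism might be read out of amplicon reads, the Python sketch below counts the longest uninterrupted CAG run in each read, tabulates the distribution of run lengths, and reports the fraction of reads longer than the modal length. It is not the analysis pipeline used in this protocol. The function names and the "expansion fraction" metric are assumptions made for the example, and the sketch assumes the reads come from a single allele, are already in the CAG-strand orientation, and are free of PCR and sequencing artefacts.

```python
import re
from collections import Counter

# One or more consecutive CAG units (hypothetical, simplified repeat model).
CAG_RUN = re.compile(r"(?:CAG)+")


def longest_cag_run(read: str) -> int:
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    runs = CAG_RUN.findall(read.upper())
    return max((len(run) // 3 for run in runs), default=0)


def repeat_length_distribution(reads) -> Counter:
    """Count how many reads carry each CAG run length."""
    return Counter(longest_cag_run(read) for read in reads)


def expansion_fraction(dist: Counter) -> float:
    """Fraction of reads longer than the modal run length.

    Illustrative metric only; ties for the mode go to the shorter length.
    """
    total = sum(dist.values())
    if total == 0:
        return 0.0
    modal = max(dist, key=lambda length: (dist[length], -length))
    longer = sum(n for length, n in dist.items() if length > modal)
    return longer / total


if __name__ == "__main__":
    # Toy reads, not real data: two with 5 CAGs, one with 6, one with none.
    toy_reads = ["AA" + "CAG" * 5 + "CAACAG", "CAG" * 5, "CAG" * 6, "GGG"]
    dist = repeat_length_distribution(toy_reads)
    print(dict(dist))
    print(expansion_fraction(dist))
```

In a real heterozygous sample the two alleles would first have to be separated, and reads that do not span the whole repeat would have to be excluded, before a per-allele distribution of this kind is meaningful.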