Identification and prioritisation of causal variants in human genetic disorders from exome or whole genome sequencing data
AbstractWith genome sequencing entering the clinics as diagnostic tool to study genetic disorders, there is an increasing need for bioinformatics solutions that enable precise causal variant identification in a timely manner.BackgroundWorkflows for the identification of candidate disease-causing variants perform usually the following tasks: i) identification of variants; ii) filtering of variants to remove polymorphisms and technical artifacts; and iii) prioritization of the remaining variants to provide a small set of candidates for further analysis.MethodsHere, we present a pipeline designed to identify variants and prioritize the variants and genes from trio sequencing or pedigree-based sequencing data into different tiers.ResultsWe show how this pipeline was applied in a study of patients with neurodevelopmental disorders of unknown cause, where it helped to identify the causal variants in more than 35% of the cases.ConclusionsClassification and prioritization of variants into different tiers helps to select a small set of variants for downstream analysis.