Understanding the Function of a Locus Using the Knowledge Available at Single-Nucleotide Polymorphisms
Understanding the function of a locus remains a challenge in molecular biology. Although large amounts of molecular data have been generated in recent decades, it is still difficult to see how these data relate to one another at a given locus. In this study, we describe an analytical workflow that addresses this problem using the knowledge available at the single-nucleotide polymorphism (SNP) level. The underlying algorithm uses SNPs as connectors to link biological entities and identifies correlations between them through a joint bioinformatics and statistics approach. We demonstrate its application in finding the mechanism by which a mutation causes a phenotype and in revealing the path by which a gene is regulated and affects a phenotype. We have implemented the workflow as publicly available shell scripts. Our approach provides a basic framework for addressing the information overload surrounding the annotation of a locus and is a step toward repurposing genome-wide association study (GWAS) data for new applications.