Refining Bulk Segregant Analyses: Ontology-Mediated Discovery of Flowering Time Genes in Brassica oleracea
Bulk segregant analysis (BSA) can help identify quantitative trait loci (QTLs), but the regions it flags often contain many genes that are functionally irrelevant to the trait of interest. Here we develop a Gene Ontology-mediated approach to zoom in on specific markers implicated in flowering time from among the QTLs identified by BSA of the giant woody Jersey kale, phenotyped in four bulks by flowering onset. Our BSA yielded tens of thousands of candidate genes. We reduced this number by two orders of magnitude by retaining only genes annotated with terms contained within relevant subgraphs of the Gene Ontology. A further enrichment test pointed to the pathway for circadian rhythm in plants. The genes enriching this pathway are known from previous research to regulate flowering time. Some of these genes also showed functionally significant variation compared to Arabidopsis. We validated our ontology-mediated results with a more targeted, homology-based approach. However, the ontology-mediated approach produced additional genes of putative importance, showing that it aids exploration and discovery. We view our method as potentially applicable to the study of other complex traits and therefore make our workflows available as open-source code and a reusable Docker container.
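The ontology-based filtering step can be illustrated with a minimal sketch. It is not the published workflow. It only shows the general idea of keeping candidate genes whose annotations fall inside the subgraph rooted at a relevant term. The term identifiers, gene names, and the helper functions `subgraph_terms` and `filter_candidates` are hypothetical placeholders, and the toy graph stands in for the real Gene Ontology.

```python
from collections import deque

# Hypothetical toy ontology: child term -> list of parent terms.
# These identifiers are placeholders, not real Gene Ontology IDs.
PARENTS = {
    "T:flowering": ["T:reproduction"],
    "T:photoperiod_flowering": ["T:flowering"],
    "T:vernalization": ["T:flowering"],
    "T:root_growth": ["T:growth"],
}


def subgraph_terms(root, parents):
    """Return the root term plus all of its descendants (breadth-first)."""
    # Invert the child -> parents map into parent -> children.
    children = {}
    for child, parent_list in parents.items():
        for parent in parent_list:
            children.setdefault(parent, set()).add(child)

    seen = {root}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def filter_candidates(annotations, relevant):
    """Keep genes that carry at least one term from the relevant subgraph."""
    return sorted(
        gene for gene, terms in annotations.items() if relevant & set(terms)
    )


# Hypothetical candidate genes from a BSA, with their term annotations.
ANNOTATIONS = {
    "gene_a": ["T:photoperiod_flowering"],
    "gene_b": ["T:root_growth"],
    "gene_c": ["T:vernalization", "T:root_growth"],
    "gene_d": [],
}

RELEVANT = subgraph_terms("T:flowering", PARENTS)
SHORTLIST = filter_candidates(ANNOTATIONS, RELEVANT)
print(SHORTLIST)
```

In this toy setting, only the genes annotated at or below the hypothetical flowering term survive the filter. In a real analysis the graph would be loaded from a Gene Ontology release and the annotations from the species' annotation files.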