Gene discovery in Atlantic Forest plant species using GR-RSC simplified genomes
The Atlantic Forest is one of the most important biodiversity hotspots in the world. Nevertheless, its 20,000 plant species are poorly characterized genetically, which could undermine conservation efforts and bioprospecting of natural products. We used a genome reduction using restriction site conservation (GR-RSC) technique to minimize sequencing effort and rapidly build a databank of gene sequences from 35 plant species from the Atlantic Forest in a private natural protected area in Southwest Brazil. After Illumina sequencing and standard bioinformatics processing, we produced more than 66 million super reads, of which 11 million (17\%) were annotated using Diamond and the UniRef90 database and 55 million were 'No hit'. To test the usefulness of this databank for gene discovery, we selected 17 enzymes from 2 secondary metabolite synthesis pathways that are both representative of important biological processes in plants and of industrial interest. All 17 genes were detected in at least one of the 35 species, and every species exhibited at least one of the genes. Eight of the 35 species exhibited all 17 genes. These results show that genome simplification by restriction enzymes can be applied to the preliminary screening of thousands of species in tropical forests, generating useful databanks for scientific and entrepreneurial activities in both conservation biology and bioprospecting.