Wikipedia Drug Safety Advisory Committee: Distilling a Drug Adverse Effect Reference Set Using Wisdom of the Crowd
Large datasets of relational medical data, such as the adverse effects of drugs or vaccines, typically attain their size by relying on automatic or semi-automatic generation methods. This often compromises the precision of the generated data, a problem that can be at least partially alleviated by having experts curate the data. Since expert review of a large dataset can be costly and time-consuming, we suggest using Wikipedia for this task: that is, augmenting the automatic generation step with an automatic curation step based on the expert knowledge accumulated in Wikipedia. We use the method to curate two large adverse drug effect datasets and show that the resulting datasets have much higher precision than the datasets they were derived from.