Automatic Misinformation Detection About COVID-19 in Brazilian Portuguese WhatsApp Messages
During the coronavirus pandemic, the problem of misinformation arose once again, quite intensely, through social networks. In Brazil, one of the primary sources of misinformation is the messaging application WhatsApp. However, due to the private nature of WhatsApp messaging, there are still few misinformation detection methods developed specifically for this platform. In this context, automatic misinformation detection (MID) about COVID-19 in Brazilian Portuguese WhatsApp messages becomes a crucial challenge. In this work, we present COVID-19.BR, a data set of WhatsApp messages about coronavirus in Brazilian Portuguese, collected from Brazilian public groups and manually labeled. We then investigate different machine learning methods in order to build an efficient MID model for WhatsApp messages. So far, our best result is an F1 score of 0.774, a figure limited by the predominance of short texts. When texts with fewer than 50 words are filtered out, the F1 score rises to 0.85.
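The length filter and the F1 metric mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names (`word_count`, `filter_short`, `f1_score`), whitespace tokenization, and the convention that label 1 marks misinformation are all assumptions made for illustration.

```python
# Illustrative sketch only; names and conventions here are assumptions,
# not taken from the COVID-19.BR work.

def word_count(text: str) -> int:
    """Count words by splitting on whitespace (assumed tokenization)."""
    return len(text.split())


def filter_short(messages, labels, min_words=50):
    """Keep only messages with at least `min_words` words, with their labels."""
    kept = [(m, y) for m, y in zip(messages, labels) if word_count(m) >= min_words]
    return [m for m, _ in kept], [y for _, y in kept]


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In such a setup, a classifier would be evaluated once on all messages and once on the output of `filter_short`, and the two `f1_score` values compared.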