Analysis Model for Identifying Negative Posts Based on Social Media
Cyberbullying is an unlawful act committed on social media, for example on Twitter. It is difficult to detect automatically, so in practice someone has to report a case before it is noticed. Identifying cyberbullying tweets means classifying tweets that contain bullying content. Several previous studies produce only a binary output: whether a tweet is positive or negative, or bullying or not. Having only two classes makes the classification results hard to analyze. In this research, a text mining model based on Naïve Bayes was developed that categorizes content in more detail. It does not only determine whether content is bullying or not, but also classifies it into five detailed categories. Classification is based on a labeled dataset, and this study proposes the scheme used to build that dataset. The contribution of this research is an algorithm for collecting and labeling an Indonesian-language dataset and then classifying the types of sarcasm, namely animal, psychology and stupidity, disabled person, attitude, and general bullying. The research hypothesis is that analysis of the classification results can be improved by dividing bullying content into these five classes. The dataset was collected by the researcher, and labeling was done manually based on a literature study. The results show that the model can classify cyberbullying content on social media with 99.15% accuracy.
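To make the five-class idea concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing, written in plain Python. It is not the authors' implementation: the abstract does not specify the Naïve Bayes variant, the preprocessing, or the features used. Only the five category names come from the abstract. The class name `MultinomialNaiveBayes`, the helper `accuracy`, and the placeholder tokens such as `animal_term` are hypothetical and stand in for real, manually labeled Indonesian tweets.

```python
import math
from collections import Counter, defaultdict

# The five category names follow the abstract; everything else is illustrative.
CATEGORIES = ["animal", "psychology_stupidity", "disabled_person",
              "attitude", "general"]


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.token_counts = defaultdict(Counter)  # token frequencies per class
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        """Return log P(class) + sum of log P(token | class) for each class."""
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            score = math.log(doc_count / self.total_docs)
            total_tokens = sum(self.token_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # skip tokens never seen during training
                count = self.token_counts[label][token]
                score += math.log((count + 1) / (total_tokens + vocab_size))
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


def accuracy(model, documents, labels):
    """Fraction of documents whose predicted class equals the given label."""
    correct = sum(model.predict(d) == y for d, y in zip(documents, labels))
    return correct / len(labels)


# Synthetic placeholder data (NOT the study's dataset): two tokenized
# "tweets" per category, using made-up marker tokens.
train_docs = [
    ["you", "animal_term"], ["such", "animal_term", "face"],
    ["so", "stupidity_term"], ["totally", "stupidity_term", "brain"],
    ["disability_term", "person"], ["walk", "disability_term"],
    ["attitude_term", "behaviour"], ["your", "attitude_term"],
    ["general_term", "you"], ["go", "general_term", "away"],
]
train_labels = [category for category in CATEGORIES for _ in range(2)]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["you", "animal_term"]))
print(accuracy(model, train_docs, train_labels))
```

Scores are accumulated in log space so that long tweets do not cause floating-point underflow, and smoothing keeps a token that is absent from one class from driving that class's probability to zero. An accuracy figure like the one reported in the abstract would be measured on labeled tweets held out from training, not on the training examples as in this toy run.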