Cyber Threat Discovery from Dark Web
In the darknet, hackers are constantly sharing information with each other and learning from each other. These conversations in online forums for example can contain data that may help assist in the discovery of cyber threat intelligence. Cyber Threat Intelligence (CTI) is information or knowledge about threats that can help prevent security breaches in cyberspace. In addition, monitoring and analysis of this data manually is challenging because forum posts and other data on the darknet are high in volume and unstructured. This paper uses descriptive analytics and predicative analytics using machine learning on forum posts dataset from darknet to discover valuable cyber threat intelligence. The IBM Watson Analytics and WEKA machine learning tool were used. Watson Analytics showed trends and relationships in the data. WEKA provided machine learning models to classify the type of exploits targeted by hackers from the form posts. The results showed that Crypter, Password cracker and RATs (Remote Administration Tools), buffer overflow exploit tools, and Keylogger system exploits tools were the most common in the darknet and that there are influential authors who are frequent in the forums. In addition, machine learning helps build classifiers for exploit types. The Random Forest classifier provided a higher accuracy than the Random Tree and Naïve Bayes classifiers. Therefore, analyzing darknet forum posts can provide actionable information as well as machine learning is effective in building classifiers for prediction of exploit types. Predicting exploit types as well as knowing patterns and trends on hackers’ plan helps defend the cyberspace proactively.