Application Research of KNN Algorithm Based on Clustering in Big Data Talent Demand Information Classification
With the growth of massive data in the current mobile Internet, network recruitment is gradually growing into a new recruitment channel. How to effectively mine available information in the massive network recruitment data has become the technical bottleneck of current education and social supply and demand development. The renewal of talent demand information is carried out every day, which produces a large amount of text data. How to manage these talents’ demand information reasonably becomes more and more important. Artificial classification is time-consuming and laborious, which is unrealistic naturally. Therefore, using automatic text categorization technology to classify and manage this information becomes particularly important. To break through the bottleneck of this technology, a heuristic KNN text categorization algorithm based on ABC (artificial bee colony) is proposed to adjust the weight of features, and the similarity between test observation and training observation is measured by using the method of fuzzy distance measurement. Firstly, the recruitment information is segmented and feature selection and noise data elimination are carried out by using term frequency-inverse document frequency (TF-IDF) algorithm and AP (affinity propagation) clustering algorithm. Finally, the text information is classified by using KNN algorithm combined with heuristic search and fuzzy distance measurement. The experimental results show that this method effectively solves the problem of poor stability and low classification accuracy of traditional KNN algorithm in text categorization method for talent demand.