Hybrid feature selection model based on machine learning and knowledge graph

2021 ◽  
Vol 2079 (1) ◽  
pp. 012028
Author(s):  
Xiaoqing Peng ◽  
Yong Shuai ◽  
Yaxi Gan ◽  
Yaokai Chen

Abstract: Current feature selection algorithms cannot adapt to both supervised and unsupervised learning data, and they offer poor feature interpretability. To address this, the paper proposes a hybrid feature selection model based on machine learning and knowledge graphs. The model combines supervised learning algorithms, unsupervised learning algorithms, and knowledge graph technology to model the problem from the perspective of both data features and text features. First, data-based feature weights are obtained from the machine learning models. Next, text-based weights are obtained using knowledge graph technology. The two weight sets are then combined into a feature matrix that reflects both the data and the text features and is readily interpretable. Finally, a case analysis indicates that the proposed method is effective and interpretable.
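The weight-combination step can be illustrated with a minimal sketch. The abstract does not state how the two weight sets are merged, so the min-max normalization and the convex combination controlled by `alpha` are assumptions made for illustration, as are the function names, feature names, and weight values.

```python
def min_max_normalize(weights):
    """Rescale a {feature: weight} mapping to the range [0, 1]."""
    lo, hi = min(weights.values()), max(weights.values())
    if hi == lo:
        # All weights equal: treat every feature as equally important.
        return {name: 1.0 for name in weights}
    return {name: (w - lo) / (hi - lo) for name, w in weights.items()}


def combine_weights(data_weights, text_weights, alpha=0.5):
    """Blend data-based and text-based weights (assumed rule, not the paper's).

    alpha = 1.0 uses only the data-based weights, alpha = 0.0 only the
    text-based ones. Features missing from either set are dropped.
    """
    data_n = min_max_normalize(data_weights)
    text_n = min_max_normalize(text_weights)
    shared = data_n.keys() & text_n.keys()
    return {f: alpha * data_n[f] + (1.0 - alpha) * text_n[f] for f in shared}


def top_features(combined, k):
    """Return the k highest-scoring features, ties broken by name."""
    ranked = sorted(combined, key=lambda f: (-combined[f], f))
    return ranked[:k]


# Hypothetical inputs: importances from a learned model and relevance
# scores derived from a knowledge graph.
data_weights = {"age": 0.40, "income": 0.10, "zip_code": 0.30, "tenure": 0.20}
text_weights = {"age": 0.20, "income": 0.90, "zip_code": 0.10, "tenure": 0.60}

combined = combine_weights(data_weights, text_weights, alpha=0.5)
print(top_features(combined, 2))
```

Normalizing before blending keeps one weight set from dominating only because it is on a larger scale. Other combination rules, such as rank aggregation, would fit the same description.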

Author(s):  
M. Govindarajan

Big data mining involves knowledge discovery from large data sets. The purpose of this chapter is to provide an analysis of the different machine learning algorithms available for performing big data analytics. The algorithms fall into three key categories: supervised, unsupervised, and semi-supervised. Supervised learning algorithms are trained with a complete set of data and are therefore used for prediction and forecasting. Example algorithms include logistic regression and the back-propagation neural network. Unsupervised learning algorithms start learning from scratch and are therefore used for clustering. Example algorithms include the Apriori algorithm and K-Means. Semi-supervised learning combines both approaches: the algorithms are partly trained and partly learn without training.
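The contrast between the first two categories can be sketched on a toy one-dimensional data set. This is an illustrative example, not code from the chapter: a nearest-centroid classifier stands in for the supervised case because it needs labels, and a small K-Means loop stands in for the unsupervised case. The function names and data are hypothetical.

```python
def nearest_centroid_fit(points, labels):
    """Supervised: learn one centroid per known label."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def k_means_1d(points, k, iters=20):
    """Unsupervised: find k cluster centers without using any labels."""
    xs = sorted(points)
    # Deterministic start: spread the initial centers over the sorted data.
    centers = [xs[int((j + 0.5) * len(xs) / k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = nearest_centroid_fit(points, labels)   # uses the labels
print(nearest_centroid_predict(centroids, 1.4))
print(k_means_1d(points, 2))                        # ignores the labels
```

The supervised model returns a named class for a new point, whereas K-Means only returns anonymous cluster centers that still have to be interpreted.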


Supervised and unsupervised machine learning methods have been applied successfully to real-time problems in many domains. Indian music is based on the Raga structure. A Raga is a melodic framework for composition and improvisation. Identifying and indexing the Raga of Indian music data will improve the efficiency and accuracy of retrieval expected by e-learners, composers, and classical music listeners. Identifying a Raga is a very difficult task for a naïve user, and machine learning algorithms are a promising way to address it. The paper demonstrates K-means and agglomerative clustering as unsupervised methods, while K Nearest Neighbor, Decision Tree, Support Vector Machine, and Naïve Bayes classifiers are implemented as supervised methods. The data are partitioned 70:30 into training and testing sets. Pitch Class Distribution features are extracted by identifying the pitch of every frame in an audio signal using the autocorrelation method. A comparison of these algorithms shows that the supervised learning methods outperformed the unsupervised ones.
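The feature extraction step can be sketched as follows. This is a simplified illustration, not the paper's implementation: the frame size, the sample rate, the 80 to 1000 Hz search range, the non-overlapping frames, and the 12-bin equal-tempered pitch classes referenced to A4 = 440 Hz are all assumptions, and the function names are hypothetical. Real recordings would also need voicing detection and protection against octave errors.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the pitch of one frame from its autocorrelation peak."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # None means no positive peak was found (for example, a silent frame).
    return sample_rate / best_lag if best_lag else None


def pitch_class(freq_hz):
    """Map a frequency to a pitch class 0..11 (0 = C, 9 = A)."""
    midi_note = round(69 + 12 * math.log2(freq_hz / 440.0))
    return midi_note % 12


def pitch_class_distribution(signal, sample_rate, frame_size=400):
    """Normalized histogram of pitch classes over non-overlapping frames."""
    counts = [0] * 12
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        freq = estimate_pitch_hz(signal[start:start + frame_size], sample_rate)
        if freq is not None:
            counts[pitch_class(freq)] += 1
    voiced = sum(counts)
    return [c / voiced for c in counts] if voiced else [0.0] * 12


def sine_wave(freq_hz, n_samples, sample_rate):
    """Synthetic test tone used in place of real audio."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


# Two frames of A4 followed by one frame of C4, sampled at 8 kHz.
signal = sine_wave(440.0, 800, 8000) + sine_wave(261.63, 400, 8000)
pcd = pitch_class_distribution(signal, 8000)
print([round(p, 3) for p in pcd])
```

The resulting 12-value vector is the kind of feature that could then be passed to the clustering methods or classifiers listed above.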


2020 ◽  
Vol 86 (1) ◽  
pp. 35-41
Author(s):  
Mathieu Quenu ◽  
Steven A Trewick ◽  
Fabrice Brescia ◽  
Mary Morgan-Richards

Abstract Size and shape variations of shells can be used to identify natural phenotypic clusters and thus delimit snail species. Here, we apply both supervised and unsupervised machine learning algorithms to a geometric morphometric dataset to investigate size and shape variations of the shells of the endemic land snail Placostylus from New Caledonia. We sampled eight populations of Placostylus from the Isle of Pines, where two species of this genus reportedly coexist. We used neural network analysis as a supervised learning algorithm and Gaussian mixture models as an unsupervised learning algorithm. Using a training dataset of individuals assigned to species using nuclear markers, we found that supervised learning algorithms could not unambiguously classify all individuals of our expanded dataset using shell size and shape. Unsupervised learning showed that the optimal division of our data consisted of three phenotypic clusters. Two of these clusters correspond to the established species Placostylus fibratus and P. porphyrostomus, while the third cluster was intermediate in both shape and size. Most of the individuals that were not clearly classified using supervised learning were classified to this intermediate phenotype by unsupervised learning, and most of these individuals came from previously unsampled populations. These results may indicate the presence of persistent putative-hybrid populations of Placostylus in the Isle of Pines.
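How an unsupervised method arrives at an "optimal division" into clusters can be sketched with a one-dimensional Gaussian mixture model. The abstract does not say which selection criterion the authors used, so the Bayesian information criterion (BIC) here is an assumption, as are the function names, the variance initialization heuristic, and the synthetic measurements. The actual study used multivariate geometric morphometric data.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def e_step(xs, weights, means, variances):
    """Return per-point responsibilities and the total log-likelihood."""
    resp, loglik = [], 0.0
    for x in xs:
        logs = [math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # log-sum-exp for numerical stability
        total = top + math.log(sum(math.exp(l - top) for l in logs))
        loglik += total
        resp.append([math.exp(l - total) for l in logs])
    return resp, loglik


def fit_gmm_1d(data, k, iters=100, var_floor=1e-4):
    """Fit a k-component mixture with EM and report its BIC."""
    xs = sorted(data)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    # Deterministic start: means spread over the sorted data, and a
    # shrunken overall variance (a heuristic chosen for this sketch).
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(var_all / k ** 2, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = e_step(xs, weights, means, variances)[0]
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs))
            variances[j] = max(spread / nj, var_floor)
            weights[j] = nj / n
    loglik = e_step(xs, weights, means, variances)[1]
    # Free parameters: k means, k variances, k - 1 mixing weights.
    bic = -2.0 * loglik + (3 * k - 1) * math.log(n)
    return {"k": k, "means": means, "bic": bic}


def select_model(data, candidate_ks):
    """Pick the number of components with the lowest BIC."""
    return min((fit_gmm_1d(data, k) for k in candidate_ks),
               key=lambda model: model["bic"])


# Synthetic "shell size" values forming three groups (hypothetical data).
sizes = [0.7, 0.85, 0.95, 1.05, 1.2, 1.3,
         4.7, 4.9, 5.0, 5.1, 5.2, 5.4,
         8.6, 8.8, 9.0, 9.1, 9.3, 9.4]
best = select_model(sizes, [1, 2, 3])
print(best["k"], [round(m, 2) for m in best["means"]])
```

A lower BIC rewards a better fit while penalizing extra components, which is one common way to let the data suggest that a third, intermediate cluster is warranted.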


2020 ◽  
Vol 15 ◽  
Author(s):  
Shuwen Zhang ◽  
Qiang Su ◽  
Qin Chen

Abstract: Major animal diseases pose a great threat to animal husbandry and human beings. With deepening globalization and the growing abundance of data resources, the prediction and analysis of animal diseases using big data are becoming increasingly important. The focus of machine learning is to enable computers to learn from data and to use what they have learned to analyze and predict. This paper first introduces the animal epidemic situation and machine learning. It then briefly introduces the application of machine learning in animal disease analysis and prediction. Machine learning is mainly divided into supervised learning and unsupervised learning. Supervised learning includes support vector machines, naive Bayes, decision trees, random forests, logistic regression, artificial neural networks, deep learning, and AdaBoost. Unsupervised learning includes the expectation-maximization algorithm, principal component analysis, hierarchical clustering, and MaxEnt. The discussion aims to give readers a clearer understanding of machine learning and of its application prospects in animal diseases.


2020 ◽  
pp. 1-11
Author(s):  
Tang Yan ◽  
Li Pengfei

In marketing, several problems have become increasingly prominent: growing volumes of customer data, greater difficulty in extracting and accessing data, unreliable and inaccurate data analysis, slow data processing, and the inability to turn massive amounts of data into valuable information. To study customer response, this paper constructs a marketing customer response scoring model based on machine learning data analysis. In the context of supplier customer relationship management, the article analyzes the current state and existing problems of the supplier's precision marketing and uses the supplier's own development and management characteristics to improve its marketing strategies. It combines database methods with statistical modeling and analysis to establish a customer response scoring model suited to supplier precision marketing. The article also evaluates the model on examples, and the results indicate that the constructed model performs well.

