An efficient distributed hierarchical-clustering algorithm for large scale data

Author(s): Cheng-Hsien Tang, An-Ching Huang, Meng-Feng Tsai, Wei-Jen Wang
2019, Vol 163, pp. 416-428
Author(s): Xingwang Zhao, Jiye Liang, Chuangyin Dang

2019, Vol 31 (2), pp. 329-338
Author(s): Jian Hu, Haiwan Zhu, Yimin Mao, Canlong Zhang, Tian Liang, ...

Landslide hazard prediction is difficult and time-consuming with traditional methods. This paper presents a machine-learning method that predicts landslide hazard levels automatically. Rainfall data are hard to obtain and to process effectively in landslide hazard prediction, and the M-chameleon algorithm is limited in handling large-scale data sets. To address both problems, a new method based on an uncertain DM-chameleon algorithm (a development of M-chameleon) is proposed for landslide susceptibility assessment. First, the method designs a new two-phase clustering algorithm based on M-chameleon that processes large-scale data sets effectively. Second, a new E-H distance formula combines the Euclidean and Hausdorff distances, which enables the method to handle uncertain data. An uncertain data model is also presented to quantify triggering factors. Finally, the landslide hazard prediction model is constructed and verified using data from the Baota district of the city of Yan’an, China. The experimental results show that the uncertain DM-chameleon algorithm improves the accuracy of landslide prediction and is highly feasible. Furthermore, the relationships between hazard factors and landslide hazard levels can be extracted from the clustering results.
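
The abstract does not state the E-H distance formula, so the following is only a minimal sketch of one way a Euclidean distance and a Hausdorff distance could be combined for uncertain data. It assumes each uncertain object is represented as a small set of sample points, and it uses a weighted sum with a hypothetical weight `alpha`. The names `eh_distance`, `centroid`, and `alpha` are illustrative assumptions, not the paper's definitions.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.dist(p, q)


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets."""
    def directed(X, Y):
        # Largest distance from a point in X to its nearest point in Y.
        return max(min(euclidean(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))


def centroid(A):
    """Coordinate-wise mean of a finite point set."""
    n = len(A)
    return tuple(sum(coords) / n for coords in zip(*A))


def eh_distance(A, B, alpha=0.5):
    """Hypothetical combined distance (an assumption, not the paper's formula):
    alpha * Euclidean distance between centroids
    + (1 - alpha) * Hausdorff distance between the sample sets."""
    return (alpha * euclidean(centroid(A), centroid(B))
            + (1 - alpha) * hausdorff(A, B))
```

In this sketch the centroid term captures how far apart two uncertain objects are on average, while the Hausdorff term penalizes objects whose sample sets spread differently. How the paper actually weights or combines the two distances may differ.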

