High-dimensional indexing algorithm based on the hyperplane tree-structure

Author(s):  
Lian Liu ◽  
Fenghong Xiang ◽  
Jianlin Mao ◽  
Maoxing Zhang
2020 ◽  
Author(s):  
Xiao Lai ◽  
Pu Tian

Abstract: Supervised machine learning, especially deep learning based on a wide variety of neural network architectures, has contributed tremendously to fields such as marketing, computer vision and natural language processing. However, the development of unsupervised machine learning algorithms has been a bottleneck of artificial intelligence. Clustering is a fundamental unsupervised task in many different subjects. Unfortunately, no present algorithm is satisfactory for clustering high-dimensional data with strong nonlinear correlations. In this work, we propose a simple and highly efficient hierarchical clustering algorithm based on encoding by composition rank vectors and a tree structure, and demonstrate its utility by clustering protein structural domains. No record comparison, which is an expensive and essential step common to all present clustering algorithms, is involved. Consequently, the algorithm achieves hierarchical clustering with linear time and space complexity and is thus applicable to arbitrarily large datasets. The key factor in this algorithm is the definition of composition, which depends on the physical nature of the target data and therefore needs to be constructed case by case. Nonetheless, the algorithm is general and applicable to any high-dimensional data with strong nonlinear correlations. We hope this algorithm will inspire a rich research field of encoding-based clustering well beyond composition rank vector trees.
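The encoding idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' definition of composition, so the code below assumes a toy one: the frequency of each symbol in a record. The function names `composition_rank_vector` and `cluster_by_prefix` are hypothetical and not taken from the paper. Each record is encoded once as a ranking of symbols, and records sharing a rank-vector prefix fall under the same tree node, so no record is ever compared with another.

```python
from collections import Counter, defaultdict


def composition_rank_vector(record, alphabet):
    """Order the alphabet by descending symbol frequency in `record`.

    Assumption: "composition" is plain symbol frequency; ties are broken
    by the symbol's position in `alphabet`.
    """
    counts = Counter(record)
    return tuple(sorted(alphabet, key=lambda s: (-counts[s], alphabet.index(s))))


def cluster_by_prefix(records, alphabet, depth):
    """Group record names by the first `depth` entries of their rank vector.

    Each prefix corresponds to one node at level `depth` of the tree, so a
    larger `depth` gives a finer level of the hierarchy. Every record is
    visited exactly once.
    """
    clusters = defaultdict(list)
    for name, record in records.items():
        prefix = composition_rank_vector(record, alphabet)[:depth]
        clusters[prefix].append(name)
    return dict(clusters)
```

Calling `cluster_by_prefix` with increasing `depth` values walks down the hierarchy: clusters at depth 2 are subdivisions of clusters at depth 1. For a fixed alphabet size, the cost grows linearly with the number of records, which matches the complexity claim in the abstract under this simplified composition.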


Author(s):  
Karthik Ganesan Pillai ◽  
Liessman Sturlaugson ◽  
Juan M. Banda ◽  
Rafal A. Angryk

2011 ◽  
pp. 259-268
Author(s):  
M. V. Ramakrishna ◽  
S. Nepal ◽  
S. Sumanasekara ◽  
S. M.M. Tahaghoghi

Content Based Image Retrieval (CBIR) systems that are able to “retrieve images of Clinton with Lewinsky” are unrealistic at present. However, this area has seen much research and development activity since IBM’s QBIC announcement in 1994. The CHITRA CBIR system, under development at RMIT and Monash Universities, addresses the need for a testbed system. Users can dynamically incorporate new features and similarity measures into the system, enabling it to act as a testbed for CBIR research. The system uses a 4-level data model we have developed and supports the definition and querying of high-level concepts such as MOUNTAIN and SUNSET. These advanced capabilities are supported by a powerful graphical query mechanism and a high-dimensional indexing structure based on linear mapping. In this paper we describe the design of the system and our contributions to the state of the art, and provide some implementation details.

