density based clustering Latest Research Papers

Mining Taxi Pick-Up Hotspots Based on Grid Information Entropy Clustering Algorithm

Density-based clustering algorithms are sensitive to the input data, which constrains computing space and leads to poor timeliness. To address this, a new method based on a grid information entropy clustering algorithm is proposed for mining taxi passenger hotspots. This paper selects representative geographical areas of Nanjing and Beijing as the research areas and uses information entropy and aggregation degree to analyze the distribution of passenger pick-up points. The algorithm operates on a grid instead of the original trajectory data to compute and mine taxi passenger hotspots. A comparison and analysis of taxi pick-up point data in Nanjing and Beijing shows that the experimental results are consistent with the actual urban passenger hotspots, which verifies the effectiveness of the algorithm. It overcomes the computing-space and timeliness limitations of density-based clustering algorithms, reduces the amount of data to be processed, and offers greater flexibility for processing and analyzing massive data. The research results can provide an important scientific basis for urban traffic guidance and urban management.
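
The grid idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the cell size, the `min_points` threshold, the sample coordinates, and the function names `grid_counts`, `shannon_entropy`, and `hotspot_cells` are all assumptions made for illustration. Points are binned into square cells, the Shannon entropy of the per-cell shares summarizes how evenly pick-ups are spread, and dense cells are reported as hotspot candidates.

```python
import math
from collections import Counter

def grid_counts(points, cell_size):
    """Count points per square grid cell with side length cell_size."""
    counts = Counter()
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts

def shannon_entropy(counts):
    """Shannon entropy (in bits) of the share of points in each occupied cell."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def hotspot_cells(counts, min_points):
    """Cells holding at least min_points points, densest first."""
    dense = [(cell, c) for cell, c in counts.items() if c >= min_points]
    return sorted(dense, key=lambda item: item[1], reverse=True)

# Hypothetical pick-up coordinates in metres (not data from the paper).
pickups = [(10, 10), (20, 30), (40, 45), (60, 70), (510, 520), (950, 120)]
counts = grid_counts(pickups, cell_size=100)
entropy = shannon_entropy(counts)
hotspots = hotspot_cells(counts, min_points=3)
```

Working on cell counts rather than raw points is what keeps the data volume small: later steps scale with the number of occupied cells, not with the number of trajectory records.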

Toward Baggage-Free Airport Terminals: A Case Study of London City Airport

Sustainability, Vol 14 (1), pp. 212 (2021). DOI: 10.3390/su14010212

Author(s): Yirui Jiang, Runjin Yang, Chenxi Zang, Zhiyuan Wei, John Thompson, ...

Keyword(s): Carbon Emission, Clustering Algorithm, Network Design Problem, Aviation Industry, Net Zero, Density Based Clustering, Airport Terminal, Zero Carbon, Airport Terminals

The aviation industry is paying increasing attention to emission reduction in pursuit of net-zero carbon goals. However, the volume of global passengers and baggage is exponentially increasing, which poses challenges for sustainable airports. A baggage-free airport terminal is considered a potential solution to this issue. Moving baggage operations away from the passenger terminals will reduce the workload for airport operators and encourage passengers to use public transport to reach airport terminals. As a result, it will have a significant impact on energy and the environment, reducing fuel consumption and mitigating carbon emissions. This paper studies a baggage collection network design problem using vehicle routing strategies and augmented reality for baggage-free airport terminals. We use a spreadsheet solver tool, based on the integration of the modified Clarke and Wright savings heuristic and a density-based clustering algorithm, to optimize the location of logistic hubs and plan the vehicle routes for baggage collection. This tool is applied to a case study at London City Airport to analyze the impact of the strategies on carbon emissions quantitatively. The result indicates that the proposed baggage collection network can reduce carbon emissions by 290.10 tonnes annually.
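
As a rough illustration of the savings idea, the sketch below computes the classic Clarke and Wright saving s(i, j) = d(0, i) + d(0, j) - d(i, j) for every pair of stops and ranks the pairs. It covers only this first step of the classic heuristic, not the modified version or the spreadsheet tool used in the paper. The Euclidean distance, the coordinates, and the function names are assumptions for illustration.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def savings_list(depot, stops):
    """Return (saving, i, j) tuples sorted from largest to smallest saving.

    The saving is the distance avoided by serving stops i and j on one
    route instead of two separate round trips from the depot.
    """
    savings = []
    for i, j in combinations(range(len(stops)), 2):
        s = (euclidean(depot, stops[i]) + euclidean(depot, stops[j])
             - euclidean(stops[i], stops[j]))
        savings.append((s, i, j))
    return sorted(savings, reverse=True)

# Hypothetical depot and collection points (not data from the paper).
depot = (0.0, 0.0)
stops = [(3.0, 4.0), (6.0, 8.0), (-3.0, 4.0)]
ranked = savings_list(depot, stops)
```

In the classic heuristic, routes are then merged in this ranked order whenever the merge respects vehicle capacity. A clustering step, such as the density-based one mentioned above, could restrict which stops are considered together.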

A Hierarchical Spatial Network Index for Arbitrarily Distributed Spatial Objects

ISPRS International Journal of Geo-Information, Vol 10 (12), pp. 814 (2021). DOI: 10.3390/ijgi10120814

Author(s): Xiangqiang Min, Dieter Pfoser, Andreas Züfle, Yehua Sheng

Keyword(s): Spatial Data, Main Idea, Range Query, Access Methods, Spatial Objects, Density Based Clustering, Node Distribution, Query Algorithm, Information Systems Use, Network Space

The range query is one of the most important query types in spatial data processing. Geographic information systems use it to find spatial objects within a user-specified range, and it supports data mining tasks, such as density-based clustering. In many applications, ranges are not computed in unrestricted Euclidean space, but on a network. While the majority of access methods cannot trivially be extended to network space, existing network index structures partition the network space without considering the data distribution. This potentially results in inefficiency due to a very skewed node distribution. To improve range query processing on networks, this paper proposes a balanced Hierarchical Network index (HN-tree) to query spatial objects on networks. The main idea is to recursively partition the data on the network such that each partition has a similar number of spatial objects. Leveraging the HN-tree, we present an efficient range query algorithm, which is empirically evaluated using three different road networks and several baselines and state-of-the-art network indices. The experimental evaluation shows that the HN-tree substantially outperforms existing methods.

Research on the Application of Data Mining Technology in the Analysis of College Students’ Sports Psychology

Mobile Information Systems, Vol 2021, pp. 1-7 (2021). DOI: 10.1155/2021/6529174

Author(s): Shujun Hou

Keyword(s): College Students, Data Mining, College Athletics, Data Extraction, Clustering Algorithms, Sports Psychology, Clustering Methods, Use Of Data, Use Of Technology, Density Based Clustering

The advent of the information age has changed every existing career and revolutionized most, if not all, fields, bringing many benefits along with it. There has been an exponential rise in information and, alongside it, an increase in data. Data centers are overflowing with records as the number of rows in databases grows by the day. The use of technology has become essential in many company models and organizations, warranting its usage in virtually every channel. College physical education and sports are no exception, as the number of students studying such subjects is growing rapidly. As the information becomes more complex, improved methods are needed to research and analyze data. Data mining addresses this need. It is a collection of analytical methods and procedures used to extract information from data, and it may be used to analyze features and trends in vast quantities of data. The objective of this study is to explore the use of data mining technologies in the analysis of college students’ sports psychology. This study uses clustering methods for the examination of sports psychology. We utilize five clustering methods for this aim: the expectation-maximization (EM) algorithm, k-means, COBWEB, density-based spatial clustering of applications with noise (DBSCAN), and agglomerative hierarchical clustering. We perform our forecasts based on various metrics combined with the past outcomes of college sports using these methods. In contrast to conventional data research and analysis techniques, our approaches have relatively high prediction accuracy as far as college athletics is concerned.
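
For readers unfamiliar with DBSCAN, one of the methods listed above, here is a minimal pure-Python sketch of the standard algorithm. It is a brute-force teaching version, not the implementation used in the study. The sample points and the `eps` and `min_pts` values are assumptions chosen for illustration.

```python
import math

NOISE = -1

def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx], itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels

# Hypothetical 2D sample: two dense groups and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(sample, eps=1.5, min_pts=3)
```

Unlike k-means, this approach does not need the number of clusters in advance and leaves isolated points unassigned, which is why the isolated point in the sample receives the noise label.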

Target recognition in tandem WW domains: complex structures for parallel and antiparallel ligand orientation in h-FBP21 tandem WW

DOI: 10.1101/2021.11.22.469489 (2021)

Author(s): Marius T. Wenz, Miriam Bertazzon, Jana Sticht, Stevan Aleksić, Daniela Gjorgjevikj, ...

Keyword(s): Protein Interactions, Tandem Repeats, Molecular Simulations, Peptide Sequence, Complex Structures, Protein Protein Interactions, Ww Domains, Binding Partner, Density Based Clustering, Ligand Orientation

Protein-protein interactions often rely on specialized recognition domains, such as WW domains, which bind to specific proline-rich sequences. The specificity of these protein-protein interactions can be increased by tandem repeats, i.e. two WW domains connected by a linker. With a flexible linker, the WW domains can move freely with respect to each other. Additionally, the tandem WW domains can bind in two different orientations to their target sequences. This makes the elucidation of complex structures of tandem WW domains extremely challenging. Here, we identify and characterize two complex structures of the tandem WW domain of human formin-binding protein 21 and a peptide sequence from its natural binding partner, the core-splicing protein SmB/B′. The two structures differ in the ligand orientation, and consequently also in the relative orientation of the two WW domains. We analyze and probe the interactions in the complexes by molecular simulations and NMR experiments. The workflow to identify the complex structures uses molecular simulations, density-based clustering and peptide docking. It is designed to systematically generate possible complex structures for repeats of recognition domains. These structures will help us to understand the synergistic and multivalency effects that generate the astonishing versatility and specificity of protein-protein interactions.

Genetic CFL: Hyperparameter Optimization in Clustered Federated Learning

Computational Intelligence and Neuroscience, Vol 2021, pp. 1-10 (2021). DOI: 10.1155/2021/7156420

Author(s): Shaashwat Agrawal, Sagnik Sarkar, Mamoun Alazab, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, ...

Keyword(s): Distributed Model, Communication Overhead, Distributed Data, Hyperparameter Optimization, Density Based Clustering, Cluster Accuracy, And Performance, The Individual, Ambiguous Data, Technological Limitations

Federated learning (FL) is a distributed model for deep learning that integrates client-server architecture, edge computing, and real-time intelligence. FL has the capability of revolutionizing machine learning (ML) but is difficult to implement in practice due to technological limitations, communication overhead, non-IID (not independent and identically distributed) data, and privacy concerns. Training an ML model over heterogeneous non-IID data severely degrades the convergence rate and performance. The existing traditional and clustered FL algorithms exhibit two main limitations: inefficient client training and static hyperparameter utilization. To overcome these limitations, we propose a novel hybrid algorithm, namely, genetic clustered FL (Genetic CFL), that clusters edge devices based on the training hyperparameters and genetically modifies the parameters clusterwise. Then, we introduce an algorithm that drastically increases the individual cluster accuracy by integrating density-based clustering and genetic hyperparameter optimization. The results are benchmarked using the MNIST handwritten digit dataset and the CIFAR-10 dataset. The proposed Genetic CFL shows significant improvements and works well with realistic cases of non-IID and ambiguous data. An accuracy of 99.79% is observed on the MNIST dataset and 76.88% on the CIFAR-10 dataset with only 10 training rounds.

A Spatiotemporal Study and Location-Specific Trip Pattern Categorization of Shared E-Scooter Usage

Sustainability, Vol 13 (22), pp. 12527 (2021). DOI: 10.3390/su132212527

Author(s): Maximilian Heumann, Tobias Kraschewski, Tim Brauner, Lukas Tilch, Michael H. Breitner

Keyword(s): Clustering Algorithm, City Center, Time Saving, Work Related, Spatially Resolved, Point Of Interest, Density Based Clustering, Time Distance, The City, Average Cluster

This study analyzes the temporally resolved location and trip data of shared e-scooters over nine months in Berlin from one of Europe’s most widespread operators. We apply time, distance, and energy consumption filters to approximately 1.25 million trips for outlier detection and trip categorization. Using temporally and spatially resolved trip pattern analyses, we investigate how the built environment and land use affect e-scooter trips. Further, we apply a density-based clustering algorithm to examine point-of-interest-specific patterns in trip generation. Our results suggest that e-scooter usage has point-of-interest-related characteristics. Temporal peaks in e-scooter usage differ by point-of-interest category and indicate work-related trips at public transport stations. We substantiate these characteristic patterns with the statistical metric of cosine similarity. Considering average cluster velocities, we observe limited time-saving potential of e-scooter trips in congested areas near the city center.
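
The cosine similarity metric mentioned above can be sketched in a few lines. The usage profiles below are hypothetical trip counts per time slot, not data from the study, and the variable names are assumptions for illustration. The function itself is the standard definition, the dot product divided by the product of the vector norms.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical trip counts in four time slots for three locations.
station_a = [10, 40, 20, 45]
station_b = [20, 80, 40, 90]   # same temporal shape as station_a, twice the volume
park = [45, 20, 40, 10]        # peaks at different times

sim_stations = cosine_similarity(station_a, station_b)
sim_station_park = cosine_similarity(station_a, park)
```

Because the metric ignores overall magnitude, two locations with the same temporal shape but different trip volumes score close to 1, which makes it suitable for comparing usage patterns across busy and quiet locations.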

Rail Steel Health Analysis Based on a Novel Genetic Density-based Clustering Technique and Manifold Representation of Acoustic Emission Signals

Applied Artificial Intelligence, pp. 1-25 (2021). DOI: 10.1080/08839514.2021.2004346

Author(s): Kangwei Wang, Xin Zhang, Shuzhi Song, Yan Wang, Yi Shen, ...

Keyword(s): Acoustic Emission, Rail Steel, Clustering Technique, Density Based Clustering

Missing Data Reconstruction Based on Spectral k-Support Norm Minimization for NB-IoT Data

Mathematical Problems in Engineering, Vol 2021, pp. 1-11 (2021). DOI: 10.1155/2021/1336900

Author(s): Luo Xuegang, Lv Junrui, Wang Juan

Keyword(s): Relative Density, Missing Values, Clustering Algorithm, Optimal Solution, Sensor Data, Low Rank, Data Reconstruction, Reconstruction Method, Norm Minimization, Density Based Clustering

A non-negligible fraction of the data from physicochemical sensors in the Internet of Things still contains missing values owing to unreliable links and accidental damage. This limits the predictive ability and performance of data analyses on IoT-based platforms. Therefore, it is necessary to find a way to reconstruct the lost data with high accuracy. A new data reconstruction method based on spectral k-support norm minimization (DR-SKSNM) is proposed for NB-IoT data, and a relative density-based clustering algorithm is embedded into model processing to improve the accuracy of reconstruction. First, sensors are grouped by similar patterns of measurement. A relative density-based clustering, which can effectively identify clusters in data sets with different densities, is applied to separate sensors into different groups. Second, based on the correlations of sensor data and its joint low rank, an algorithm based on matrix spectral k-support norm minimization with automatic weighting is developed. Moreover, the alternating direction method of multipliers (ADMM) is used to obtain its optimal solution. Finally, the proposed method is evaluated using two simulated and real sensor data sources from the Panzhihua environmental monitoring station with random missing patterns and consecutive missing patterns. The simulation results show that our algorithm performs well and can exploit low-rank characteristics to estimate the values of a large missing region.

density based clustering
Recently Published Documents

IMPLEMENTASI DENSITY-BASED CLUSTERING PADA SEGMENTASI CITRA Betta Fish (Implementation of Density-Based Clustering for Betta Fish Image Segmentation)

Mining Taxi Pick-Up Hotspots Based on Grid Information Entropy Clustering Algorithm

Toward Baggage-Free Airport Terminals: A Case Study of London City Airport

A Hierarchical Spatial Network Index for Arbitrarily Distributed Spatial Objects

Research on the Application of Data Mining Technology in the Analysis of College Students’ Sports Psychology

Target recognition in tandem WW domains: complex structures for parallel and antiparallel ligand orientation in h-FBP21 tandem WW

Genetic CFL: Hyperparameter Optimization in Clustered Federated Learning

A Spatiotemporal Study and Location-Specific Trip Pattern Categorization of Shared E-Scooter Usage

Rail Steel Health Analysis Based on a Novel Genetic Density-based Clustering Technique and Manifold Representation of Acoustic Emission Signals

Missing Data Reconstruction Based on Spectral k-Support Norm Minimization for NB-IoT Data
