Density-based clustering
Recently Published Documents


TOTAL DOCUMENTS: 678 (FIVE YEARS: 231)
H-INDEX: 32 (FIVE YEARS: 9)

2022 ◽  
Vol 16 (1) ◽  
pp. 8
Author(s):  
Yunda Heningtyas ◽  
Fathur Rahmi ◽  
Kurnia Muludi

2021 ◽  
Vol 2021 ◽  
pp. 1-25
Author(s):  
Shuoben Bi ◽  
Ruizhuang Xu ◽  
Aili Liu ◽  
Luye Wang ◽  
Lei Wan

Because density-based clustering algorithms are sensitive to the input data, which leads to limited computing space and poor timeliness, a new method based on a grid information entropy clustering algorithm is proposed for mining taxi passenger hotspots. This paper selects representative geographical areas of Nanjing and Beijing as the research areas and uses information entropy and aggregation degree to analyze the distribution of passenger-carrying points. The algorithm uses a grid instead of the original trajectory data to compute and mine taxi passenger hotspots. A comparison and analysis of taxi loading point data in Nanjing and Beijing shows that the experimental results are consistent with the actual urban passenger hotspots, which verifies the effectiveness of the algorithm. It overcomes the computing-space and timeliness limitations of density-based clustering, reduces the amount of data to be processed, and offers greater flexibility for processing and analyzing massive data. The research results can provide an important scientific basis for urban traffic guidance and urban management.
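The grid idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the cell size, the `min_count` threshold, and the sample points are all assumptions made for illustration. Points are binned into grid cells, the Shannon entropy of the cell counts summarizes how evenly the points are spread, and edge-adjacent dense cells are merged into hotspots.

```python
import math
from collections import Counter, deque


def grid_counts(points, cell_size):
    """Map each (x, y) point to an integer grid cell and count points per cell."""
    return Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the point distribution over occupied cells."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def hotspots(counts, min_count):
    """Merge edge-adjacent cells holding at least min_count points into groups."""
    dense = {cell for cell, c in counts.items() if c >= min_count}
    seen, groups = set(), []
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:  # breadth-first search over neighboring dense cells
            cx, cy = queue.popleft()
            group.append((cx, cy))
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        groups.append(sorted(group))
    return groups


# Hypothetical pick-up points (illustrative values only, not data from the paper).
points = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.9), (1.2, 0.3),
          (1.4, 0.6), (1.8, 0.1), (5.5, 5.5)]
counts = grid_counts(points, cell_size=1.0)
entropy_bits = shannon_entropy(counts)
dense_groups = hotspots(counts, min_count=3)
```

Working on cell counts rather than raw points is what keeps the cost tied to the number of occupied cells instead of the number of trajectory samples.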


2021 ◽  
Vol 14 (1) ◽  
pp. 212
Author(s):  
Yirui Jiang ◽  
Runjin Yang ◽  
Chenxi Zang ◽  
Zhiyuan Wei ◽  
John Thompson ◽  
...  

Nowadays, the aviation industry pays more attention to emission reduction toward the net-zero carbon goals. However, the volume of global passengers and baggage is exponentially increasing, which leads to challenges for sustainable airports. A baggage-free airport terminal is considered a potential solution to this issue. Moving the baggage operation away from the passenger terminals will reduce the workload for airport operators and encourage passengers to use public transport to reach airport terminals. As a result, it will have a significant impact on energy and the environment, leading to a reduction of fuel consumption and mitigation of carbon emissions. This paper studies a baggage collection network design problem using vehicle routing strategies and augmented reality for baggage-free airport terminals. We use a spreadsheet solver tool, based on the integration of the modified Clarke and Wright savings heuristic and a density-based clustering algorithm, for optimizing the location of logistic hubs and planning the vehicle routes for baggage collection. This tool is applied to a case study at London City Airport to analyze the impacts of the strategies on carbon emissions quantitatively. The result indicates that the proposed baggage collection network can reduce carbon emissions by 290.10 tonnes annually.
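For readers unfamiliar with the savings heuristic named above, the sketch below computes the classic savings value s(i, j) = d(0, i) + d(0, j) - d(i, j), which measures how much distance is saved by serving customers i and j on one route instead of two separate round trips from the depot. This is the textbook quantity only, not the modified variant or the spreadsheet tool from the paper; the function name, the Euclidean distance, and the coordinates are assumptions.

```python
import math
from itertools import combinations


def savings_list(depot, customers):
    """Return (saving, i, j) tuples sorted from the largest saving to the smallest."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    out = []
    for i, j in combinations(range(len(customers)), 2):
        saving = (dist(depot, customers[i]) + dist(depot, customers[j])
                  - dist(customers[i], customers[j]))
        out.append((saving, i, j))
    return sorted(out, reverse=True)


# Hypothetical coordinates (illustrative only).
depot = (0.0, 0.0)
customers = [(3.0, 4.0), (6.0, 8.0), (-3.0, -4.0)]
ranked = savings_list(depot, customers)
```

A full heuristic would then walk this ranked list and merge routes while capacity constraints allow; customers 0 and 1 rank first here because they lie in the same direction from the depot.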


2021 ◽  
Vol 10 (12) ◽  
pp. 814
Author(s):  
Xiangqiang Min ◽  
Dieter Pfoser ◽  
Andreas Züfle ◽  
Yehua Sheng

The range query is one of the most important query types in spatial data processing. Geographic information systems use it to find spatial objects within a user-specified range, and it supports data mining tasks, such as density-based clustering. In many applications, ranges are not computed in unrestricted Euclidean space, but on a network. While the majority of access methods cannot trivially be extended to network space, existing network index structures partition the network space without considering the data distribution. This potentially results in inefficiency due to a very skewed node distribution. To improve range query processing on networks, this paper proposes a balanced Hierarchical Network index (HN-tree) to query spatial objects on networks. The main idea is to recursively partition the data on the network such that each partition has a similar number of spatial objects. Leveraging the HN-tree, we present an efficient range query algorithm, which is empirically evaluated using three different road networks and several baselines and state-of-the-art network indices. The experimental evaluation shows that the HN-tree substantially outperforms existing methods.
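To make the notion of a range query in network space concrete, here is a minimal baseline sketch. It is not the HN-tree: it simply runs Dijkstra's algorithm from the query node and stops expanding once the network distance exceeds the radius. The function name, the adjacency-list layout, and the simplification that objects sit exactly on nodes are assumptions for illustration.

```python
import heapq


def network_range_query(graph, objects_at, source, radius):
    """Return {object: network distance} for objects within `radius` of `source`.

    graph:      dict mapping node -> list of (neighbor, edge_length)
    objects_at: dict mapping node -> list of object ids located at that node
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    found = {}
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already processed
        for obj in objects_at.get(node, []):
            found[obj] = d
        for nb, length in graph.get(node, []):
            nd = d + length
            # Prune: never expand beyond the query radius.
            if nd <= radius and nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                heapq.heappush(heap, (nd, nb))
    return found


# Hypothetical road network and objects (illustrative only).
graph = {
    "a": [("b", 2.0), ("c", 5.0)],
    "b": [("a", 2.0), ("c", 1.0), ("d", 4.0)],
    "c": [("a", 5.0), ("b", 1.0)],
    "d": [("b", 4.0)],
}
objects_at = {"b": ["cafe"], "c": ["station"], "d": ["park"]}
result = network_range_query(graph, objects_at, "a", 3.0)
```

Note that "station" is reached via node b at distance 3.0 rather than over the direct edge of length 5.0, which is exactly where network distance differs from a straight-line range.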


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Shujun Hou

The advent of the information age has changed every existing career and revolutionized most if not all fields, notwithstanding many benefits that came along with it. There has been an exponential rise in information and, alongside it, an increase in data. Data centers have erupted with details as the number of rows in databases grows by the day. The use of technology has nevertheless become essential in many company models and organizations, warranting its usage in virtually every channel. College physical education and sports are not an exception, as the number of students studying such subjects is skyrocketing. As the information is getting more complex, improved methods are needed to research and analyze data. Fortunately, data mining has come to the rescue. Data mining is a collection of analytical methods and procedures used exclusively for the sake of data extraction. It may be used to analyze features and trends from vast quantities of data. The objective of this study is to explore the use of data mining technologies in the analysis of college students’ sports psychology. This study uses clustering methods for the examination of sports psychology. We utilize five clustering methods for this aim: the expectation-maximization (EM) algorithm, k-means, COBWEB, density-based spatial clustering of applications with noise (DBSCAN), and agglomerative hierarchical clustering. We perform our forecasts based on various metrics combined with the past outcomes of college sports using these methods. In contrast to conventional data research and analysis techniques, our approaches have relatively high prediction accuracy as far as college athletics is concerned.
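Since DBSCAN is the density-based method in this list, a compact sketch of the standard algorithm may help. It is a plain brute-force version written for readability, not the implementation used in the study; the parameter names `eps` and `min_pts` follow common usage, and the sample points are made up.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    def neighbors(i):
        # Brute-force eps-neighborhood; the point itself is included.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels


# Hypothetical 2-D points: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Unlike k-means, the number of clusters is not given in advance, and the isolated point is reported as noise instead of being forced into a cluster.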


2021 ◽  
Author(s):  
Marius T. Wenz ◽  
Miriam Bertazzon ◽  
Jana Sticht ◽  
Stevan Aleksić ◽  
Daniela Gjorgjevikj ◽  
...  

Protein-protein interactions often rely on specialized recognition domains, such as WW domains, which bind to specific proline-rich sequences. The specificity of these protein-protein interactions can be increased by tandem repeats, i.e. two WW domains connected by a linker. With a flexible linker, the WW domains can move freely with respect to each other. Additionally, the tandem WW domains can bind in two different orientations to their target sequences. This makes the elucidation of complex structures of tandem WW domains extremely challenging. Here, we identify and characterize two complex structures of the tandem WW domain of human formin-binding protein 21 and a peptide sequence from its natural binding partner, the core-splicing protein SmB/B′. The two structures differ in the ligand orientation, and consequently also in the relative orientation of the two WW domains. We analyze and probe the interactions in the complexes by molecular simulations and NMR experiments. The workflow to identify the complex structures uses molecular simulations, density-based clustering and peptide docking. It is designed to systematically generate possible complex structures for repeats of recognition domains. These structures will help us to understand the synergistic and multivalency effects that generate the astonishing versatility and specificity of protein-protein interactions.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Shaashwat Agrawal ◽  
Sagnik Sarkar ◽  
Mamoun Alazab ◽  
Praveen Kumar Reddy Maddikunta ◽  
Thippa Reddy Gadekallu ◽  
...  

Federated learning (FL) is a distributed model for deep learning that integrates client-server architecture, edge computing, and real-time intelligence. FL has the capability of revolutionizing machine learning (ML) but lacks practicality of implementation due to technological limitations, communication overhead, non-IID (independent and identically distributed) data, and privacy concerns. Training an ML model over heterogeneous non-IID data highly degrades the convergence rate and performance. The existing traditional and clustered FL algorithms exhibit two main limitations: inefficient client training and static hyperparameter utilization. To overcome these limitations, we propose a novel hybrid algorithm, namely, genetic clustered FL (Genetic CFL), that clusters edge devices based on the training hyperparameters and genetically modifies the parameters clusterwise. Then, we introduce an algorithm that drastically increases the individual cluster accuracy by integrating density-based clustering and genetic hyperparameter optimization. The results are benchmarked using the MNIST handwritten digit dataset and the CIFAR-10 dataset. The proposed Genetic CFL shows significant improvements and works well with realistic cases of non-IID and ambiguous data. An accuracy of 99.79% is observed on the MNIST dataset and 76.88% on the CIFAR-10 dataset with only 10 training rounds.


2021 ◽  
Vol 13 (22) ◽  
pp. 12527
Author(s):  
Maximilian Heumann ◽  
Tobias Kraschewski ◽  
Tim Brauner ◽  
Lukas Tilch ◽  
Michael H. Breitner

This study analyzes the temporally resolved location and trip data of shared e-scooters over nine months in Berlin from one of Europe’s most widespread operators. We apply time, distance, and energy consumption filters on approximately 1.25 million trips for outlier detection and trip categorization. Using temporally and spatially resolved trip pattern analyses, we investigate how the built environment and land use affect e-scooter trips. Further, we apply a density-based clustering algorithm to examine point of interest-specific patterns in trip generation. Our results suggest that e-scooter usage has point of interest related characteristics. Temporal peaks in e-scooter usage differ by point of interest category and indicate work-related trips at public transport stations. We prove these characteristic patterns with the statistical metric of cosine similarity. Considering average cluster velocities, we observe limited time-saving potential of e-scooter trips in congested areas near the city center.
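The cosine similarity metric mentioned in this abstract compares the shape of two usage profiles independently of their overall volume. The sketch below shows the standard formula, cos(u, v) = (u · v) / (‖u‖ ‖v‖); the function name and the example profiles are invented for illustration and are not data from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical trip counts in four time slots (illustrative only).
station_a = [1, 8, 2, 7]      # two peaks, e.g. morning and evening
station_b = [2, 16, 4, 14]    # same shape at twice the volume
park = [0, 0, 5, 0]           # a single midday peak
night_spot = [3, 0, 0, 0]     # activity in a different slot

same_shape = cosine_similarity(station_a, station_b)
no_overlap = cosine_similarity(park, night_spot)
```

Because the measure ignores magnitude, two locations with very different trip volumes can still be recognized as having the same temporal pattern.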


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Luo Xuegang ◽  
Lv Junrui ◽  
Wang Juan

Data from various physicochemical sensors in the Internet of Things still frequently contain missing values owing to unreliable links and accidental damage. This phenomenon limits the predictive ability and performance of data analyses supported by IoT-based platforms. Therefore, it is necessary to develop a way to reconstruct these lost data with high accuracy. A new data reconstruction method based on spectral k-support norm minimization (DR-SKSNM) is proposed for NB-IoT data, and a relative density-based clustering algorithm is embedded into model processing to improve the accuracy of reconstruction. First, sensors are grouped by similar patterns of measurement. A relative density-based clustering, which can effectively identify clusters in data sets with different densities, is applied to separate sensors into different groups. Second, based on the correlations of sensor data and its joint low rank, an algorithm based on matrix spectral k-support norm minimization with automatic weight is developed. Moreover, the alternating direction method of multipliers (ADMM) is used to obtain its optimal solution. Finally, the proposed method is evaluated by using two simulated and real sensor data sources from Panzhihua environmental monitoring station with random missing patterns and consecutive missing patterns. The simulation results show that our algorithm performs well and that it can propagate through low-rank characteristics to estimate the values of a large missing region.

