Data Analysis of Vessel Traffic Flow Using Clustering Algorithms

Much of the data collected in sports science comes from time-dependent phenomena. This article focuses on Functional Data Analysis (FDA), which studies longitudinal data by modelling them as continuous functions. After a brief review of several FDA methods, some useful practical tools, such as Functional Principal Component Analysis (FPCA) and functional clustering algorithms, are presented and compared on simulated data. Finally, the problem of detecting promising young swimmers is addressed through a curve clustering procedure on a real data set of performance progression curves. This study reveals that the fastest improvement of young swimmers generally appears before the age of 16. Moreover, several patterns of improvement are identified, and the functional clustering procedure provides a useful detection tool.
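As a rough illustration of the FPCA idea mentioned in the abstract, the sketch below treats each curve as a vector of values sampled on one shared grid and extracts the leading principal component by power iteration. It is a minimal sketch under stated assumptions, not the procedure used in the article: the function name `first_fpc`, the synthetic straight-line curves and the split of scores at zero are all hypothetical choices made for the example.

```python
import math
import random


def first_fpc(curves, n_iter=200, seed=0):
    """Return (mean_curve, component, scores) for the leading principal component.

    `curves` is a list of equal-length lists: every curve is sampled on the same grid.
    """
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    centered = [[c[j] - mean[j] for j in range(m)] for c in curves]

    # Random start vector, so the start is not orthogonal to the component by construction.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]

    # Power iteration on the sample covariance, applied implicitly as X^T (X v) / n.
    for _ in range(n_iter):
        proj = [sum(x[j] * v[j] for j in range(m)) for x in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) / n for j in range(m)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:
            break
        v = [a / norm for a in w]

    # Score of each curve = projection of the centered curve on the component.
    scores = [sum(x[j] * v[j] for j in range(m)) for x in centered]
    return mean, v, scores


# Synthetic straight-line curves on a shared grid (illustrative only, not real data).
grid = [j / 9 for j in range(10)]
slopes = [0.9, 1.0, 1.1, 2.9, 3.0, 3.1]
curves = [[a * t for t in grid] for a in slopes]

mean_curve, component, scores = first_fpc(curves)
# The sign of a principal component is arbitrary, so scores are only compared to each other.
groups = [s > 0 for s in scores]
```

In practice the scores on the first few components would be passed to a clustering algorithm; splitting a single score at zero is only the crudest two-group version of that step.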


Influence of Truck Traffic on Acoustic Pollution in Kaunas Districts Crossed by Highways

Journal of Environmental Engineering and Landscape Management, 2009, Vol 17 (4), pp. 198-204. DOI: 10.3846/1648-6897.2009.17.198-204. Cited By ~ 7

Author(s): Regina Gražulevičienė, Inga Bendokienė

Keyword(s): Data Analysis, Traffic Flow, Noise Level, Traffic Noise, Heavy Vehicles, Truck Traffic, Statistical Software, Study Results, Noise Measurements, Main Streets

The aim of the study was to assess the influence of truck traffic on acoustic pollution in two Kaunas districts crossed by highways, Eiguliai and Šilainiai. Traffic flow composition and noise levels were measured near the main streets and national highways that cross the districts. GIS and the statistical software SPSS 12.0.1 were used for the data analysis. The study results showed that the mean noise level near the main streets in Eiguliai was 70 dB(A) in the daytime, 68.6 dB(A) in the evening and 61.1 dB(A) at night, and in Šilainiai it was 67 dB(A), 65 dB(A) and 58 dB(A), respectively. On the highways crossing the districts, the share of heavy vehicles in total traffic flow was about 3 times higher during the day and about 2 times higher in the evening than on the other main streets. The noise level depended on the traffic flow, with the correlation coefficient ranging from 0.77 to 0.85. Modelling of the traffic flow showed that a 2 percent increase in the proportion of trucks would increase traffic noise by 1.1 dB(A) on streets with a traffic flow of 300 veh./hour or more, and by 1.8 dB(A) on streets with a traffic flow of 200 veh./hour or less. Our findings suggest that the influence of heavy vehicles on acoustic pollution is higher in districts with lower traffic flow.

Summary (translated from Lithuanian): The aim of the study was to determine the influence of truck traffic on acoustic pollution in the Kaunas districts crossed by highways of national significance: Islandijos plentas and the western bypass. Environmental noise level and traffic flow intensity in the Eiguliai and Šilainiai districts were measured at 34 points during the day, in the evening and at night. Geographic information system (GIS) technologies and the SPSS 12.0.1 and Statistica 15 statistical analysis packages were used to process the data. Results: the mean equivalent noise level near the main streets in Eiguliai reached 70 dBA in the daytime, 68.6 dBA in the evening and 61.1 dBA at night, and did not differ substantially from that in Šilainiai (67 dBA, 65 dBA and 58 dBA, respectively). On the highways crossing Eiguliai and Šilainiai, mean traffic flow intensity was 5 times higher during the day and evening, and 6 times higher at night, than the mean flow intensity on the main streets at the same time; the share of trucks exceeded the main-street average 3-fold during the day and 2-fold in the evening. A relationship between traffic flow intensity and noise level was established: in Eiguliai the correlation coefficient was 0.85 for the day and 0.83 for the evening and night, and in Šilainiai it was 0.78, 0.77 and 0.80, respectively. According to modelling of traffic flow composition, a 2% increase in the proportion of trucks would raise the noise level by 1.1 dBA on streets with a traffic flow above 300 veh./hour and by 1.8 dBA where the traffic flow is below 200 veh./hour (correlation coefficient 0.63). The influence of truck traffic on acoustic pollution is greater in districts with low traffic flows.

Summary (translated from Russian): The aim of this work was to study the influence of truck traffic on acoustic pollution in the districts of Kaunas crossed by highways of national significance, namely the Islandijos highway and the western (Vakarinis) bypass. Traffic flow composition was determined and noise level was measured near the main streets of the districts. The results showed that the mean noise level was 70 dBA in the daytime, 68.6 dBA in the evening and 61.1 dBA at night. On the highways of national significance crossing the districts, truck flows were 3 times higher during the day and 2 times higher in the evening than on other streets. A relationship between traffic flow volume and noise was established (r = 0.77 to 0.85). Modelling of traffic flow composition showed that a 2% increase in truck traffic raises noise by 1.1 dBA on streets with 300 veh./hour or more, and by 1.8 dBA where truck traffic amounts to 200 veh./hour or less. The influence of truck traffic on acoustic pollution is greater in districts with low traffic flow.
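The correlation coefficients quoted in this abstract relate traffic flow to noise level. As a reminder of how such a coefficient is obtained, here is a minimal sketch of the Pearson correlation in plain Python. The function name `pearson_r` and the six paired readings are made up for illustration; they are not measurements from the study, and the abstract does not state which correlation statistic was used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative readings, NOT data from the study.
flow_veh_per_hour = [150, 220, 310, 400, 520, 640]
noise_dba = [58.0, 61.5, 63.0, 66.5, 67.0, 70.5]

r = pearson_r(flow_veh_per_hour, noise_dba)
```

A value of `r` close to 1 indicates that noise rises almost linearly with flow in the sample; it says nothing by itself about the size of the increase per vehicle.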


Clustering Algorithms in Gene Expression: Data Analysis

2021. DOI: 10.1109/icrito51393.2021.9596549

Author(s): Karuna Ghai, Jaspreet Singh

Keyword(s): Gene Expression, Data Analysis, Gene Expression Data, Clustering Algorithms, Expression Data, Gene Expression Data Analysis


Application and visualization of typical clustering algorithms in seismic data analysis

Procedia Computer Science, 2019, Vol 151, pp. 171-178. DOI: 10.1016/j.procs.2019.04.026

Author(s): Z. Fan, X. Xu

Keyword(s): Data Analysis, Seismic Data, Clustering Algorithms


Software implementation of the main cluster analysis tools

Revista Amazonia Investiga, 2021, Vol 10 (47), pp. 81-92. DOI: 10.34069/ai/2021.47.11.9

Author(s): Andrey V. Silin, Olga N. Grinyuk, Tatyana A. Lartseva, Olga V. Aleksashina, Tatiana S. Sukhova

Keyword(s): Cluster Analysis, Data Analysis, Nearest Neighbor, Clustering Algorithms, Point Of View, Software Implementation, Practical Significance, Data Set, Main Cluster, Analysis Tools

This article discusses an approach to building a suite of programs that implement cluster analysis methods. Several cluster analysis tools for processing an initial data set are analyzed, together with their software implementation and the complexity of applying cluster analysis to data. Data are treated as the factual material that supplies information for the problem under study and forms the basis for discussion, analysis and decision-making. Cluster analysis is a procedure that combines objects or variables into groups based on a given rule. The work groups multivariate data using proximity measures such as the sample correlation coefficient and its absolute value, the cosine of the angle between vectors, and Euclidean distance. The authors propose methods for grouping by centers, by nearest neighbor and by selected standards. The results can be used by analysts when designing a data analysis structure and will improve the efficiency of clustering algorithms. The practical significance of the developed algorithms lies in the software package built in C++ in the VS environment.
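The four proximity measures named in the abstract, and the idea of grouping by centers, can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the authors' C++ package: the function names and the rule used in `assign_to_centers` (each object goes to the closest given center) are hypothetical choices for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors (smaller = more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine(u, v):
    """Cosine of the angle between two vectors (larger = more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def correlation(u, v):
    """Sample correlation coefficient: the cosine of the mean-centered vectors."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


def abs_correlation(u, v):
    """Absolute value of the correlation, treating opposite trends as similar."""
    return abs(correlation(u, v))


def assign_to_centers(points, centers, distance=euclidean):
    """Grouping by centers: index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda k: distance(p, centers[k]))
        for p in points
    ]


labels = assign_to_centers([[0, 0], [10, 10], [1, 1]], [[0, 0], [9, 9]])
```

Note that correlation and cosine are similarity measures while Euclidean distance is a dissimilarity, so a grouping rule built on the former would pick the maximum rather than the minimum.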


Review and compare clustering algorithms for navigation data analysis tasks

Proceedings of the 2016 Conference on Information Technologies in Science, Management, Social Sphere and Medicine, 2016. DOI: 10.2991/itsmssm-16.2016.58

Author(s): Anna Ponomareva, Roman Meyta

Keyword(s): Data Analysis, Clustering Algorithms, Navigation Data


A Comparison of K-Means and Mean Shift Algorithms

2021. DOI: 10.20944/preprints202108.0140.v1

Author(s): Mehak Nigar Shumaila

Keyword(s): Cluster Analysis, Data Analysis, Time Complexity, Clustering Algorithm, Clustering Algorithms, Mean Shift, Prediction Performance, Learning Problem, Cluster A, Formation Of Groups

Clustering, also known as cluster analysis, is a learning problem that takes place without human supervision. The technique is widely and effectively used in data analysis to observe and identify interesting, useful, or desired patterns in the data. It works by dividing the data into groups of similar objects based on the characteristics it identifies; each group formed is called a cluster. A cluster consists of objects that are similar to the other objects in the same cluster and dissimilar to the objects in other clusters. Clustering is significant in many aspects of data analysis because it determines and presents the intrinsic grouping of objects in a batch of unlabeled raw data, based on their attributes. No textbook criterion for a good clustering exists, because the process differs and can be customized for each user and each need. There is no outright best clustering algorithm; the choice depends heavily on the user's scenario and needs. This paper compares two clustering algorithms, k-means and mean shift, according to the following factors: time complexity, training, prediction performance and accuracy.
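To make the contrast between the two algorithms concrete, here is a minimal plain-Python sketch of both. It is an illustration under stated assumptions rather than the paper's implementation: the names `k_means` and `mean_shift`, the flat kernel, the `bandwidth` value and the six sample points are all hypothetical. The key difference it shows is that k-means needs the number of clusters up front (through the initial centers), while mean shift infers it from a bandwidth.

```python
import math


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def k_means(points, centers, n_iter=100):
    """Lloyd's algorithm; the number of clusters is fixed by `centers`."""
    centers = [tuple(c) for c in centers]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(mean(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def mean_shift(points, bandwidth, n_iter=100, tol=1e-6):
    """Flat-kernel mean shift; the number of clusters is found from the data."""
    modes = []
    for p in points:
        x = tuple(p)
        for _ in range(n_iter):
            window = [q for q in points if dist(x, q) <= bandwidth]
            if not window:
                break
            shifted = mean(window)
            if dist(shifted, x) < tol:
                break
            x = shifted
        modes.append(x)
    # Merge modes that lie within one bandwidth of an existing cluster center.
    centers, labels = [], []
    for x in modes:
        for k, c in enumerate(centers):
            if dist(x, c) < bandwidth:
                labels.append(k)
                break
        else:
            centers.append(x)
            labels.append(len(centers) - 1)
    return centers, labels


# Two well-separated synthetic blobs (illustrative only).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
km_centers, km_labels = k_means(points, [(0.0, 0.0), (6.0, 6.0)])
ms_centers, ms_labels = mean_shift(points, bandwidth=1.0)
```

In this naive form each k-means iteration costs on the order of n·k distance evaluations, whereas each mean shift iteration costs on the order of n² because every point scans all others, which is one reason time complexity is a natural axis of comparison.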


Subpopulation identification for single-cell RNA-sequencing data using functional data analysis

2019. DOI: 10.1101/760413

Author(s): Kyungmin Ahn, Hironobu Fujiwara

Keyword(s): Gene Expression, Data Analysis, Single Cell, Gene Expression Data, Functional Data Analysis, Functional Data, Clustering Algorithms, Expression Data, Clustering Methods, Single Cell Rna Sequencing

Background: In single-cell RNA-sequencing (scRNA-seq) data analysis, a number of statistical tools from multivariate data analysis (MDA) have been developed to help analyze gene expression data. The MDA approach typically examines genes as discrete genomic units and ignores the dependency between the data components. In this paper, we propose a functional data analysis (FDA) approach to scRNA-seq data in which each cell is considered a single function. To avoid a large number of dropouts (zero or near-zero values) and to reduce the high dimensionality of the data, we first perform a principal component analysis (PCA) and assign the PCs to be the amplitude of the function. We then use the index of the PCs directly from PCA for the phase components. This approach allows us to apply FDA clustering methods to scRNA-seq data analysis. Results: To demonstrate the robustness of our method, we apply several existing FDA clustering algorithms to gene expression data to improve the accuracy of cell type classification over the conventional clustering methods in MDA. The FDA clustering algorithms achieve superior accuracy on simulated data as well as on real data such as human and mouse scRNA-seq data. Conclusions: This new statistical technique enhances classification performance and ultimately improves the understanding of stochastic biological processes. The framework provides a fundamentally different analytical approach to scRNA-seq data that can complement conventional MDA methods. It can be particularly effective when current MDA methods cannot detect or uncover the hidden functional nature of gene expression dynamics.
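The construction described in the Background, where each cell becomes a curve whose x-axis is the PC index and whose y-axis is the cell's PC score, can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the helper names `top_components` and `cell_as_function`, the power-iteration PCA and the tiny 4-cell by 3-gene matrix are all assumptions made for the example.

```python
import math
import random


def top_components(rows, k, n_iter=300, seed=0):
    """Leading k principal directions of `rows` via power iteration with deflation."""
    n, m = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - mean[j] for j in range(m)] for r in rows]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for _ in range(n_iter):
            # Project out directions already found, then apply X^T X.
            for c in comps:
                d = sum(a * b for a, b in zip(v, c))
                v = [a - d * b for a, b in zip(v, c)]
            proj = [sum(a * b for a, b in zip(r, v)) for r in x]
            w = [sum(proj[i] * x[i][j] for i in range(n)) for j in range(m)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0.0:
                break
            v = [a / norm for a in w]
        comps.append(v)
    return mean, comps


def cell_as_function(row, mean, comps):
    """PC scores of one cell, read as a curve: position = PC index, value = score."""
    centered = [a - b for a, b in zip(row, mean)]
    return [sum(a * b for a, b in zip(centered, c)) for c in comps]


# Toy expression matrix: 4 cells x 3 genes (illustrative only, not real data).
cells = [[5, 0, 1], [4, 1, 0], [0, 5, 1], [1, 4, 0]]
gene_mean, components = top_components(cells, k=2)
functions = [cell_as_function(c, gene_mean, components) for c in cells]
```

Each entry of `functions` is then a short discretized curve that could be handed to a functional clustering method. Because the sign of a principal component is arbitrary, only relative comparisons between cells are meaningful.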
