Indian Cheilanthoid fern - A numerical taxonomic approach

Twenty-one species belonging to five genera (viz. Aleuritopteris Fée, Cheilanthes Sw., Doryopteris J. Sm., Notholaena R. Brown, Pellaea Link) of the Indian cheilanthoid ferns were studied to develop a new data set of micromorphological details, viz. epidermal cells, stomatal morphotypes, venation pattern and spore ultrastructure. Cluster analysis was performed using two-state coding of multiple characters; it separates the genus Aleuritopteris from Cheilanthes at a Euclidean distance of 5.1, though it remains completely linked with other closely related genera, viz. Doryopteris, Notholaena and Pellaea. The taxonomic conundrum that lies within these genera was resolved with a numerical taxonomic study. Bangladesh J. Plant Taxon. 23(2): 133-142, 2016 (December)
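
The clustering step described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the taxa names and 0/1 character codings are hypothetical, and complete linkage is an assumed choice of merging rule. For two-state characters, the Euclidean distance between two taxa is the square root of the number of characters on which they differ.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equally long character vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(points):
    """Agglomerative clustering with complete linkage.

    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(euclidean(points[p], points[q]) for p in c1 for q in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, d))
    return merges

# Hypothetical two-state (0/1) character matrix; not data from the study.
taxa = {
    "taxon_A": [1, 1, 0, 0, 1, 0],
    "taxon_B": [1, 1, 0, 1, 1, 0],
    "taxon_C": [0, 0, 1, 1, 0, 1],
    "taxon_D": [0, 0, 1, 1, 0, 0],
}
merges = complete_linkage(taxa)
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 3))
```

The distance reported for the last merge is the level at which the two main groups separate in the resulting dendrogram.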


Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis

Water Science & Technology ◽

10.2166/wst.2007.586 ◽

2007 ◽

Vol 56 (6) ◽

pp. 75-83 ◽

Cited By ~ 3

Author(s):

X. Flores ◽

J. Comas ◽

I.R. Roda ◽

L. Jiménez ◽

K.V. Gernaey

Keyword(s):

Cluster Analysis ◽

Control Strategies ◽

Treatment Plant ◽

Principal Component ◽

Statistical Techniques ◽

Data Sets ◽

Data Set ◽

Casual Relation ◽

Evaluation Matrix ◽

Natural Groups

The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible (i) to determine natural groups or clusters of control strategies with similar behaviour, (ii) to find and interpret hidden, complex and causal relation features in the data set and (iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.


Empirical Evaluation of Genetic Clustering Methods Using Multilocus Genotypes From 20 Chicken Breeds

Genetics ◽

10.1093/genetics/159.2.699 ◽

2001 ◽

Vol 159 (2) ◽

pp. 699-713

Author(s):

Noah A Rosenberg ◽

Terry Burke ◽

Kari Elo ◽

Marcus W Feldman ◽

Paul J Freidlin ◽

...

Keyword(s):

Cluster Analysis ◽

Population Structure ◽

Clustering Algorithm ◽

Empirical Evaluation ◽

Unknown Origin ◽

Clustering Methods ◽

Genetic Cluster ◽

Data Set ◽

Multilocus Genotypes ◽

Chicken Breeds

Abstract. We tested the utility of genetic cluster analysis in ascertaining population structure of a large data set for which population structure was previously known. Each of 600 individuals representing 20 distinct chicken breeds was genotyped for 27 microsatellite loci, and individual multilocus genotypes were used to infer genetic clusters. Individuals from each breed were inferred to belong mostly to the same cluster. The clustering success rate, measuring the fraction of individuals that were properly inferred to belong to their correct breeds, was consistently ~98%. When markers of highest expected heterozygosity were used, genotypes that included at least 8–10 highly variable markers from among the 27 markers genotyped also achieved >95% clustering success. When 12–15 highly variable markers and only 15–20 of the 30 individuals per breed were used, clustering success was at least 90%. We suggest that in species for which population structure is of interest, databases of multilocus genotypes at highly variable markers should be compiled. These genotypes could then be used as training samples for genetic cluster analysis and to facilitate assignments of individuals of unknown origin to populations. The clustering algorithm has potential applications in defining the within-species genetic units that are useful in problems of conservation.
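
A clustering success rate of this kind can be approximated as sketched below. This is an assumption-laden illustration rather than the paper's exact definition: each inferred cluster is matched to its most common breed (a purity-style score), and the labels are hypothetical.

```python
from collections import Counter

def clustering_success(true_labels, cluster_ids):
    """Fraction of individuals whose cluster's majority label matches their own.

    Assumption: each cluster is identified with its most common true label.
    """
    by_cluster = {}
    for label, cid in zip(true_labels, cluster_ids):
        by_cluster.setdefault(cid, Counter())[label] += 1
    # Count the individuals that agree with the majority label of their cluster.
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(true_labels)

# Hypothetical example: two breeds of five individuals, one individual misplaced.
breeds = ["A"] * 5 + ["B"] * 5
clusters = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
success = clustering_success(breeds, clusters)
print(success)  # 0.9
```

A majority-vote score can overstate success if two clusters map to the same breed, so a one-to-one matching between clusters and breeds would be the stricter variant.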


Cluster analysis of the results of intraoperative optical spectroscopic diagnostics in brain glioma neurosurgery

Biomedical Photonics ◽

10.24931/2413-9432-2018-7-4-23-34 ◽

2019 ◽

Vol 7 (4) ◽

pp. 23-34

Author(s):

I. A. Osmakov ◽

T. A. Savelieva ◽

V. B. Loschenov ◽

S. A. Goryajnov ◽

A. A. Potapov

Keyword(s):

Cluster Analysis ◽

Optical Spectroscopy ◽

Protoporphyrin Ix ◽

Spectroscopy Data ◽

Clustering Methods ◽

Agglomerative Clustering ◽

Data Set ◽

Broadband Radiation ◽

Degree Of Malignancy ◽

Normal White Matter

The paper presents the results of a comparative study of methods of cluster analysis of optical intraoperative spectroscopy data during surgery of glial tumors with varying degree of malignancy. The analysis was carried out both for individual patients and for the entire data set. The data were obtained using a combined optical spectroscopy technique, which allowed simultaneous registration of diffuse reflectance spectra of broadband radiation in the 500–600 nm spectral range (for the analysis of tissue blood supply and the degree of hemoglobin oxygenation), fluorescence spectra of 5-ALA-induced protoporphyrin IX (Pp IX) (for analysis of the malignancy degree) and the signal of diffusely reflected laser light used to excite Pp IX fluorescence (to take into account the scattering properties of tissues). To determine the threshold values of these parameters for the tumor, the infiltration zone and the normal white matter, we searched for the natural clusters in the available intraoperative optical spectroscopy data and compared them with the results of the pathomorphology. It was shown that, among the considered clustering methods, the EM algorithm and k-means methods are optimal for the considered data set and can be used to build a decision support system (DSS) for spectroscopic intraoperative navigation in neurosurgery. Clustering results relevant to the pathological studies were also obtained using the methods of spectral and agglomerative clustering. These methods can be used to postprocess combined spectroscopy data.
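
The idea of deriving a threshold from natural clusters can be sketched with a one-dimensional k-means. This is only an illustration under stated assumptions: the values are hypothetical stand-ins for a single spectroscopic parameter, the two starting centers are taken as the minimum and maximum, and the threshold is assumed to be the midpoint between the two final centers.

```python
def kmeans_1d(values, centers, iterations=100):
    """Lloyd's algorithm in one dimension; returns the final cluster centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Hypothetical values of one parameter (for example a fluorescence index).
values = [0.1, 0.2, 0.3, 2.0, 2.2, 2.4]
centers = kmeans_1d(values, [min(values), max(values)])
threshold = sum(centers) / 2  # midpoint between the two cluster centers
print(centers, threshold)
```

In practice the clusters would be compared against pathomorphology, as the abstract describes, before any threshold is trusted.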


Exploring the Unknown Nature of Data

Handbook of Research on Machine Learning Applications and Trends ◽

10.4018/978-1-60566-766-9.ch001 ◽

2010 ◽

pp. 1-27

Author(s):

Rui Xu ◽

Donald C. Wunsch II

Keyword(s):

Neural Networks ◽

Cluster Analysis ◽

Data Structures ◽

Clustering Algorithms ◽

Ground Truth ◽

Human Beings ◽

Data Set ◽

Learning Capabilities ◽

Good Learning ◽

Hidden Data

To classify objects based on their features and characteristics is one of the most important and primitive activities of human beings. The task becomes even more challenging when there is no ground truth available. Cluster analysis allows new opportunities in exploring the unknown nature of data through its aim to separate a finite data set, with little or no prior information, into a finite and discrete set of “natural,” hidden data structures. Here, the authors introduce and discuss clustering algorithms that are related to machine learning and computational intelligence, particularly those based on neural networks. Neural networks are well known for their good learning capabilities, adaptation, ease of implementation, parallelization, speed, and flexibility, and they have demonstrated many successful applications in cluster analysis. The applications of cluster analysis in real world problems are also illustrated. Portions of the chapter are taken from Xu and Wunsch (2008).


Cluster Analysis with Various Algorithms for Mixed Data

Pattern and Data Analysis in Healthcare Settings - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-5225-0536-5.ch014 ◽

2017 ◽

pp. 282-317

Author(s):

Abha Sharma ◽

R. S. Thakur

Keyword(s):

Cluster Analysis ◽

Clustering Algorithms ◽

Complex Problem ◽

Experimental Results ◽

Mixed Data ◽

Data Set ◽

Fuzzy C Means ◽

Conversion Method ◽

Inherent Structure ◽

Numeric Data

Clustering a mixed data set is a complex problem. Very useful clustering algorithms such as k-means, fuzzy c-means and hierarchical methods were developed to extract hidden groups from numeric data. In this paper, the mixed data are converted into purely numeric data with a conversion method, and various algorithms for numeric data are applied to several well-known mixed data sets to exploit the inherent structure of the mixed data. Experimental results show how well the converted mixed data perform with universally applicable clustering algorithms for numeric data.
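
The abstract does not say which conversion method is used. As one common possibility, and purely as an assumption, the sketch below min-max scales numeric attributes and one-hot encodes categorical ones, so that numeric clustering algorithms can be applied afterwards. The function name, record layout and field names are hypothetical.

```python
def to_numeric(records, numeric_keys, categorical_keys):
    """Convert mixed records to numeric rows (min-max scaling plus one-hot encoding)."""
    ranges = {k: (min(r[k] for r in records), max(r[k] for r in records))
              for k in numeric_keys}
    levels = {k: sorted({r[k] for r in records}) for k in categorical_keys}
    rows = []
    for r in records:
        row = []
        for k in numeric_keys:
            lo, hi = ranges[k]
            # Scale to [0, 1]; a constant column maps to 0.0.
            row.append((r[k] - lo) / (hi - lo) if hi > lo else 0.0)
        for k in categorical_keys:
            # One indicator column per category level, in sorted order.
            row.extend(1.0 if r[k] == level else 0.0 for level in levels[k])
        rows.append(row)
    return rows

# Hypothetical mixed records.
records = [
    {"age": 30, "sex": "F"},
    {"age": 50, "sex": "M"},
    {"age": 40, "sex": "F"},
]
rows = to_numeric(records, ["age"], ["sex"])
print(rows)
```

The resulting rows can be passed to any numeric algorithm such as k-means; how the numeric and indicator columns are weighted against each other is a design choice that affects the clusters.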


Cluster Analysis: An Application to a Real Mixed-Type Data Set

Models and Theories in Social Systems - Studies in Systems, Decision and Control ◽

10.1007/978-3-030-00084-4_27 ◽

2018 ◽

pp. 525-533 ◽

Cited By ~ 2

Author(s):

G. Caruso ◽

S. A. Gattone ◽

A. Balzanella ◽

T. Di Battista

Keyword(s):

Cluster Analysis ◽

Mixed Type ◽

Data Set ◽

Type Data


Cluster analysis of European surface ozone observations for evaluation of MACC reanalysis data

Atmospheric Chemistry and Physics ◽

10.5194/acp-16-6863-2016 ◽

2016 ◽

Vol 16 (11) ◽

pp. 6863-6881 ◽

Cited By ~ 12

Author(s):

Olga Lyapina ◽

Martin G. Schultz ◽

Andreas Hense

Keyword(s):

Cluster Analysis ◽

Statistical Tests ◽

Surface Ozone ◽

Atmospheric Composition ◽

Regional Differentiation ◽

Mixing Ratio ◽

Data Set ◽

Diurnal Cycles ◽

Mixing Ratios ◽

Weekly Cycles

Abstract. The high density of European surface ozone monitoring sites provides unique opportunities for the investigation of regional ozone representativeness and for the evaluation of chemistry climate models. The regional representativeness of European ozone measurements is examined through a cluster analysis (CA) of 4 years of 3-hourly ozone data from 1492 European surface monitoring stations in the Airbase database; the time resolution corresponds to the output frequency of the model that is compared to the data in this study. K-means clustering is implemented for seasonal–diurnal variations (i) in absolute mixing ratio units and (ii) normalized by the overall mean ozone mixing ratio at each site. Statistical tests suggest that each CA can distinguish between four and five different ozone pollution regimes. The individual clusters reveal differences in seasonal–diurnal cycles, showing typical patterns of the ozone behavior for more polluted stations or more rural background. The robustness of the clustering was tested with a series of k-means runs decreasing randomly the size of the initial data set or lengths of the time series. Except for the Po Valley, the clustering does not provide a regional differentiation, as the member stations within each cluster are generally distributed all over Europe. The typical seasonal, diurnal, and weekly cycles of each cluster are compared to the output of the multi-year global reanalysis produced within the Monitoring of Atmospheric Composition and Climate (MACC) project. While the MACC reanalysis generally captures the shape of the diurnal cycles and the diurnal amplitudes, it is not able to reproduce the seasonal cycles very well and it exhibits a high bias up to 12 nmol mol−1. The bias decreases from more polluted clusters to cleaner ones. Also, the seasonal and weekly cycles and frequency distributions of ozone mixing ratios are better described for clusters with relatively clean signatures. 
Due to relative sparsity of CO and NOx measurements these were not included in the CA. However, simulated CO and NOx mixing ratios are consistent with the general classification into more polluted and more background sites. Mean CO mixing ratios are within 140–145 nmol mol−1 (CL1–CL3) and 130–135 nmol mol−1 (CL4 and CL5), and NOx mixing ratios are within 4–6 nmol mol−1 and 2–3 nmol mol−1, respectively. These results confirm that relatively coarse-scale global models are more suitable for simulation of regional background concentrations, which are less variable in space and time. We conclude that CA of surface ozone observations provides a powerful and robust way to stratify sets of stations, being thus more suitable for model evaluation.
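
The second clustering variant, in which each site's variations are normalized by its overall mean mixing ratio, can be illustrated as follows. This is a minimal sketch with hypothetical profiles, not the study's processing chain. It shows that two sites with the same cycle shape but different absolute levels become identical after normalization.

```python
def normalize_by_mean(profile):
    """Divide every value of a site's profile by the site's overall mean."""
    mean = sum(profile) / len(profile)
    return [value / mean for value in profile]

# Hypothetical diurnal profiles (nmol/mol) for two sites with the same shape.
site_low = [20.0, 30.0, 40.0, 30.0]
site_high = [40.0, 60.0, 80.0, 60.0]

norm_low = normalize_by_mean(site_low)
norm_high = normalize_by_mean(site_high)
print(norm_low)
print(norm_high)
```

Clustering absolute values would separate these two sites by ozone level, whereas clustering the normalized profiles would group them by the shape of their cycle.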


Origin of aerosol particles in the mid-latitude and subtropical upper troposphere and lowermost stratosphere from cluster analysis of CARIBIC data

Atmospheric Chemistry and Physics ◽

10.5194/acp-9-8413-2009 ◽

2009 ◽

Vol 9 (21) ◽

pp. 8413-8430 ◽

Cited By ~ 13

Author(s):

M. Köppe ◽

M. Hermann ◽

C. A. M. Brenninkmeijer ◽

J. Heintzenberg ◽

H. Schlager ◽

...

Keyword(s):

Boundary Layer ◽

Cluster Analysis ◽

Particle Number ◽

Aerosol Particles ◽

Free Troposphere ◽

Data Set ◽

Upper Troposphere ◽

Eurasian Continent ◽

Aitken Mode ◽

Seasonal Data

Abstract. The origin of aerosol particles in the upper troposphere and lowermost stratosphere over the Eurasian continent was investigated by applying cluster analysis methods to in situ measured data. Number concentrations of submicrometer aerosol particles and trace gas mixing ratios derived by the CARIBIC (Civil Aircraft for Regular Investigation of the Atmosphere Based on an Instrument Container) measurement system on flights between Germany and South-East Asia were used for this analysis. Four cluster analysis methods were applied to a test data set and their capability of separating the data points into scientifically reasonable clusters was assessed. The best method was applied to seasonal data subsets for summer and winter, resulting in five clusters or air mass types: stratosphere, tropopause, free troposphere, high clouds, and boundary layer influenced. Other source clusters, such as aircraft emissions, could not be resolved in the present data set with the methods used. While the cluster separation works satisfactorily for the summer data, interpretation is more difficult in winter, which is attributed to either different vertical transport pathways or different chemical lifetimes in the two seasons. The geographical distribution of the clusters together with histograms for nucleation and Aitken mode particles within each cluster are presented. Aitken mode particle number concentrations show a clear vertical gradient with the lowest values in the lowermost stratosphere (750–2820 particles/cm³ STP; minimum of the two 25th percentiles and maximum of the two 75th percentiles of both seasons) and the highest values for the boundary-layer-influenced air (4290–22 760 particles/cm³ STP). Nucleation mode particles are also highest in the boundary-layer-influenced air (1260–29 500 particles/cm³ STP), but are lowest in the free troposphere (0–450 particles/cm³ STP).
The given submicrometer particle number concentrations represent the first large-scale seasonal data sets for the upper troposphere and lowermost stratosphere over the Eurasian continent.


Nonuniqueness in traveltime tomography: Ensemble inference and cluster analysis

Geophysics ◽

10.1190/1.1444040 ◽

1996 ◽

Vol 61 (4) ◽

pp. 1209-1227 ◽

Cited By ~ 30

Author(s):

Don W. Vasco ◽

John E. Peterson ◽

Ernest L. Majer

Keyword(s):

Cluster Analysis ◽

Conjugate Gradient ◽

Seismic Inversion ◽

Low Velocity ◽

Data Set ◽

Traveltime Tomography ◽

All Solutions ◽

And Cluster Analysis ◽

Starting Model ◽

Norm Penalty

We examine the nonlinear aspects of seismic traveltime tomography. This is accomplished by completing an extensive set of conjugate gradient inversions on a parallel virtual machine, with each initiated by a different starting model. The goal is an exploratory analysis of a set of conjugate gradient solutions to the traveltime tomography problem. We find that distinct local minima are generated when prior constraints are imposed on traveltime tomographic inverse problems. Methods from cluster analysis determine the number and location of the isolated solutions to the traveltime tomography problem. We apply the cluster analysis techniques to a cross‐borehole traveltime data set gathered at the Gypsy Pilot Site in Pawnee County, Oklahoma. We find that the 1075 final models, satisfying the traveltime data and a model norm penalty, form up to 61 separate solutions. All solutions appear to contain a central low velocity zone bounded above and below by higher velocity layers. Such a structure agrees with well‐logs, hydrological well tests, and a previous seismic inversion.


Shallow structural setting of an active normal fault zone in the 30 October 2016 Mw 6.5 central Italy earthquake imaged through a multidisciplinary geophysical approach.

10.5194/egusphere-egu2020-7242 ◽

2020 ◽

Author(s):

Fabio Villani ◽

Stefano Maraio ◽

Pier Paolo Bruno ◽

Lisa Serri ◽

Vincenzo Sapia ◽

...

Keyword(s):

Cluster Analysis ◽

Fault Zone ◽

Central Italy ◽

Normal Fault ◽

Alluvial Fan ◽

Fault System ◽

Depth Range ◽

Inversion Algorithm ◽

Data Set ◽

Refraction Tomography

We investigate the shallow structure of an active normal fault zone that ruptured the surface during the 30 October 2016 Mw 6.5 Norcia earthquake (central Italy) using a multidisciplinary geophysical approach. The survey site is located in the Castelluccio basin, an intramontane Quaternary depression in the hangingwall of the SW-dipping Vettore-Bove fault system. The Norcia earthquake caused widespread surface faulting that also affected the Castelluccio basin, where the rupture trace follows the 2 km-long Valle delle Fonti fault (VF), displaying a ~3 m-high fault scarp due to cumulative surface slip of Holocene paleo-earthquakes. We explored the subsurface of the VF fault along a 2-D transect orthogonal to the coseismic rupture on recent alluvial fan deposits, combining very high-resolution seismic refraction tomography, multichannel analysis of surface waves (MASW), reflection seismology and electrical resistivity tomography (ERT). We acquired the ERT profile using an array of 64 steel electrodes spaced 2 m apart. Apparent resistivity data were then modeled via a linearized inversion algorithm with smoothness constraints to recover the subsurface resistivity distribution. The seismic data were recorded by a 190 m-long single array centered on the surface rupture, using 96 vertical geophones spaced 2 m apart and a 5 kg hammer source. Input data for refraction tomography are ~9000 handpicked first-arrival travel times, inverted through a fully non-linear multi-scale algorithm based on a finite-difference Eikonal solver. The data for MASW were extracted from common receiver configurations with 24 geophones; the dispersion curves were inverted to generate several S-wave 1-D profiles, subsequently interpolated to generate a pseudo-2D Vs section.
For reflection data, after a pre-processing flow, the picking of the maximum of semblance on CMP super-gathers was used to define a velocity model (VNMO) for CMP ensemble stack; the final stack velocity macro-model (VNMO) from the CMP processing was smoothed and used for post-stack depth conversion. We further processed Vp, Vs and resistivity models through the K-means algorithm, which performs a cluster analysis for the bivariate data set to identify relationships between the two sets of variables. The result is an integrated model with a finite number of homogeneous clusters. In the depth-converted reflection section, the subsurface of the VF fault displays abrupt reflection truncations in the 5-60 m depth range, suggesting a cumulative fault throw of ~30 m. Furthermore, another normal fault appears in the footwall. The reflection image points out alternating high-amplitude reflections that we interpret as a stack of alluvial sandy-gravel layers that thickens in the hangingwall of the VF fault. Resistivity, Vp and Vs models provide hints on the physical properties of the active fault zone, which appears as a moderately conductive (< 150 Ωm) elongated body with relatively high Vp (~1500 m/s) and low Vs (< 500 m/s). The Vp/Vs ratio > 3 and the Poisson's coefficient > 0.4 in the fault zone suggest this is a granular nearly-saturated medium, probably related to the increase of permeability due to fracturing and shearing. The results from the K-means cluster analysis also identify a homogeneous cluster corresponding to the saturated fault zone.
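
The link between the Vp/Vs ratio and Poisson's coefficient quoted above can be checked with the standard relation for an isotropic, linearly elastic medium, nu = (Vp^2 - 2 Vs^2) / (2 (Vp^2 - Vs^2)). The sketch below is illustrative only: the function name is hypothetical and the input velocities are the approximate bounds given in the abstract, not measured values at a specific point.

```python
def poisson_ratio(vp, vs):
    """Poisson's ratio from P- and S-wave velocities (isotropic elastic medium)."""
    return (vp ** 2 - 2 * vs ** 2) / (2 * (vp ** 2 - vs ** 2))

# Approximate values from the abstract: Vp ~1500 m/s, Vs at its 500 m/s upper bound.
vp, vs = 1500.0, 500.0
nu = poisson_ratio(vp, vs)
print(vp / vs, nu)  # 3.0 0.4375
```

With Vp/Vs = 3 the ratio is 0.4375, and it rises towards 0.5 as Vs decreases further, which is consistent with the reported Poisson's coefficient > 0.4 for Vp/Vs > 3.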
