ON THE APPLICATION OF METHODS USED TO CALCULATE THE FRACTAL DIMENSION OF FRACTURE SURFACES

Fractals, 2001, Vol. 9 (1), pp. 105-128
Author(s): Tayfun Babadagli, Kayhan Develi

This paper presents an evaluation of the methods applied to calculate the fractal dimension of fracture surfaces. Variogram (applicable to 1D self-affine sets) and power spectral density analyses (applicable to 2D self-affine sets) are selected to calculate the fractal dimension of synthetic 2D data sets generated using fractional Brownian motion (fBm). The calculated values are then compared with the actual fractal dimensions assigned in the generation of the synthetic surfaces. The main factor considered is the size of the 2D data set (number of data points). The critical sample size that yields the best agreement between the calculated and actual values is defined for each method. Limitations and the proper use of each method are clarified after an extensive analysis. The two methods are also applied to synthetically and naturally developed fracture surfaces of different types of rocks. The methods yield inconsistent fractal dimensions for natural fracture surfaces, and the reasons for this are discussed. The anisotropy of the fractal dimension, which may point to a correlation with the fracturing mechanism, and the multifractality of the fracture surfaces are also addressed.
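As an illustration of the variogram approach summarized above, the sketch below estimates the fractal dimension of a 1D self-affine profile from the log-log slope of its variogram (gamma(h) ~ h^(2H), D = 2 - H). It is a minimal example on synthetic data, not the authors' code; function and parameter names are illustrative.

```python
import numpy as np

def variogram_fractal_dimension(profile, max_lag=None):
    """Estimate the fractal dimension of a 1D self-affine profile from the
    slope of its variogram in log-log space: gamma(h) ~ h^(2H), D = 2 - H."""
    n = len(profile)
    max_lag = max_lag or n // 4
    lags = np.arange(1, max_lag)
    gamma = np.array([
        0.5 * np.mean((profile[h:] - profile[:-h]) ** 2) for h in lags
    ])
    # Fit log gamma(h) = 2H log h + c over the scaling range.
    slope, _ = np.polyfit(np.log(lags), np.log(gamma), 1)
    H = slope / 2.0
    return 2.0 - H

# Example on a synthetic random-walk profile (H ~ 0.5, so D ~ 1.5).
z = np.cumsum(np.random.randn(4096))
print(variogram_fractal_dimension(z))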

2018, Vol. 11 (2), pp. 53-67
Author(s): Ajay Kumar, Shishir Kumar

Several initial-center selection algorithms have been proposed in the literature for numerical data, but because the values of categorical data are unordered, these methods are not applicable to categorical data sets. This article investigates the initial-center selection process for categorical data and then presents a new support-based initial-center selection algorithm. The proposed algorithm measures the weight of the unique data points of an attribute with the help of support and then integrates these weights along the rows to obtain the support of every row. A data object having the largest support is chosen as the initial center, and further centers are then found at the greatest distance from the initially selected center. The quality of the proposed algorithm is compared with the random initial-center selection method, Cao's method, Wu's method, and the method introduced by Khan and Ahmad. Experimental analysis on real data sets shows the effectiveness of the proposed algorithm.
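A minimal sketch of a support-based initial-center selection of the kind described above, assuming categorical data stored as a NumPy array; this is a hypothetical reimplementation following the abstract, not the authors' algorithm verbatim.

```python
import numpy as np

def support_based_centers(X, k):
    """Sketch: pick the row with the largest summed attribute-value support
    as the first center, then repeatedly pick the row farthest (simple
    matching dissimilarity) from the already chosen centers."""
    n, m = X.shape
    # Support of each attribute value = its relative frequency in that column.
    support = np.zeros((n, m))
    for j in range(m):
        values, counts = np.unique(X[:, j], return_counts=True)
        freq = dict(zip(values, counts / n))
        support[:, j] = [freq[v] for v in X[:, j]]
    row_support = support.sum(axis=1)

    centers = [int(np.argmax(row_support))]          # most "supported" row first
    while len(centers) < k:
        dist = np.array([
            min((X[i] != X[c]).sum() for c in centers) for i in range(n)
        ])
        centers.append(int(np.argmax(dist)))          # farthest row next
    return X[centers]
```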


Open Physics, 2011, Vol. 9 (6)
Author(s): Tomáš Ficker, Dalibor Martišek

Abstract: The 3D profile surface parameter Hq and the fractal dimension D were tested as indicators of mechanical properties inferred from fracture surfaces of porous solids. Highly porous hydrated cement pastes were used as prototypes of porous materials. Both the profile parameter Hq and the fractal dimension D showed the capability to assess compressive strength from the fracture surfaces of hydrated pastes. From a practical point of view, the 3D profile parameter Hq seems more convenient as an indicator of mechanical properties, since its values suffer much less from statistical scatter than those of the fractal dimension.
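For orientation, a tiny sketch of a root-mean-square height parameter computed from a fracture-surface height map; we take this as one plausible reading of the profile parameter Hq (an assumption, since the abstract does not give the exact definition).

```python
import numpy as np

def rms_roughness(height_map):
    """RMS deviation of a height map about its mean plane -- one plausible
    reading of the profile parameter Hq (not the authors' exact definition)."""
    z = np.asarray(height_map, dtype=float)
    return np.sqrt(np.mean((z - z.mean()) ** 2))
```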


2021, Vol. 87 (6), pp. 445-455
Author(s): Yi Ma, Zezhong Zheng, Yutang Ma, Mingcang Zhu, Ran Huang, ...

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points. Thus, the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach to handle large hyperspectral data sets for land use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained with the training data set. A local curvature variation algorithm is utilized to sample a subset of data points as landmarks. Then a manifold skeleton is identified based on the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
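A hedged sketch of the landmark idea described above: embedding only a subset of points keeps the similarity matrix at O(n_landmarks²) rather than O(N²). Landmarks are sampled at random here instead of by the paper's local-curvature-variation criterion, and scikit-learn's Isomap stands in for the manifold-skeleton step.

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import KNeighborsRegressor

def landmark_embedding(X, n_landmarks=500, n_components=10, n_neighbors=12):
    """Embed only a landmark subset, then extend the embedding to all points
    by local interpolation (illustrative, not the authors' method)."""
    rng = np.random.default_rng(0)
    idx = rng.choice(len(X), size=min(n_landmarks, len(X)), replace=False)
    landmarks = X[idx]

    # Embed the landmarks only: memory cost is O(n_landmarks^2), not O(N^2).
    iso = Isomap(n_neighbors=n_neighbors, n_components=n_components)
    Y_landmarks = iso.fit_transform(landmarks)

    # Extend the embedding to the full data set via k-nearest-neighbor regression.
    extender = KNeighborsRegressor(n_neighbors=n_neighbors).fit(landmarks, Y_landmarks)
    return extender.predict(X)
```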


Big Data, 2016, pp. 261-287
Author(s): Keqin Wu, Song Zhang

While uncertainty in scientific data attracts increasing research interest in the visualization community, two critical issues remain insufficiently studied: (1) visualizing the impact of the uncertainty of a data set on its features and (2) interactively exploring 3D or large 2D data sets with uncertainties. In this chapter, a suite of feature-based techniques is developed to address these issues. First, an interactive visualization tool for exploring scalar data with data-level, contour-level, and topology-level uncertainties is developed. Second, a framework for visualizing feature-level uncertainty is proposed to study uncertain feature deviations in both scalar and vector data sets. With quantified representation and interactive capability, the proposed feature-based visualizations provide new insights into the uncertainties of both data and their features, which would otherwise remain unknown if only data uncertainties were visualized.


Author(s): Ureerat Wattanachon, Chidchanok Lursinsap

Existing clustering algorithms, such as single-link clustering, k-means, CURE, and CSM, are designed to find clusters based on predefined parameters specified by users. These algorithms may be unsuccessful if the choice of parameters is inappropriate for the data set being clustered. Most of these algorithms work very well for compact and hyper-spherical clusters. In this paper, a new hybrid clustering algorithm called Self-Partition and Self-Merging (SPSM) is proposed. The SPSM algorithm partitions the input data set into several subclusters in the first phase and then removes the noisy data in the second phase. In the third phase, the normal subclusters are continuously merged to form larger clusters based on inter-cluster and intra-cluster distance criteria. The experimental results show that the SPSM algorithm handles noisy data sets efficiently and clusters data sets of arbitrary shapes and differing densities. Several color-image examples show the versatility of the proposed method, and its results are compared with those reported in the literature for the same images. The computational complexity of the SPSM algorithm is O(N²), where N is the number of data points.
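As an illustration of the inter-/intra-cluster distance criterion mentioned for SPSM's merging phase, the sketch below shows one plausible merge test; the exact SPSM rule and its parameters are not reproduced here, and alpha is a made-up knob.

```python
import numpy as np

def should_merge(cluster_a, cluster_b, alpha=1.0):
    """Illustrative merge test: merge two subclusters when the gap between
    their centroids is no larger than (roughly) their combined internal
    spread. Not the authors' exact SPSM criterion."""
    def intra(points):
        c = points.mean(axis=0)
        return np.mean(np.linalg.norm(points - c, axis=1))

    inter = np.linalg.norm(cluster_a.mean(axis=0) - cluster_b.mean(axis=0))
    return inter <= alpha * (intra(cluster_a) + intra(cluster_b))
```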


Author(s):  
Md. Zakir Hossain ◽  
Md.Nasim Akhtar ◽  
R.B. Ahmad ◽  
Mostafijur Rahman

Data mining is the process of finding structure in large data sets. With this process, decision makers can make particular decisions for the further development of real-world problems. Several data clustering techniques are used in data mining for finding a specific pattern in data. The K-means method is one of the familiar clustering techniques for clustering large data sets. The K-means clustering method partitions the data set based on the assumption that the number of clusters is fixed. The main problem of this method is that if the number of clusters is chosen to be small, then there is a higher probability of adding dissimilar items into the same group. On the other hand, if the number of clusters is chosen to be high, then there is a higher chance of adding similar items to different groups. In this paper, we address this issue by proposing a new K-means clustering algorithm. The proposed method performs data clustering dynamically. It initially calculates a threshold value as a centroid of K-means, and based on this value the number of clusters is formed. At each iteration of K-means, if the Euclidean distance between two points is less than or equal to the threshold value, these two data points are placed in the same group; otherwise, the proposed method creates a new cluster with the dissimilar data point. The results show that the proposed method outperforms the original K-means method.
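A minimal sketch of the threshold-driven clustering idea described above; the default threshold (mean distance to the global centroid) is an assumption, not the authors' exact formula.

```python
import numpy as np

def threshold_kmeans(X, threshold=None, n_iter=10):
    """A point joins its nearest centroid only if the Euclidean distance is
    within the threshold; otherwise it seeds a new cluster (illustrative)."""
    X = np.asarray(X, dtype=float)
    if threshold is None:
        threshold = np.mean(np.linalg.norm(X - X.mean(axis=0), axis=1))

    centroids = [X[0].copy()]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment: join the nearest centroid if close enough, else open
        # a new cluster seeded at this point.
        for i, x in enumerate(X):
            d = np.linalg.norm(np.array(centroids) - x, axis=1)
            j = int(np.argmin(d))
            if d[j] <= threshold:
                labels[i] = j
            else:
                centroids.append(x.copy())
                labels[i] = len(centroids) - 1
        # Update: recompute each centroid; keep the old one if its cluster
        # ended up empty so the indices stay stable.
        centroids = [X[labels == j].mean(axis=0) if np.any(labels == j)
                     else centroids[j] for j in range(len(centroids))]
    return np.array(centroids), labels
```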


2020, Vol. 54, pp. 149-156
Author(s): Ajay K. Sahu, Ankur Roy

Abstract: It is well known that fracture networks display self-similarity in many cases, and the connectivity and flow behavior of such networks are influenced by their respective fractal dimensions. In the past, the concept of lacunarity, a parameter that quantifies spatial clustering, has been implemented by one of the authors in order to demonstrate that seven nested natural fracture maps belonging to a single fractal system, but of different visual appearances, have different clustering attributes. Any scale-dependency in the clustering of fractures will also likely have significant implications for flow processes that depend on fracture connectivity. It is therefore important to ask whether the fractal dimension alone serves as a reasonable proxy for the connectivity of a fractal-fracture network, and hence its flow response, or whether the lacunarity, a measure of scale-dependent clustering, should be used instead. The present study attempts to address this issue by exploring possible relationships between the fractal dimension, lacunarity, and connectivity of fractal-fracture networks. It also endeavors to study the relationship between lacunarity and fluid flow in such fractal-fracture networks. A set of deterministic fractal-fracture models generated at different iterations, all with the same theoretical fractal dimension, is used for this purpose. The results indicate that such deterministic synthetic fractal-fracture networks with the same theoretical fractal dimension differ in their connectivity and that the connectivity is fairly well correlated with lacunarity. Additionally, the flow simulation results imply that lacunarity influences flow patterns in fracture networks. Therefore, it may be concluded that, at least in synthetic fractal-fracture networks, it is the lacunarity, or scale-dependent clustering attribute, rather than the fractal dimension that controls the connectivity and hence the flow behavior.
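Since lacunarity is central to the argument above, here is a short sketch of the standard gliding-box estimate of lacunarity for a binary fracture map; the inputs, box size, and names are illustrative rather than taken from the study.

```python
import numpy as np

def gliding_box_lacunarity(binary_map, box_size):
    """Gliding-box lacunarity of a binary map: ratio of the second moment to
    the squared first moment of the box 'mass' (a standard definition)."""
    m = np.asarray(binary_map, dtype=float)
    n = box_size
    rows, cols = m.shape
    masses = np.array([
        m[i:i + n, j:j + n].sum()
        for i in range(rows - n + 1)
        for j in range(cols - n + 1)
    ])
    # Lambda(r) = <M^2> / <M>^2 = var/mean^2 + 1
    return masses.var() / masses.mean() ** 2 + 1.0
```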


2020, Vol. 498 (3), pp. 3440-3451
Author(s): Alan F. Heavens, Elena Sellentin, Andrew H. Jaffe

Abstract: Bringing a high-dimensional data set into science-ready shape is a formidable challenge that often necessitates data compression. Compression has accordingly become a key consideration for contemporary cosmology, affecting public data releases and reanalyses searching for new physics. However, data compression optimized for a particular model can suppress signs of new physics, or even remove them altogether. We therefore provide a solution for exploring new physics during data compression. In particular, we store additional agnostic compressed data points, selected to enable precise constraints on non-standard physics at a later date. Our procedure is based on the maximal compression of the MOPED algorithm, which optimally filters the data with respect to a baseline model. We select additional filters, based on a generalized principal component analysis, which are carefully constructed to scout for new physics at high precision and speed. We refer to the augmented set of filters as MOPED-PC. They enable an analytic computation of the Bayesian evidence that may indicate the presence of new physics, and fast analytic estimates of best-fitting parameters when adopting a specific non-standard theory, without further expensive MCMC analysis. As there may be large numbers of non-standard theories, the speed of the method becomes essential. Should no new physics be found, our approach preserves the precision of the standard parameters. As a result, we achieve very rapid and maximally precise constraints on standard and non-standard physics, with a technique that scales well to high-dimensional data sets.
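For context, a simplified sketch of MOPED-style linear compression: one filter per model parameter, built from the derivative of the mean data vector and orthogonalized against the noise covariance. The additional MOPED-PC filters from the generalized principal component analysis are not shown, and the names below are illustrative.

```python
import numpy as np

def moped_filters(dmu_dtheta, cov):
    """Simplified MOPED-style filters: one per parameter, Gram-Schmidt
    orthogonalized so that f_m^T C f_n = delta_mn.

    dmu_dtheta: (n_params, n_data) array of d<mu>/d theta_i
    cov:        (n_data, n_data) noise covariance
    """
    cinv = np.linalg.inv(cov)
    filters = []
    for dmu in dmu_dtheta:
        f = cinv @ dmu
        for g in filters:                    # orthogonalize against earlier filters
            f = f - (f @ cov @ g) * g
        f = f / np.sqrt(f @ cov @ f)         # normalize so that Var(b_i) = 1
        filters.append(f)
    return np.array(filters)

# Compressed statistics: one number per parameter, b = F @ data_vector.
```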


Author(s): Tushar, Shibendu Shekhar Roy, Dilip Kumar Pratihar

Clustering is a powerful tool of data mining. A clustering method analyzes the pattern of a data set and groups the data into several clusters based on the similarity among the data points. Clusters may be either crisp or fuzzy in nature. The present chapter deals with clustering of some data sets using the Fuzzy C-Means (FCM) algorithm and the Entropy-based Fuzzy Clustering (EFC) algorithm. In the FCM algorithm, the nature and quality of the clusters depend on the predefined number of clusters, the level of cluster fuzziness, and a threshold value used for obtaining the number of outliers (if any). On the other hand, the quality of the clusters obtained by the EFC algorithm depends on a constant used to establish the relationship between the distance and similarity of two data points, a threshold value of similarity, and another threshold value used for determining the number of outliers. The clusters should ideally be distinct and, at the same time, compact in nature. Moreover, the number of outliers should be as small as possible. Thus, the above problem may be posed as an optimization problem, which is solved using a Genetic Algorithm (GA). The best set of multi-dimensional clusters is then mapped into 2D for visualization using a Self-Organizing Map (SOM).
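As a pointer to the EFC side of the comparison above, the sketch below computes an entropy-based measure from exponentially decaying similarities (S = exp(-alpha * d)); the constant alpha and the exact formulation used in the chapter are not reproduced, so treat this as a generic illustration.

```python
import numpy as np

def entropy_of_point(distances, alpha):
    """Entropy of one data point with respect to all others: similarity
    decays exponentially with distance, and low total entropy marks a point
    in a dense region (a common entropy-based fuzzy clustering heuristic)."""
    s = np.exp(-alpha * np.asarray(distances, dtype=float))
    s = np.clip(s, 1e-12, 1 - 1e-12)              # avoid log(0)
    return float(np.sum(-s * np.log2(s) - (1 - s) * np.log2(1 - s)))
```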


Author(s): Guilherme N. Ramos, Fangyan Dong, Kaoru Hirota

A method, called HACO2 (Hyperbox classifier with Ant Colony Optimization, type 2), is proposed for evolving a hyperbox classifier using the ant colony meta-heuristic. It reshapes the hyperboxes in a near-optimal way to better fit the data, improving the accuracy and possibly indicating the most discriminative features. HACO2 is validated using artificial 2D data, showing over 90% accuracy. It is also applied to the benchmark iris data set (4 features), providing results with over 93% accuracy, and to the MIS data set (11 features), with almost 85% accuracy. For these sets, the two most discriminative features obtained from the method are used in simplified classifiers, which result in accuracies of 100% for the iris and 83% for the MIS data sets. Further modifications (automatic parameter setting), extensions (addressing initialization shortcomings), and applications are discussed.
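A small illustration of hyperbox classification of the kind HACO2 evolves: each box is an axis-aligned min/max region with a label, and a sample takes the label of the box that contains it (or of the nearest box). The ACO-driven reshaping itself is not shown; all names are illustrative.

```python
import numpy as np

def classify_with_hyperboxes(x, boxes):
    """boxes: iterable of (min_corner, max_corner, label). Returns the label
    of the first containing box, else the label of the nearest box."""
    x = np.asarray(x, dtype=float)
    best, best_gap = None, np.inf
    for lo, hi, label in boxes:
        lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
        if np.all(x >= lo) and np.all(x <= hi):
            return label                            # inside this box
        # Distance from the point to the box (zero along contained axes).
        gap = np.linalg.norm(np.maximum(lo - x, 0) + np.maximum(x - hi, 0))
        if gap < best_gap:
            best, best_gap = label, gap
    return best
```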

