Tsallis Entropy for Assessing Spatial Uncertainty Associated with Mean Annual Runoff of Quaternary Catchments of the Middle Vaal Basin in South Africa

This study assesses mainly the uncertainty of the mean annual runoff (MAR) for quaternary catchments (QCs) considered as metastable nonextensive systems (from Tsalllis entropy) in the Middle Vaal catchment. The study is applied to the surface water resources (WR) of the South Africa 1990 (WR90), 2005 (WR2005) and 2012 (WR2012) data sets. The q-information index (from the Tsalllis entropy) is used here as a deviation indicator for the spatial evolution of uncertainty for the different QCs, using the Shannon entropy as a baseline. It enables the determination of a (virtual) convergence point, zone of positive and negative uncertainty deviation, zone of null deviation and chaotic zone for each data set. Such a determination is not possible on the basis of the Shannon entropy alone as a measure for the MAR uncertainty of QCs, i.e., when they are viewed as extensive systems. Finally, the spatial distributions for the zones of the q-uncertainty deviation (gain or loss in information) of the MAR are derived and lead to iso q-uncertainty deviation maps.

Download Full-text

Shannon Entropy for Measuring Spatial Complexity Associated with Mean Annual Runoff of Tertiary Catchments of the Middle Vaal Basin in South Africa

Entropy ◽

10.3390/e21040366 ◽

2019 ◽

Vol 21 (4) ◽

pp. 366 ◽

Cited By ~ 1

Author(s):

Masengo Ilunga

Keyword(s):

South Africa ◽

Information Gain ◽

Annual Runoff ◽

Data Sets ◽

Information Redundancy ◽

Spatial Complexity ◽

Surface Water Resources ◽

Gain Loss ◽

Relative Change ◽

Hydrological Phases

This study evaluates essentially mean annual runoff (MAR) information gain/loss for tertiary catchments (TCs) in the Middle Vaal basin. Data sets from surface water resources (WR) of South Africa 1990 (WR90), 2005 (WR2005) and 2012 (WR2012) referred in this study as hydrological phases, are used in this evaluation. The spatial complexity level or information redundancy associated with MAR of TCs is derived as well as the relative change in entropy of TCs between hydrological phases. Redundancy and relative change in entropy are shown to coincide under specific conditions. Finally, the spatial distributions of MAR iso-information transmission (i.e., gain or loss) and MAR iso-information redundancy are established for the Middle Vaal basin.

Download Full-text

Improved retrieval of nitrogen dioxide (NO<sub>2</sub>) column densities by means of MKIV Brewer spectrophotometers

Atmospheric Measurement Techniques ◽

10.5194/amt-7-4009-2014 ◽

2014 ◽

Vol 7 (11) ◽

pp. 4009-4022 ◽

Cited By ~ 8

Author(s):

H. Diémoz ◽

A. M. Siani ◽

A. Redondas ◽

V. Savastiouk ◽

C. T. McElroy ◽

...

Keyword(s):

Nitrogen Dioxide ◽

Ad Hoc ◽

Absorption Spectrometry ◽

Atmospheric Composition ◽

Data Sets ◽

Data Set ◽

Noise Interference ◽

Atmospheric Species ◽

Mass Factor

Abstract. A new algorithm to retrieve nitrogen dioxide (NO2) column densities using MKIV ("Mark IV") Brewer spectrophotometers is described. The method includes several improvements, such as a more recent spectroscopic data set, the reduction of measurement noise, interference by other atmospheric species and instrumental settings, and a better determination of the zenith sky air mass factor. The technique was tested during an ad hoc calibration campaign at the high-altitude site of Izaña (Tenerife, Spain) and the results of the direct sun and zenith sky geometries were compared to those obtained by two reference instruments from the Network for the Detection of Atmospheric Composition Change (NDACC): a Fourier Transform Infrared Radiometer (FTIR) and an advanced visible spectrograph (RASAS-II) based on the differential optical absorption spectrometry (DOAS) technique. To determine the extraterrestrial constant, an easily implementable extension of the standard Langley technique for very clean sites without tropospheric NO2 was developed which takes into account the daytime linear drift of stratospheric nitrogen dioxide due to photochemistry. The measurement uncertainty was thoroughly determined by using a Monte Carlo technique. Poisson noise and wavelength misalignments were found to be the most influential contributors to the overall uncertainty, and possible solutions are proposed for future improvements. The new algorithm is backward-compatible, thus allowing for the reprocessing of historical data sets.

Download Full-text

Spectral analysis and inversion of experimental codas

Geophysics ◽

10.1190/1.1443424 ◽

1993 ◽

Vol 58 (3) ◽

pp. 408-418 ◽

Cited By ~ 3

Author(s):

L. R. Jannaud ◽

P. M. Adler ◽

C. G. Jacquin

Keyword(s):

Spectral Analysis ◽

Characteristic Length ◽

Gaussian Model ◽

Seismic Survey ◽

Data Sets ◽

Data Set ◽

Anisotropic Elastic ◽

And Inversion ◽

Good Agreement

A method developed for the determination of the characteristic lengths of an heterogeneous medium from the spectral analysis of codas is based on an extension of Aki’s theory to anisotropic elastic media. An equivalent Gaussian model is obtained and seems to be in good agreement with the two experimental data sets that illustrate the method. The first set was obtained in a laboratory experiment with an isotropic marble sample. This sample is characterized by a submillimetric length scale that can be directly observed on a thin section. The spectral analysis of codas and their inversion yields an equivalent correlation length that is in good agreement with the observed one. The second data set is obtained in a crosshole experiment at the usual scale of a seismic survey. The codas are recorded, analysed, and inverted. The analysis yields a vertical characteristic length for the studied subsurface that compares well with the characteristic length measured by seismic and stratigraphic logs.

Download Full-text

Determination of site occupancies by the intermeasurement minimization method. I. Anomalous scattering usage for noncentrosymmetric crystals

Journal of Applied Crystallography ◽

10.1107/s002188980705621x ◽

2008 ◽

Vol 41 (1) ◽

pp. 83-95 ◽

Cited By ~ 10

Author(s):

Alexander Dudka

Keyword(s):

Site Occupancy ◽

Anomalous Scattering ◽

Data Sets ◽

Minimization Method ◽

Data Set ◽

Occupancy Factor ◽

Site Occupancy Factor ◽

Three Samples ◽

Scattering Contribution

New methods for the determination of site occupancy factors are described. The methods are based on the analysis of differences between intensities of Friedel reflections in noncentrosymmetric crystals. In the first method (Anomalous-Expert) the site occupancy factor is determined by the condition that it is identical for two data sets: (1) initial data without averaging of Friedel intensities and (2) data that are averaged on Friedel pairs after the reduction of the anomalous scattering contribution. In the second method (anomalous anisotropic intermeasurement minimization method, Anomalous-AniMMM) the site occupancy factor is refined to satisfy the condition that the differences between the intensities of Friedel reflections that are reduced on the anomalous scattering contribution must be minimal. The methods were checked for three samples of RbTi1−xZrxOPO4crystals (A,BandC) with KTiOPO4structure, at 295 and 105 K (five experimental data sets). Microprobe measurements yield compositionsxA,B= 0.034 (5) andxC= 0.022 (4). The corresponding site occupancy factors areQA,B= 0.932 (10) andQC= 0.956 (8). Using Anomalous-AniMMM and three independent refinements for the first and second samples, the initial occupancy factor ofQA,B= 0.963 (15) was improved toQA,B= 0.938 (7). Of the three room-temperature data sets, one was improved toQA,B= 0.934 (2). For the third sample and one data set, the initial occupancy factor ofQC= 1.000 was improved toQC= 0.956 (1). The methods improve the Hirshfeld rigid-bond test. It is discussed how the description of chemical bonding influences the site occupancy factor.

Download Full-text

Applicability of CHIRPS-based satellite rainfall estimates for South Africa

Journal of the South African Institution of Civil Engineering ◽

10.17159/2309-8775/2021/v63n3a4 ◽

2021 ◽

Vol 63 (3) ◽

pp. 1-12

Author(s):

J A du Plessis ◽

J K Kibii

Keyword(s):

South Africa ◽

Temporal Distribution ◽

Pairwise Comparison ◽

Monthly Rainfall ◽

Rainfall Data ◽

Coefficient Of Determination ◽

Data Sets ◽

Good Precision ◽

Alternative Source

Long-term rainfall data with good spatial and temporal distribution is essential for all climate-related analyses. The availability of observed rainfall data has become increasingly problematic over the years due to a limited and deteriorating rainfall station network, occasioned by limited reporting and/or quality control of rainfall and, in some cases, closure of these stations. Remotely sensed satellite-based rainfall data sets offer an alternative source of information. In this study, daily and monthly rainfall data derived from Climate Hazards Group InfraRed Precipitation (CHIRPS) is compared with observed rainfall data from 46 stations evenly distributed across South Africa. Various metrics, based on a pairwise comparison between the observed and CHIRPS data, were applied to evaluate CHIRPS performance in the estimation of daily and monthly rainfall. The results show that CHIRPS data correlate well with observed monthly rainfall data for all stations used, having an average coefficient of determination of 0.6 and bias of 0.95. This study concludes that monthly CHIRPS data corresponds well, with good precision and relatively little bias when compared to observed monthly rainfall data, and can therefore be considered for use in conjunction with observed rainfall data where no or limited data is available in South Africa for hydrological analysis.

Download Full-text

Novel Automated K-means++ Algorithm for Financial Data Sets

Mathematical Problems in Engineering ◽

10.1155/2021/5521119 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Guoyu Du ◽

Xuehua Li ◽

Lanjie Zhang ◽

Libo Liu ◽

Chaohua Zhao

Keyword(s):

Linear Time ◽

Sparse Matrix ◽

Data Sets ◽

Volume Data ◽

Text Data ◽

Data Set ◽

Inverse Document Frequency ◽

Document Frequency ◽

Initial Cluster

The K-means algorithm has been extensively investigated in the field of text clustering because of its linear time complexity and adaptation to sparse matrix data. However, it has two main problems, namely, the determination of the number of clusters and the location of the initial cluster centres. In this study, we propose an improved K-means++ algorithm based on the Davies-Bouldin index (DBI) and the largest sum of distance called the SDK-means++ algorithm. Firstly, we use the term frequency-inverse document frequency to represent the data set. Secondly, we measure the distance between objects by cosine similarity. Thirdly, the initial cluster centres are selected by comparing the distance to existing initial cluster centres and the maximum density. Fourthly, clustering results are obtained using the K-means++ method. Lastly, DBI is used to obtain optimal clustering results automatically. Experimental results on real bank transaction volume data sets show that the SDK-means++ algorithm is more effective and efficient than two other algorithms in organising large financial text data sets. The F-measure value of the proposed algorithm is 0.97. The running time of the SDK-means++ algorithm is reduced by 42.9% and 22.4% compared with that for K-means and K-means++ algorithms, respectively.

Download Full-text

Determination of Optimal Clusters Using a Genetic Algorithm

Data Mining and Knowledge Discovery Technologies ◽

10.4018/978-1-59904-960-1.ch005 ◽

2008 ◽

pp. 98-117 ◽

Cited By ~ 1

Author(s):

Tushar ◽

Shibendu Shekhar Roy ◽

Dilip Kumar Pratihar

Keyword(s):

Genetic Algorithm ◽

Threshold Value ◽

Data Sets ◽

Self Organizing Map ◽

Data Set ◽

Fcm Algorithm ◽

Data Points ◽

The Relationship

Clustering is a potential tool of data mining. A clustering method analyzes the pattern of a data set and groups the data into several clusters based on the similarity among themselves. Clusters may be either crisp or fuzzy in nature. The present chapter deals with clustering of some data sets using Fuzzy C-Means (FCM) algorithm and Entropy-based Fuzzy Clustering (EFC) algorithm. In FCM algorithm, the nature and quality of clusters depend on the pre-defined number of clusters, level of cluster fuzziness and a threshold value utilized for obtaining the number of outliers (if any). On the other hand, the quality of clusters obtained by the EFC algorithm is dependent on a constant used to establish the relationship between the distance and similarity of two data points, a threshold value of similarity and another threshold value used for determining the number of outliers. The clusters should ideally be distinct and at the same time compact in nature. Moreover, the number of outliers should be as minimum as possible. Thus, the above problem may be posed as an optimization problem, which will be solved using a Genetic Algorithm (GA). The best set of multi-dimensional clusters will be mapped into 2-D for visualization using a Self-Organizing Map (SOM).

Download Full-text

Relationships Among Surface Water Resources in the WR90, WR2005 and WR2012 Datasets of South Africa Using Mean Annual Runoff of Quaternary Catchments

Sustainable Development Goals Series - Climate Variability and Change in Africa ◽

10.1007/978-3-030-31543-6_9 ◽

2020 ◽

pp. 107-112

Author(s):

Masengo Ilunga

Keyword(s):

South Africa ◽

Surface Water ◽

Water Resources ◽

Annual Runoff ◽

Surface Water Resources

Download Full-text

Multicrystal approach to crystal structure solution and refinement

Zeitschrift für Kristallographie - Crystalline Materials ◽

10.1524/zkri.219.12.813.55870 ◽

2004 ◽

Vol 219 (12) ◽

Cited By ~ 12

Author(s):

Gavin B. M. Vaughan ◽

Soeren Schmidt ◽

Henning F. Poulsen

Keyword(s):

Single Crystal ◽

Polycrystalline Sample ◽

Test Case ◽

Data Sets ◽

Data Set ◽

Structure Solution ◽

The Individual ◽

Integrated Intensities ◽

Subsequent Integration

AbstractWe present a method in which the contributions from the individual crystallites in a polycrystalline sample are separated and treated as essentially single crystal data sets. The process involves the simultaneous determination of the orientation matrices of the individual crystallites in the sample, the subsequent integration of the individual peaks, and filtering and summing of the subsequent integrated intensities, in order to arrive at a single-crystal like data set which may be treated normally. In order to demonstrate the method, we consider as a test case a small molecule structure, cupric acetate monohyrade. We show that it is possible to obtain a single-crystal quality structure solution and refinement, in which accurate anisotropic thermal parameters and hydrogen atom positions are obtained.

Download Full-text

Radiocarbon AMS Data Analysis: From Measured Isotopic Ratios to 14C Concentrations

Radiocarbon ◽

10.1017/s0033822200045112 ◽

2010 ◽

Vol 52 (1) ◽

pp. 165-170 ◽

Cited By ~ 9

Author(s):

Ugo Zoppi

Keyword(s):

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Mathematical Framework ◽

Intrinsic Variability ◽

Data Set ◽

Novel Approach ◽

Case Data ◽

Normalization Factors

Radiocarbon accelerator mass spectrometry (AMS) measurements are always carried out relative to internationally accepted standards with known 14C activities. The determination of accurate 14C concentrations relies on the fact that standards and unknown samples must be measured under the same conditions. When this is not the case, data reduction is either performed by splitting the collected data set into subsets with consistent measurement conditions or by applying correction factors.This paper introduces a mathematical framework that exploits the intrinsic variability of an AMS system by combining arbitrary measurement parameters into a normalization function. This novel approach allows the en-masse reduction of large data sets by providing individual normalization factors for each data point. Both general features and practicalities necessary for its efficient application are discussed.

Download Full-text