Exploratory data analyses: describing our data

2022 ◽  
pp. 47-72
Author(s):  
Stephen C. Loftus
2012 ◽  
Vol 5 (1) ◽  
pp. 19-28
Author(s):  
JA Syeda

An attempt was made to investigate the seasonal trend and variability (over three crop seasons) of 20 climatic variables of Dinajpur district for 1948-2004. A variety of exploratory data analysis (EDA) tools and different robust and nonrobust measures were used for the analyses. The rates of total rainfall were positive for all three seasons, but the residuals were nonnormal and/or nonstationary. The rates were significantly positive for average dry bulb temperature (+0.00655 °C yr⁻¹) but negative for total frequency of zero rainfall (-0.0846 days yr⁻¹) during the Kharif season, and significantly positive for average minimum temperature (+0.0175 °C yr⁻¹) but negative for temperature range (-0.0456 °C yr⁻¹) and maximum temperature (-0.0281 °C yr⁻¹) during the Rabi season. Historical climatic data need exploratory analysis, and classical analyses of such data require stronger justification because of outliers and the nonnormality and nonstationarity of residuals. DOI: http://dx.doi.org/10.3329/jesnr.v5i1.11549 J. Environ. Sci. & Natural Resources, 5(1): 19-28, 2012
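
The contrast between robust and nonrobust measures of a trend rate can be illustrated with a minimal sketch. The abstract does not say which estimators were used, so the choice here (an ordinary least-squares slope as the nonrobust measure and a Theil-Sen median-of-pairwise-slopes estimate as the robust one) is an assumption, and the series below is hypothetical, not the Dinajpur data.

```python
from statistics import mean, median


def ols_slope(years, values):
    """Ordinary least-squares slope: a nonrobust trend rate."""
    y_bar = mean(years)
    v_bar = mean(values)
    num = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


def theil_sen_slope(years, values):
    """Median of all pairwise slopes: a robust trend rate."""
    slopes = [
        (values[j] - values[i]) / (years[j] - years[i])
        for i in range(len(years))
        for j in range(i + 1, len(years))
    ]
    return median(slopes)


# Hypothetical series: a steady rise of 0.02 units per year plus one outlier.
years = list(range(1948, 1958))
values = [25.0 + 0.02 * (y - 1948) for y in years]
values[4] += 5.0  # a single anomalous year

ols = ols_slope(years, values)
robust = theil_sen_slope(years, values)
print(f"OLS slope: {ols:.4f}, Theil-Sen slope: {robust:.4f}")
```

In this hypothetical series a single outlier is enough to flip the sign of the least-squares slope, while the median-based estimate stays at the underlying rate. That sensitivity is the kind of problem the abstract raises for classical analyses of historical climatic data.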


2012 ◽  
Vol 5 (1) ◽  
pp. 89-98
Author(s):  
JA Syeda

An attempt was made to investigate the monthly variability and trend of 20 climatic variables of Dinajpur district for 1948-2004. A variety of exploratory data analysis (EDA) tools and different robust and nonrobust measures were used for the analyses. The positive rates of total rainfall were fairly high in February, April, May, October and December, where the residuals were nonnormal and stationary, but a significant positive rate was documented only in September (+4.22*). The rate of growth of average cloud was slightly negative for August (-0.0006) but significantly positive for the rest of the months. The rates of total frequency of insignificant rainfall were approximately significantly negative for April (-0.035) and significantly negative for May (-0.069*) and September (-0.079*); fairly high negative rates were also found in February and December, with nonnormal residuals. Historical climatic data need exploratory analysis, and classical analyses of such data require stronger justification because of outliers and the nonnormality and nonstationarity of residuals. DOI: http://dx.doi.org/10.3329/jesnr.v5i1.11559 J. Environ. Sci. & Natural Resources, 5(1): 89-98, 2012


Osmia ◽  
2021 ◽  
Vol 9 ◽  
pp. 15-24
Author(s):  
Bernhard Seifert

A new species of the thermophilous Tetramorium caespitum species complex, T. sibiricum n. sp., is described from the Central Siberian region near Ulan Ude, which has mean January temperatures of −24 °C. The new species is clearly separable from the related species T. indocile Santschi, 1927 and T. caespitum (Linnaeus, 1758) by exploratory data analyses of 35 phenotypic characters and by a discriminant analysis of seven phenotypic characters. A key to these three species, all of which might occur in Central Siberia, is provided. The zoogeographic divide called the Reinig Line (De Lattin, 1967) is considered to be important in separating the ranges of Central and East Palaearctic ant species with weaker cold-hardiness. Based on images of type specimens, Tetramorium annectens Pisarski, 1969 is recognized as heterospecific from T. tsushimae Emery, 1925.


Author(s):  
Fabio Rigat

“What data will show the truth?” is a fundamental question emerging early in any empirical investigation. From a statistical perspective, experimental design is the appropriate tool to address this question by ensuring control of the error rates of planned data analyses and of the ensuing decisions. From an epistemological standpoint, planned data analyses describe in mathematical and algorithmic terms a pre-specified mapping of observations into decisions. The value of exploratory data analyses is often less clear, resulting in confusion about which characteristics of design and analysis are necessary for decision making and which may be useful to inspire new questions. This point is addressed here by illustrating the Popper-Miller theorem in plain terms with graphical support. Popper and Miller proved that probability estimates cannot generate hypotheses on behalf of investigators. Consistent with Popper-Miller, we show that probability estimation can only reduce uncertainty about the truth of a merely possible hypothesis. This fact clearly identifies exploratory analysis as one of the tools supporting a dynamic process of hypothesis generation and refinement which cannot be purely analytic. A clear understanding of these facts will enable stakeholders, mathematical modellers and data analysts to better engage on a level playing field when designing experiments and when interpreting the results of planned and exploratory data analyses.


2019 ◽  
Vol 66 (1) ◽  
pp. 55-61 ◽  
Author(s):  
Bernhard Seifert

A study of numeric morphology-based alpha-taxonomy (NUMOBAT) considering the species Formica exsecta Nylander, 1846 and F. fennica Seifert, 2000 was performed in 166 nest samples with 485 worker individuals originating from 117 localities of the Palaearctic west of 59°E. The presence of intraspecific pilosity dimorphism is shown for F. exsecta. The setae-reduced phenotype, termed the Rubens morph, shows a frequency of about 25%, and the more abundant setae-rich phenotype, termed the Normal morph, one of 75%. The frequency of nests containing workers of both phenotypes is 15.5% in 58 samples from Denmark, Sweden, and Finland. Applying the DIMORPH test of Seifert (2016) on this territory, it is demonstrated that the association of Rubens and Normal phenotypes within the same nest cannot be interpreted as parabiosis of independent species (p=0.017) or as temporary (p=0.0004) and permanent (p=0.0001) socially parasitic association, whereas genetically mediated intraspecific dimorphism is most likely (p=0.659, all p data according to Fisher’s exact test). The Rubens morph of F. exsecta is phenotypically most similar to F. fennica but is safely separable by four different forms of exploratory data analyses using nest centroids (NC) as input data: NC-Ward, NC-part.hclust, NC-part.kmeans, and NC-NMDS-k-means. Data on zoogeography and the narrow climate niche indicate that F. fennica is unlikely to occur in Norway.
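
As a rough illustration of what using nest centroids as input data might look like, the sketch below averages each measured character over the workers of one nest sample. It assumes that a nest centroid is the per-character arithmetic mean, which the abstract does not state. The function name, nest labels and measurements are all hypothetical, and the clustering step itself is not shown.

```python
from collections import defaultdict
from statistics import mean


def nest_centroids(workers):
    """Average every character over the workers belonging to the same nest.

    `workers` is a list of (nest_id, [character values]) pairs; the result
    maps each nest_id to a list of per-character means.
    """
    by_nest = defaultdict(list)
    for nest_id, characters in workers:
        by_nest[nest_id].append(characters)
    return {
        nest_id: [mean(column) for column in zip(*rows)]
        for nest_id, rows in by_nest.items()
    }


# Hypothetical measurements: two characters per worker, two nest samples.
workers = [
    ("nest_A", [1.0, 10.0]),
    ("nest_A", [3.0, 14.0]),
    ("nest_B", [2.0, 8.0]),
]
centroids = nest_centroids(workers)
print(centroids)
```

Each nest then contributes a single row to any subsequent exploratory analysis, so worker-level variation within a nest is averaged out before clustering.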

