Using Inquiry-Based Methods to Teach Spatial Statistics

2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Andrew J. Gregory ◽  
Emma S. Spence

Spatial statistics and experimental design are among the most important topics that students in the environmental and ecological sciences learn and use throughout their careers. They are also among the most difficult to learn, often because contrived data sets present simplified, unrealistic scenarios that fail to engage students in higher-level thinking. One way to engage students in higher-level thinking is an inquiry-based pedagogical framework. Inquiry as a pedagogical approach should be instinctive for most scientists, since it mimics how science is conducted, yet most instructors continue to use lecture-based, textbook-driven instruction. That approach is efficient at covering material but poor at engaging students or enhancing learning. Using a Bigfoot data set within an inquiry-based framework, students in a cross-listed graduate/undergraduate statistics class learned ordinary least squares regression and geographically weighted regression, two of the most frequently applied analyses in the natural sciences. The Bigfoot data set captured students’ interest, so that learning regression emerged as a natural outgrowth of their engagement. The approach has an added benefit: students learn not only key statistical concepts but also how to self-diagnose deficiencies in their models and identify strategies to overcome them. We hope that instructors and students in graduate and undergraduate statistics or spatial modeling courses find this case study, and the included data sets, a useful and interesting way to teach and learn regression and spatial regression.
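
As a taste of the second technique, here is a minimal numpy sketch of geographically weighted regression: a separate weighted least squares fit at each location, with Gaussian kernel weights that decay with distance. The synthetic coordinates and bandwidth are illustrative assumptions standing in for the paper's Bigfoot sighting data, not the authors' setup.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Geographically weighted regression: fit a weighted least squares
    model at each location, with Gaussian kernel weights that decay with
    distance from that location."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])              # add intercept
    betas = np.empty((n, X1.shape[1]))
    for i in range(n):
        d = np.linalg.norm(coords - coords[i], axis=1)  # distances to point i
        w = np.exp(-0.5 * (d / bandwidth) ** 2)         # Gaussian kernel weights
        W = np.diag(w)
        # Local weighted least squares: (X'WX)^-1 X'Wy
        betas[i] = np.linalg.solve(X1.T @ W @ X1, X1.T @ W @ y)
    return betas

# Synthetic illustration: a slope that varies smoothly across space.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
true_slope = 1.0 + 0.3 * coords[:, 0]                  # slope drifts west-to-east
y = true_slope * x + rng.normal(scale=0.5, size=200)

local = gwr_coefficients(coords, x.reshape(-1, 1), y, bandwidth=2.0)
global_beta = np.linalg.lstsq(np.column_stack([np.ones(200), x]), y, rcond=None)[0]
print("OLS (global) slope:", global_beta[1])
print("GWR slope range:", local[:, 1].min(), "to", local[:, 1].max())
```

A single OLS slope averages over the spatial variation that the local GWR fits recover, which is exactly the deficiency students can be led to self-diagnose.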

2009 ◽  
Vol 2009 ◽  
pp. 1-8 ◽  
Author(s):  
Janet Myhre ◽  
Daniel R. Jeske ◽  
Michael Rennie ◽  
Yingtao Bi

A heteroscedastic linear regression model is developed from plausible assumptions that describe the time evolution of performance metrics for equipment. The motivation for the associated weighted least squares analysis, inherited from these assumptions, is an essential and attractive selling point for engineers interested in equipment surveillance methodologies. A simple test for the significance of the heteroscedasticity suggested by a data set is derived, and a simulation study is used to evaluate the power of the test and compare it with several other applicable tests that were designed in different contexts. Tolerance intervals within the context of the model are derived, thus generalizing well-known tolerance intervals for ordinary least squares regression. Use of the model and its associated analyses is illustrated with an aerospace application where hundreds of electronic components are continuously monitored by an automated system that flags components suspected of unusual degradation patterns.
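
The paper derives its own significance test; as a stand-in, the sketch below applies the standard Breusch-Pagan test from statsmodels and then refits by weighted least squares under an assumed variance function. The degradation data and variance model are invented for illustration.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Synthetic degradation data: noise standard deviation grows with time,
# mimicking performance metrics that become more variable as equipment ages.
rng = np.random.default_rng(1)
t = np.linspace(0, 10, 300)
y = 2.0 + 0.5 * t + rng.normal(scale=0.2 * (1 + t), size=t.size)

X = sm.add_constant(t)
ols = sm.OLS(y, X).fit()

# Breusch-Pagan regresses squared residuals on the explanatory variables.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(ols.resid, X)
print(f"Breusch-Pagan LM p-value: {lm_pvalue:.4g}")

# If heteroscedasticity is significant, refit with weights 1/sigma_i^2,
# here taken from the assumed linear growth of sigma with time.
sigma_hat = 0.2 * (1 + t)
wls = sm.WLS(y, X, weights=1.0 / sigma_hat**2).fit()
print("WLS coefficients:", wls.params)
```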


2019 ◽  
Vol 80 (1) ◽  
pp. 38-50
Author(s):  
Kozo Harimaya ◽  
Koichi Kagitani

Purpose
The purpose of this paper is to investigate the efficiency of the banking business of Japan’s agricultural cooperatives (JAs), which depend heavily on financial business with non-farmers, contrary to cooperative principles.

Design/methodology/approach
The authors construct a panel data set over 2005–2016 from the financial statements of JAs’ prefectural-level federations and use an input distance stochastic frontier model with a time-variant inefficiency effect. Both flow and stock measures of banking output are used in identical models and the resulting efficiency estimates are compared. The authors also investigate the determinants of efficiency using Tobit and ordinary least squares regression models.

Findings
There is strong evidence of significant prefectural differences in efficiency values. The ratio of lending to non-members to total loans is positively related to efficiency. In contrast, higher reliance on a central organization and on the credit business leads to lower efficiency.

Research limitations/implications
Apart from banking, JAs provide mutual insurance business services. As only the efficiency of JAs’ banking business is investigated in this study, the efficiency of their insurance business would also need to be examined when evaluating JAs’ overall financial business.

Originality/value
Few studies investigate the efficiency of JAs’ banking business and its determinants, although significant attention has been paid to their excessive dependence on the financial business.
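
As a rough illustration of the second-stage analysis, the sketch below fits a Tobit model (efficiency scores censored above at 1) by maximum likelihood with scipy. The regressors and data are hypothetical, and this is not the authors' stochastic frontier estimation itself.

```python
import numpy as np
from scipy import optimize, stats

def tobit_negloglik(params, X, y, upper=1.0):
    """Negative log-likelihood of a Tobit model censored above at `upper`."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)                  # keep sigma positive
    mu = X @ beta
    cens = y >= upper
    ll = np.sum(stats.norm.logpdf((y[~cens] - mu[~cens]) / sigma) - log_sigma)
    ll += np.sum(stats.norm.logsf((upper - mu[cens]) / sigma))
    return -ll

# Hypothetical second stage: efficiency scores regressed on the
# non-member lending ratio and reliance on the central organization.
rng = np.random.default_rng(2)
n = 150
nonmember_ratio = rng.uniform(0, 0.5, n)
central_reliance = rng.uniform(0, 1, n)
X = np.column_stack([np.ones(n), nonmember_ratio, central_reliance])
latent = 0.85 + 0.4 * nonmember_ratio - 0.2 * central_reliance + rng.normal(0, 0.08, n)
eff = np.minimum(latent, 1.0)                  # efficiency cannot exceed 1

start = np.zeros(X.shape[1] + 1)
res = optimize.minimize(tobit_negloglik, start, args=(X, eff), method="BFGS")
print("Tobit coefficients:", res.x[:-1])
```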


Author(s):  
Pascalis Kadaro Matthew ◽  
Abubakar Yahaya

Over the past few decades, penalized regression techniques have been developed specifically to reduce the flaws inherent in the prediction accuracy of the classical ordinary least squares (OLS) regression technique. In this paper, we use a diabetes data set obtained from previous literature to compare three of these well-known techniques, namely the Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, and Correlation Adjusted Elastic Net (CAEN). After thorough analysis, we observed that CAEN generated a less complex model.
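
For readers who want to reproduce a comparison of this kind, the sketch below contrasts LASSO and Elastic Net on scikit-learn's built-in Efron et al. diabetes data, a plausible stand-in for the data set used here. CAEN has no scikit-learn implementation, so it is omitted; model complexity is measured here by the number of non-zero coefficients.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LassoCV, ElasticNetCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize so the penalty treats all predictors comparably.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

for name, model in [("LASSO", LassoCV(cv=10)),
                    ("Elastic Net", ElasticNetCV(cv=10, l1_ratio=[.1, .5, .9, .99]))]:
    model.fit(X_train, y_train)
    n_active = np.sum(model.coef_ != 0)    # complexity = active predictors
    print(f"{name}: R^2 = {model.score(X_test, y_test):.3f}, "
          f"{n_active} non-zero coefficients")
```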


Sensors ◽  
2020 ◽  
Vol 20 (4) ◽  
pp. 1065 ◽  
Author(s):  
Huihui Zhang ◽  
Wenqing Shao ◽  
Shanshan Qiu ◽  
Jun Wang ◽  
Zhenbo Wei

Aroma and taste are the most important attributes of alcoholic beverages. In this study, a self-developed electronic tongue (e-tongue) and electronic nose (e-nose) were used to evaluate the marked ages of rice wines. Six types of feature data sets (e-tongue, e-nose, direct-fusion, weighted-fusion, optimized direct-fusion, and optimized weighted-fusion) were used to identify rice wines of different ages. Pearson correlation analysis and variance inflation factor (VIF) analysis were used to optimize the fusion matrices by removing multicollinear information. Two discrimination methods, principal component analysis (PCA) and locality preserving projections (LPP), were used to classify the rice wines, and LPP performed better than PCA. The best result was obtained by LPP on the weighted-fusion data set, for which all samples were classified clearly in the LPP plot. The weighted-fusion data were therefore used as independent variables in partial least squares regression, extreme learning machine, and support vector machine (LIBSVM) models for predicting wine age. All three methods performed well, with LIBSVM presenting the best correlation coefficient (R² ≥ 0.9998).
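
A minimal sketch of the VIF-based optimization step, using a synthetic sensor matrix in place of the real e-tongue/e-nose features: channels are pruned until all VIFs fall below a threshold, then partial least squares regression is fitted on what remains. The threshold of 10 is a common rule of thumb, not necessarily the authors' choice.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from statsmodels.stats.outliers_influence import variance_inflation_factor

def drop_collinear(X, threshold=10.0):
    """Iteratively drop the column with the highest VIF until all VIFs
    fall below the threshold, removing multicollinear sensor channels."""
    cols = list(range(X.shape[1]))
    while len(cols) > 1:
        vifs = [variance_inflation_factor(X[:, cols], i) for i in range(len(cols))]
        worst = int(np.argmax(vifs))
        if vifs[worst] < threshold:
            break
        cols.pop(worst)
    return cols

# Synthetic stand-in for a fused e-tongue/e-nose feature matrix.
rng = np.random.default_rng(3)
X = rng.normal(size=(60, 8))
X[:, 7] = X[:, 0] + 0.01 * rng.normal(size=60)   # near-duplicate channel
age = X[:, :3] @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.1, size=60)

kept = drop_collinear(X)
pls = PLSRegression(n_components=3).fit(X[:, kept], age)
print("kept channels:", kept, " R^2:", pls.score(X[:, kept], age))
```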


2012 ◽  
Vol 12 (18) ◽  
pp. 8851-8864 ◽  
Author(s):  
F. Hendrick ◽  
E. Mahieu ◽  
G. E. Bodeker ◽  
K. F. Boersma ◽  
M. P. Chipperfield ◽  
...  

Abstract. The trend in stratospheric NO2 column at the NDACC (Network for the Detection of Atmospheric Composition Change) station of Jungfraujoch (46.5° N, 8.0° E) is assessed using ground-based FTIR and zenith-scattered visible sunlight SAOZ measurements over the period 1990 to 2009, as well as a composite satellite nadir data set constructed from ERS-2/GOME, ENVISAT/SCIAMACHY, and METOP-A/GOME-2 observations over the 1996–2009 period. To calculate the trends, a linear least squares regression model including explanatory variables for a linear trend, the mean annual cycle, the quasi-biennial oscillation (QBO), solar activity, and stratospheric aerosol loading is used. For the 1990–2009 period, statistically indistinguishable trends of −3.7 ± 1.1% decade⁻¹ and −3.6 ± 0.9% decade⁻¹ are derived for the SAOZ and FTIR NO2 column time series, respectively. SAOZ, FTIR, and satellite nadir data sets show a similar decrease over the 1996–2009 period, with trends of −2.4 ± 1.1% decade⁻¹, −4.3 ± 1.4% decade⁻¹, and −3.6 ± 2.2% decade⁻¹, respectively. The fact that these declines are opposite in sign to the globally observed +2.5% decade⁻¹ trend in N2O suggests that factors other than N2O are driving the evolution of stratospheric NO2 at northern mid-latitudes. Possible causes of the decrease in stratospheric NO2 columns have been investigated. The most likely cause is a change in the NO2/NO partitioning in favor of NO, due to a possible stratospheric cooling and a decrease in stratospheric chlorine content, the latter being further confirmed by the negative trend in the ClONO2 column derived from FTIR observations at Jungfraujoch. Decreasing ClO concentrations slow the NO + ClO → NO2 + Cl reaction, and a stratospheric cooling slows the NO + O3 → NO2 + O2 reaction, leaving more NOx in the form of NO. The slightly positive trends in ozone estimated from ground- and satellite-based data sets are also consistent with the decrease of NO2 through the NO2 + O3 → NO3 + O2 reaction. Finally, we cannot rule out the possibility that a strengthening of the Brewer-Dobson circulation, which reduces the time available for N2O photolysis in the stratosphere, could also contribute to the observed decline in stratospheric NO2 above Jungfraujoch.
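
A minimal sketch of such a trend regression, with placeholder QBO, solar, and aerosol proxies (real proxies would come from external records) and a synthetic NO2 series built to contain a −3.6% decade⁻¹ trend:

```python
import numpy as np

# Design matrix: linear trend, annual-cycle harmonics, and proxy series.
rng = np.random.default_rng(4)
t = np.arange(1990, 2010, 1 / 12)                   # monthly axis, 1990-2009
qbo, solar, aerosol = (rng.normal(size=t.size) for _ in range(3))

no2 = (3.0e15 * (1 - 0.0036 * (t - t[0]) / 10)      # -3.6 %/decade trend
       + 4e14 * np.cos(2 * np.pi * t)               # annual cycle
       + rng.normal(scale=1e14, size=t.size))       # measurement noise

X = np.column_stack([
    np.ones_like(t), t - t[0],
    np.cos(2 * np.pi * t), np.sin(2 * np.pi * t),   # mean annual cycle
    qbo, solar, aerosol,
])
beta, *_ = np.linalg.lstsq(X, no2, rcond=None)
trend_pct_per_decade = 100 * beta[1] * 10 / beta[0]
print(f"fitted trend: {trend_pct_per_decade:+.1f} % per decade")
```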


1993 ◽  
Vol 20 (1) ◽  
pp. 133-143 ◽  
Author(s):  
David Hansen ◽  
Dale I. Bray

Sediment rating curves in conjunction with daily flow data have often been used to estimate the total mass of sediment flowing past a given river cross section over relatively long periods of time. Techniques are presented that seek to make the best use of limited noncontinuous suspended sediment concentration data to generate nine partial years of suspended sediment load by means of sediment rating curves for the Kennebecasis River, N.B. (drainage area of 1100 km²). Initially, the data were partitioned in an attempt to improve correlations between concentration and discharge. Such partitioning by season, month, periods of rising stage, and periods of falling stage did not uniformly improve correlations as compared with the correlations for nonpartitioned data. Various combinations of less well-known methods were then used, including a moving intercept method that makes greater use of point concentration observations in time, and correction factor methods for simple power-type relations as suggested by Ferguson and by Duan. In addition, the validity of some of the underlying assumptions for performing ordinary least-squares regression is examined for this data set. Finally, the effect of daily flow averaging on the computed load was examined and found to be small for this basin. Key words: suspended sediment, C–Q rating curves, flow averaging, washload estimates, statistical bias, regression estimates.
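
A sketch of the rating-curve workflow under stated assumptions: a power-law relation C = aQ^b fitted in log space on synthetic data, with Duan's smearing factor and Ferguson's parametric correction applied to the back-transform to reduce statistical bias.

```python
import numpy as np

rng = np.random.default_rng(5)
Q = rng.lognormal(mean=3.0, sigma=0.8, size=120)          # discharge, m³/s
C = 0.05 * Q**1.4 * rng.lognormal(sigma=0.5, size=120)    # concentration, mg/L

logQ, logC = np.log(Q), np.log(C)
b, log_a = np.polyfit(logQ, logC, 1)                      # slope, intercept
resid = logC - (log_a + b * logQ)

duan = np.mean(np.exp(resid))                  # Duan's smearing factor
ferguson = np.exp(np.var(resid, ddof=2) / 2)   # parametric alternative (natural logs)

def predict_concentration(q, correction=duan):
    """Bias-corrected rating curve: correction * a * q^b."""
    return correction * np.exp(log_a) * q**b

print(f"b = {b:.2f}, Duan factor = {duan:.3f}, Ferguson factor = {ferguson:.3f}")
```

Without a correction factor, back-transforming the log-space fit systematically underestimates concentration, and hence the long-term load.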


2001 ◽  
Vol 44 (2) ◽  
pp. 446-461 ◽  
Author(s):  
Jennifer Windsor ◽  
Rochelle L. Milbrath ◽  
Edward J. Carney ◽  
Susan E. Rakowski

Although the general slowing hypothesis of language impairment (LI) is well established, the conventional method to test the hypothesis is controversial. This paper compares the usual method, ordinary least squares regression (OLS), with another method, hierarchical linear modeling with random coefficients (HLM). The analyses used available response time (RT) data from studies of perceptual-motor, cognitive, and language skills of LI and chronological-age-matched (CA) groups. The data set included RT measures from 25 studies investigating 20 different tasks (e.g., auditory detection, mental rotation, and word recognition tasks). OLS and HLM analyses of the RT data yielded very different results. OLS supported general slowing for the LI groups, and indicated that they were significantly slower than CA groups across studies by an overall estimate of 10%. HLM indicated a larger average extent of LI slowing (18%). However, the variability around this average was much greater than that yielded by OLS, and the extent of slowing was not statistically significant. Importantly, HLM showed a significant difference in the RT relation between LI and CA groups across studies, indicating that study-specific slowing, rather than general slowing across studies, was present. A separate HLM analysis of two types of language tasks, picture naming and word recognition, was performed. Although the extent of slowing was equivalent across these tasks, the slowing was minimal (2%) and not significant. Methodological limitations of each analysis to assess general slowing are highlighted.
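
To make the contrast concrete, here is a hedged sketch using statsmodels: OLS pools all task means, while a mixed model with random intercepts and slopes by study (a random-coefficient HLM) lets the LI-on-CA slope vary across studies. The data are simulated with study-specific slowing, not the paper's RT measures.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format RT data: one row per task mean, with the CA
# group's RT predicting the LI group's RT, grouped by study.
rng = np.random.default_rng(6)
rows = []
for study in range(25):
    slope = rng.normal(1.10, 0.15)             # study-specific slowing
    for _ in range(8):
        rt_ca = rng.uniform(300, 1500)
        rows.append({"study": study, "rt_ca": rt_ca,
                     "rt_li": slope * rt_ca + rng.normal(0, 40)})
data = pd.DataFrame(rows)

ols = smf.ols("rt_li ~ rt_ca", data).fit()
hlm = smf.mixedlm("rt_li ~ rt_ca", data, groups="study",
                  re_formula="~rt_ca").fit()
print("OLS slope:", ols.params["rt_ca"])
print("HLM mean slope:", hlm.params["rt_ca"])
print("Random-effect (co)variances:\n", hlm.cov_re)
```

A non-trivial slope variance in `cov_re` is the signature of study-specific, rather than uniform general, slowing.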


2010 ◽  
Vol 62 (4) ◽  
pp. 875-882 ◽  
Author(s):  
A. Dembélé ◽  
J.-L. Bertrand-Krajewski ◽  
B. Barillon

Regression models are among the most frequently used models to estimate pollutant event mean concentrations (EMC) in wet weather discharges in urban catchments. Two main questions concerning the calibration of EMC regression models are investigated: i) the sensitivity of models to the size and content of the data sets used for their calibration, and ii) how modelling results change when models are re-calibrated as data sets grow and evolve over time with newly collected experimental data. Based on an experimental data set of 64 rain events monitored in a densely urbanised catchment, four TSS EMC regression models (two log-linear and two linear models) with two or three explanatory variables were derived and analysed. Model calibration with the iteratively re-weighted least squares method is less sensitive and leads to more robust results than the ordinary least squares method. Three calibration options were investigated: two accounting for the chronological order of the observations, and one using random samples of events from the whole available data set. Results obtained with the best-performing non-linear model clearly indicate that the model is highly sensitive to the size and content of the data set used for its calibration.
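
A minimal sketch of the calibration comparison, assuming a synthetic log-linear EMC relation with a few outlying events: statsmodels' robust linear model, which is fitted by iteratively re-weighted least squares with a Huber norm, stands in here for the authors' IRLS calibration.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic event data: 64 rain events, log-linear TSS EMC relation.
rng = np.random.default_rng(7)
rain_depth = rng.uniform(1, 40, 64)                 # mm per event
intensity = rng.uniform(0.5, 15, 64)                # mm/h
log_emc = (3.0 + 0.4 * np.log(rain_depth) - 0.2 * np.log(intensity)
           + rng.normal(scale=0.15, size=64))
log_emc[:4] += 1.5                                  # outlying events

X = sm.add_constant(np.column_stack([np.log(rain_depth), np.log(intensity)]))
ols = sm.OLS(log_emc, X).fit()
irls = sm.RLM(log_emc, X, M=sm.robust.norms.HuberT()).fit()
print("OLS coefficients: ", ols.params)
print("IRLS coefficients:", irls.params)
```

The IRLS fit downweights the outlying events, so its coefficients drift far less than OLS when a few unusual events enter or leave the calibration set.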


1993 ◽  
Vol 25 (8) ◽  
pp. 1201-1209 ◽  
Author(s):  
P J Boyle ◽  
R Flowerdew

Recently, it has been argued that migration models in which ordinary least squares regression is used are inappropriate, because the dependent variable (number of migrants) is a count. Instead, a Poisson regression approach can be adopted. Goodness of fit can be evaluated by using a deviance (log-likelihood) or X² statistic, whose significance can be compared with a χ² distribution with appropriate degrees of freedom. In this paper, such an approach is used to model ward-level migration flows within the county of Hereford and Worcester. However, it is shown that for this exceedingly sparse data set the deviance figures attained are very low, suggesting that there may be a problem of underdispersion. This is in contrast to the overdispersion which has been identified as a common problem in Poisson models. The low deviance figures arise from the large number of zeros and small flows in the data matrix, which invalidate the usual χ² goodness-of-fit test. A simulation approach to the assessment of model goodness of fit is suggested, and the results from applying it to the Hereford and Worcester data set are described.
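
A sketch of the suggested simulation approach, on invented sparse flows: fit a Poisson regression, then compare the observed deviance against deviances from replicate data sets drawn from the fitted model, rather than against the χ² approximation.

```python
import numpy as np
import statsmodels.api as sm

# Invented sparse flow counts: many zeros and small values, as in the
# ward-level migration matrix.
rng = np.random.default_rng(8)
n = 400
log_dist = rng.uniform(0, 3, n)                     # e.g. log distance between wards
X = sm.add_constant(log_dist)
flows = rng.poisson(np.exp(0.2 - 0.9 * log_dist))

fit = sm.GLM(flows, X, family=sm.families.Poisson()).fit()
observed_deviance = fit.deviance

# Simulate replicate data sets from the fitted means and refit each,
# building a reference distribution for the deviance.
sim_deviances = []
for _ in range(200):
    y_rep = rng.poisson(fit.mu)
    rep_fit = sm.GLM(y_rep, X, family=sm.families.Poisson()).fit()
    sim_deviances.append(rep_fit.deviance)

p = np.mean(np.array(sim_deviances) >= observed_deviance)
print(f"deviance = {observed_deviance:.1f}, simulation p-value = {p:.3f}")
```

With sparse counts the simulated deviances sit well below the nominal χ² quantiles, which is exactly why the simulation reference, not the asymptotic distribution, should judge the fit.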

