Visualizations with statistical details: The 'ggstatsplot' approach

2021
Author(s): Indrajeet Patil

Graphical displays can reveal problems in a statistical model that might not be apparent from purely numerical summaries. Such visualizations also help readers evaluate the validity of a model when the analysis is reported in a scholarly publication or report. However, given the onerous costs involved, researchers may avoid preparing information-rich graphics and exploring the several statistical approaches and tests available. The `ggstatsplot` package for the R programming language provides a one-line syntax for creating densely informative `ggplot2`-based visualizations with the results of the statistical analysis embedded in the visualization itself. In doing so, the package helps researchers adopt a rigorous, reliable, and robust workflow for data exploration and reporting.

2019, Vol 45 (2), pp. 57-84
Author(s): Polina Lemenkova

This paper introduces an application of the R programming language to geostatistical data processing, with a case study of the Mariana Trench, Pacific Ocean. The formation of the Mariana Trench, the deepest of all hadal oceanic trenches, is driven by complex and diverse geomorphic factors. The Mariana Trench crosses four tectonic plates: Mariana, Caroline, Pacific and Philippine. The impact of geographic location and geological factors on its geomorphology was studied by statistical analysis and data visualization using R libraries. The methodology includes the following steps. First, vector thematic data on tectonics, bathymetry, geomorphology and geology were processed in QGIS. Second, 25 cross-section profiles were drawn across the trench, each 1000 km long. The attribute information was derived from each profile and stored in a table containing coordinates, depths and thematic information. Finally, this table was processed by statistical analysis in R. The programming code and graphical results are presented. The results include a geospatial comparative analysis and estimated effects of the data distribution by tectonic plate: slope angle, igneous volcanic areas and depths. The novelty of this paper lies in its cross-disciplinary approach combining GIS, statistical analysis and R programming.


2019, Vol 23 (1), pp. 36-39
Author(s): Lampros Intzes, Zoi-Despoina Tzima, Christos Gogos

Summary. Background/Aim: The present study examined the resistance to cyclic fatigue of three different rotary Ni-Ti instruments: K3XF (Kerr, Orange, CA), HyFlex CM (Coltene/Whaledent, Altstätten, Switzerland) and X7 EdgeFile (EdgeEndo, Albuquerque, New Mexico). Material and Methods: Thirty instruments (n=30) of each type, with tip size 25 and 0.04 taper, were used. All instruments were constrained to 60° of curvature with a radius of 5 mm by means of two grooved stainless steel rods and rotated at a speed of 300 rpm and 3.0 Ncm of torque. The time until separation was recorded for each instrument and the number of cycles to fracture (NCF) was calculated. Statistical analysis was performed using the R programming language. Results: The X7 EdgeFile instrument showed significantly greater resistance to cyclic fatigue than the HyFlex CM and the K3XF, with mean NCF of 1046 ± 311, 707 ± 219 and 360 ± 96, respectively. HyFlex CM performed significantly better than K3XF. Conclusions: The X7 EdgeFile Ni-Ti file appears to be significantly more resistant to fracture due to flexural fatigue than the HyFlex CM and the K3XF.
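The abstract does not state how NCF was derived from the recorded time. A common convention, assumed here, is to multiply the rotational speed by the time to separation. The short Python sketch below (not the authors' R code) illustrates that arithmetic; the function name and the time value are hypothetical.

```python
def cycles_to_fracture(rpm: float, seconds_to_fracture: float) -> float:
    """Number of cycles to fracture: revolutions per second times elapsed seconds."""
    return rpm * seconds_to_fracture / 60.0


# 300 rpm is the speed reported in the abstract; 72 s is a hypothetical
# time to separation chosen only to illustrate the arithmetic.
example_ncf = cycles_to_fracture(rpm=300, seconds_to_fracture=72.0)
print(example_ncf)
```

Under this assumption, an instrument that separates after 72 s at 300 rpm has completed 360 cycles.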




Author(s): Ramin Nabizadeh, Mostafa Hadei

Introduction: The wide range of studies on air pollution requires accurate and reliable datasets. However, for many reasons, the measured concentrations may be incomplete or biased. Researchers need an easy-to-use and reproducible exposure assessment method. Therefore, in this article, we describe and present a series of codes written in the R programming language for handling, validating and averaging PM10, PM2.5, and O3 datasets. Findings: These codes can be used in any type of air pollution study that needs PM and ozone concentrations that are indicative of real concentrations. We combined criteria from several guidelines proposed by the US EPA and the APHEKOM project to obtain an acceptable methodology. Separate .csv files for PM10, PM2.5 and O3 should be prepared as input files. After a file is imported into R, negative and zero concentration values are first removed from the whole dataset. Then, only monitors that have at least 75% of hourly concentrations are selected. Next, 24-h averages are calculated for PM and daily maximum 8-h moving averages for ozone. As output, the codes create two datasets. One contains the hourly concentrations of the pollutant of interest (PM10, PM2.5, or O3) at the valid stations and their city-level average. The other contains the final city-level 24-h averages for PM10 and PM2.5, or the final city-level daily maximum 8-h averages for O3. Conclusion: These validated codes use a reliable and valid methodology and eliminate the possibility of mistaken data handling and averaging. The codes are free to use without limitation, provided that this article is cited.
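The processing steps described in this abstract can be sketched roughly as follows. This is a minimal Python illustration, not the authors' R code. All function names are hypothetical. It also makes three assumptions that the abstract does not confirm: the 75% completeness rule is applied to a monitor's whole hourly series, an 8-h window is averaged over whichever valid values it contains, and 8-h windows do not cross day boundaries.

```python
from statistics import mean


def drop_invalid(hourly):
    """Replace missing, zero and negative concentrations with None."""
    return [v if v is not None and v > 0 else None for v in hourly]


def is_valid_monitor(hourly, min_fraction=0.75):
    """True when at least min_fraction of the hourly slots hold a value."""
    present = sum(v is not None for v in hourly)
    return len(hourly) > 0 and present >= min_fraction * len(hourly)


def average_24h(day):
    """Mean of the valid hourly values in one day (used for PM10 and PM2.5)."""
    valid = [v for v in day if v is not None]
    return mean(valid) if valid else None


def daily_max_8h(day, window=8):
    """Largest 8-h moving average within one day (used for O3)."""
    best = None
    for start in range(len(day) - window + 1):
        valid = [v for v in day[start:start + window] if v is not None]
        if valid:  # assumption: average whatever valid values the window holds
            avg = mean(valid)
            if best is None or avg > best:
                best = avg
    return best


def city_average(station_values):
    """City-level value: mean across the stations that produced a value."""
    valid = [v for v in station_values if v is not None]
    return mean(valid) if valid else None


# Hypothetical one-day series, for illustration only.
pm_day = drop_invalid([10.0] * 20 + [-1.0, 0.0, None, 31.0])
o3_day = drop_invalid([20.0] * 8 + [40.0] * 8 + [20.0] * 8)
sparse_day = drop_invalid([5.0] * 10 + [0.0] * 14)

pm_valid = is_valid_monitor(pm_day)
pm_mean = average_24h(pm_day)
o3_max = daily_max_8h(o3_day)
sparse_valid = is_valid_monitor(sparse_day)
```

In this example the PM monitor keeps 21 of its 24 hourly values and so passes the completeness check, while the sparse monitor keeps only 10 and would be excluded.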


2021, Vol 13 (1), pp. 15
Author(s): Junior Pastor Pérez-Molina, Carola Scholz, Roy Pérez-Salazar, Carolina Alfaro-Chinchilla, Ana Abarca Méndez, ...

Introduction: Wastewater treatment systems such as constructed wetlands have attracted growing interest over the last decade because of their low cost and high effectiveness in treating industrial and residential wastewater. Objective: To evaluate the spatial variation of physicochemical parameters in a subsurface-flow constructed wetland system planted with Pennisetum alopecuroides (Pennisetum) and in an unplanted Control. The purpose is to provide an analysis of the spatial dynamics of physicochemical parameters using the R programming language. Methods: Each of the cells (Pennisetum and Control) had 12 piezometers, organized in three columns and four rows with separation distances of 3.25 m and 4.35 m, respectively. Turbidity, biochemical oxygen demand (BOD), chemical oxygen demand (COD), total Kjeldahl nitrogen (TKN), ammoniacal nitrogen (N-NH4), organic nitrogen (N-org.) and phosphorus (P-PO4-3) were measured in the in-flow and out-flow water of both conditions, Control and Pennisetum (n = 8). Additionally, the oxidation-reduction potential (ORP), dissolved oxygen (DO), conductivity, pH and water temperature were measured in the piezometers (n = 167). Results: No statistically significant differences between cells were found for TKN, N-NH4, conductivity, turbidity, BOD, and COD, but both Control and Pennisetum cells showed a significant reduction in these parameters (P<0.05). Overall, TKN and N-NH4 removal ranged from 65.8 to 84.1% and from 67.5 to 90.8%, respectively, and the decreases in turbidity, conductivity, BOD, and COD were 95.1-95.4%, 15-22.4%, 65.2-77.9% and 57.4-60.3%, respectively. Both cells showed an increasing ORP gradient along the water-flow direction, in contrast to conductivity (P<0.05). However, DO, pH and temperature showed no consistent trend along the direction of water flow in either cell.
Conclusions: Pennisetum demonstrated pollutant removal efficiency, but its results were similar to those of the Control cells, so it remains unclear whether it is a superior option. Spatial variation analysis did not reflect any obstruction of flow along the CWs, but some preferential flow paths can be distinguished. An open-source repository of R code was provided.
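The abstract reports removal percentages without stating the formula used. A usual definition, assumed here, is the relative drop in concentration between in-flow and out-flow. The Python sketch below (not the authors' R code) shows that calculation; the function name and the concentrations are hypothetical.

```python
def removal_efficiency(inflow: float, outflow: float) -> float:
    """Percentage of a pollutant removed between in-flow and out-flow."""
    if inflow <= 0:
        raise ValueError("in-flow concentration must be positive")
    return (inflow - outflow) / inflow * 100.0


# Hypothetical concentrations in mg/L, for illustration only.
example_removal = removal_efficiency(inflow=100.0, outflow=20.0)
print(example_removal)
```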


2020, Vol 4 (9)
Author(s): Megan Wang

Basketball has existed for almost 130 years and has become one of the most popular sports worldwide, affecting millions of lives and featuring national and global tournaments. As public interest in and enthusiasm for competitive sports grow, the role of sports analytics will become more prominent. Hence, this paper combines relevant statistical knowledge with typical cases from NBA competition to illustrate the application of statistics in competitive sports. The paper first examines the importance of the normal distribution (also called the Gaussian distribution) in statistics through its probability density function and the graph of that function. The function has two parameters: the mean, which locates the maximum, and the standard deviation, which sets the spread around the mean [1]. By compiling datasets of the past performances of teams and individual players and calculating their means and standard deviations, the paper constructs normal distribution graphs using the R programming language. Finally, the paper examines the Real Plus-Minus value and its importance in basketball.
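The role of the two parameters can be made concrete with a small sketch of the normal probability density function, f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)). The Python code below is an illustration only, not the paper's R code. The scores are hypothetical, and using the sample standard deviation is an assumption.

```python
import math
from statistics import mean, stdev


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical points-per-game values, for illustration only.
points = [18.0, 22.0, 25.0, 27.0, 33.0]
mu = mean(points)      # centre of the curve
sigma = stdev(points)  # sample standard deviation: width of the curve

peak = normal_pdf(mu, mu, sigma)                 # the density is largest at the mean
one_sd_away = normal_pdf(mu + sigma, mu, sigma)  # lower density one sigma from the mean
```

Evaluating `normal_pdf` over a grid of x values gives the points needed to draw the bell curve for a team or player.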

