Data Science and Predictive Analytics: Biomedical and Health Applications Using R

Benjamin H. Saracco

doi:10.5195/jmla.2020.901

Data Science and Predictive Analytics: Biomedical and Health Applications Using R

Journal of the Medical Library Association JMLA ◽

10.5195/jmla.2020.901 ◽

2020 ◽

Vol 108 (2) ◽

pp. 334

Author(s):

Benjamin H. Saracco

Keyword(s):

Programming Language ◽

Data Science ◽

Predictive Analytics ◽

Health Sciences ◽

Online Course ◽

Data Sets ◽

R Programming Language ◽

R Programming ◽

Health Applications ◽

Analyze Data

Ivo D. Dinov’s Data Science and Predictive Analytics: Biomedical and Health Applications Using R is a comprehensive twenty-three-chapter text and online course for burgeoning or seasoned biomedical and/or health sciences professionals who analyze data sets using the R programming language.

An Optimised Method for Fetching and Transforming Survey Data based on SQL and R Programming Language

Baghdad Science Journal ◽

10.21123/bsj.2019.16.2(si).0436 ◽

2019 ◽

Vol 16 (2(SI)) ◽

pp. 0436

Author(s):

Hasan et al.

Keyword(s):

Programming Language ◽

Data Science ◽

Query Language ◽

Major Drawback ◽

Relational Models ◽

R Programming Language ◽

Survey System ◽

R Programming ◽

Improved Accuracy

The development of information systems in recent years has led to various methods of gathering information to evaluate information system (IS) performance. The most common approach is the survey system. This method, however, suffers from one major drawback: decision makers spend considerable time transferring data from survey sheets to analytical programs. This paper therefore proposes a method called the ‘survey algorithm based on R programming language’ (SABR), which transforms data from survey sheets inside R environments by treating the arrangement of data as a relational format. R and the relational data format provide an excellent opportunity to manage and analyse the accumulated data. Moreover, a survey system based on Structured Query Language (SQL) and the R programming language is designed to optimize the management of survey systems by combining the features offered by multiple data science languages. The experiments verified improvements in flexibility, technical tools, and data visualization features used to process the collected data from different aspects; the proposed approach is demonstrated through a simple case study that supports the evaluation of the technique. Finally, the results of this research can be used to improve information management in survey systems and in other data models, relational and non-relational, using SABR. The method demonstrated improved accuracy of the collected data, reduced data processing time, and arranged the data into the desired model.
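The abstract does not spell out how SABR works internally, so the following is only a minimal sketch of the general idea it names: reshaping a survey sheet into a relational format and then querying it with SQL. Python and its built-in sqlite3 module stand in here for R and the SQL back end; the sheet contents, the `answers` table, and all column names are hypothetical.

```python
import sqlite3

# Hypothetical survey sheet: one row per respondent, one column per question.
sheet = [
    {"respondent": 1, "q1": 4, "q2": 5},
    {"respondent": 2, "q1": 2, "q2": 3},
    {"respondent": 3, "q1": 3, "q2": 4},
]

# Reshape to relational (long) form: one row per respondent/question pair.
rows = [
    (record["respondent"], question, score)
    for record in sheet
    for question, score in record.items()
    if question != "respondent"
]

# Load the long-form rows into an in-memory relational table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answers (respondent INTEGER, question TEXT, score INTEGER)"
)
conn.executemany("INSERT INTO answers VALUES (?, ?, ?)", rows)

# With the data in relational form, a per-question summary is a single query.
summary = conn.execute(
    "SELECT question, AVG(score) FROM answers "
    "GROUP BY question ORDER BY question"
).fetchall()
conn.close()
```

Storing one answer per row means that adding a question to the survey adds rows rather than columns, which is one plausible reason a relational layout suits changing survey sheets.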

FuzzyR: An Extended Fuzzy Logic Toolbox for the R Programming Language

2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) ◽

10.1109/fuzz48607.2020.9177780 ◽

2020 ◽

Author(s):

Chao Chen ◽

Tajul Rosli Razak ◽

Jonathan M. Garibaldi

Keyword(s):

Fuzzy Logic ◽

Programming Language ◽

R Programming Language ◽

R Programming

Understanding the Behavior of Zadeh’s Extension Principle for One-to-One Functions by R Programming Language

Advances in Intelligent Systems and Computing - Intelligent and Fuzzy Techniques: Smart and Innovative Solutions ◽

10.1007/978-3-030-51156-2_153 ◽

2020 ◽

pp. 1309-1315

Author(s):

Abbas Parchami ◽

Parisa Khalilpoor

Keyword(s):

Programming Language ◽

Extension Principle ◽

R Programming Language ◽

One To One ◽

R Programming

Introduction to the R Programming Language

Handbook of Educational Measurement and Psychometrics Using R ◽

10.1201/b20498-1 ◽

2018 ◽

pp. 1-29

Author(s):

Christopher D. Desjardins ◽

Okan Bulut

Keyword(s):

Programming Language ◽

R Programming Language ◽

R Programming

Developing codes for validation of PM10, PM2.5, and O3 datasets using R programming language

Journal of Air Pollution and Health ◽

10.18502/japh.v4i1.604 ◽

2019 ◽

Author(s):

Ramin Nabizadeh ◽

Mostafa Hadei

Keyword(s):

Air Pollution ◽

Programming Language ◽

Assessment Method ◽

Daily Maximum ◽

Data Handling ◽

R Programming Language ◽

Wide Range ◽

US EPA ◽

R Programming ◽

PM10

Introduction: The wide range of studies on air pollution requires accurate and reliable datasets. However, for many reasons, the measured concentrations may be incomplete or biased. Researchers need an easy-to-use and reproducible exposure assessment method. Therefore, in this article, we describe and present a series of codes written in the R programming language for handling, validating, and averaging PM10, PM2.5, and O3 datasets. Findings: These codes can be used in any type of air pollution study that seeks PM and ozone concentrations that are indicative of real concentrations. We used and combined criteria from several guidelines proposed by the US EPA and the APHEKOM project to obtain an acceptable methodology. Separate .csv files for PM10, PM2.5, and O3 should be prepared as input files. After a file is imported into R, negative and zero concentration values are first removed from the whole dataset. Then, only monitors that have at least 75% of hourly concentrations are selected. Next, 24-h averages and daily maximums of 8-h moving averages are calculated for PM and ozone, respectively. As output, the codes create two different sets of data. One contains the hourly concentrations of the pollutant of interest (PM10, PM2.5, or O3) at valid stations and their average at city level. The other contains the final city-level 24-h averages for PM10 and PM2.5 or the final city-level daily maximum 8-h averages for O3. Conclusion: These validated codes use a reliable and valid methodology and eliminate the possibility of incorrect data handling and averaging. The codes are free to use without limitation, provided that this article is cited.
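The validation steps listed in this abstract can be sketched roughly as follows. This is not the authors' R code: it is an illustrative Python version under several assumptions, namely that the 75% completeness rule is applied to each station's whole series, that 8-h windows are taken within a calendar day, and that a window or day is averaged over whatever valid values it contains. All function and variable names are hypothetical.

```python
from statistics import mean

MIN_COVERAGE = 0.75  # share of valid hourly values required (from the abstract)


def clean(values):
    """Treat negative and zero concentrations as missing (None)."""
    return [v if (v is not None and v > 0) else None for v in values]


def valid_stations(hourly_by_station):
    """Keep stations whose cleaned series has at least 75% valid hours."""
    kept = {}
    for station, values in hourly_by_station.items():
        cleaned = clean(values)
        if not cleaned:
            continue  # an empty series cannot meet the coverage rule
        coverage = sum(v is not None for v in cleaned) / len(cleaned)
        if coverage >= MIN_COVERAGE:
            kept[station] = cleaned
    return kept


def city_hourly_mean(stations):
    """Average valid stations hour by hour; None where no station reports."""
    out = []
    for hour_values in zip(*stations.values()):
        present = [v for v in hour_values if v is not None]
        out.append(mean(present) if present else None)
    return out


def daily_mean_24h(hourly):
    """24-h averages (used for PM): one value per block of 24 hours."""
    days = []
    for start in range(0, len(hourly), 24):
        present = [v for v in hourly[start:start + 24] if v is not None]
        days.append(mean(present) if present else None)
    return days


def daily_max_8h(hourly, window=8):
    """Daily maximum of 8-h moving averages (used for ozone)."""
    days = []
    for start in range(0, len(hourly), 24):
        day = hourly[start:start + 24]
        averages = []
        for i in range(len(day) - window + 1):
            present = [v for v in day[i:i + window] if v is not None]
            if present:
                averages.append(mean(present))
        days.append(max(averages) if averages else None)
    return days


# Made-up example: three stations, one day of hourly values.
hourly = {
    "A": [10.0] * 24,
    "B": [20.0] * 18 + [0.0] * 6,   # 75% valid after cleaning: kept
    "C": [5.0] * 12 + [-1.0] * 12,  # 50% valid after cleaning: dropped
}
stations = valid_stations(hourly)
city = city_hourly_mean(stations)
pm_daily = daily_mean_24h(city)
o3_daily = daily_max_8h(city)
```

In the example, station C is excluded by the completeness rule, and the city-level series is then averaged either as a 24-h mean or as a daily maximum of 8-h means, mirroring the two outputs the abstract describes.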

Spatial variation of physicochemical parameters in a constructed wetland for wastewater treatment: An example of the use of the R programming language

UNED Research Journal ◽

10.22458/urj.v13i1.3294 ◽

2021 ◽

Vol 13 (1) ◽

pp. 15

Author(s):

Junior Pastor Pérez-Molina ◽

Carola Scholz ◽

Roy Pérez-Salazar ◽

Carolina Alfaro-Chinchilla ◽

Ana Abarca Méndez ◽

...

Keyword(s):

Wastewater Treatment ◽

Spatial Variation ◽

Water Flow ◽

Programming Language ◽

Constructed Wetland ◽

Physicochemical Parameters ◽

Preferential Flow ◽

Oxygen Demand ◽

R Programming Language ◽

R Programming

Introduction: The implementation of wastewater treatment systems such as constructed wetlands has attracted growing interest in the last decade due to their low cost and high effectiveness in treating industrial and residential wastewater. Objective: To evaluate the spatial variation of physicochemical parameters in a subsurface-flow constructed wetland system planted with Pennisetum alopecuroides (Pennisetum) and in an unplanted Control. The purpose is to provide an analysis of the spatial dynamics of physicochemical parameters using the R programming language. Methods: Each of the cells (Pennisetum and Control) had 12 piezometers, organized in three columns and four rows with separation distances of 3.25 m and 4.35 m, respectively. Turbidity, biochemical oxygen demand (BOD), chemical oxygen demand (COD), total Kjeldahl nitrogen (TKN), ammoniacal nitrogen (N-NH4), organic nitrogen (N-org.), and phosphorus (P-PO4-3) were measured in the in-flow and out-flow water of both the Control and Pennisetum cells (n = 8). Additionally, the oxidation-reduction potential (ORP), dissolved oxygen (DO), conductivity, pH, and water temperature were measured (n = 167) in the piezometers. Results: No statistically significant differences between cells were found for TKN, N-NH4, conductivity, turbidity, BOD, and COD, but both the Control and Pennisetum cells showed a significant reduction in these parameters (p < 0.05). Overall, TKN and N-NH4 removal ranged from 65.8 to 84.1% and from 67.5 to 90.8%, respectively, and the decreases in turbidity, conductivity, BOD, and COD were 95.1-95.4%, 15-22.4%, 65.2-77.9%, and 57.4-60.3%, respectively. Both cells showed an increasing ORP gradient along the water-flow direction, in contrast to conductivity (p < 0.05). However, DO, pH, and temperature were inconsistent along the direction of the water flow in both cells. Conclusions: Pennisetum demonstrated pollutant removal efficiency but presented results similar to those of the Control cell; it therefore remains unclear whether it is a superior option. Spatial variation analysis did not reflect any obstruction of flow along the constructed wetlands (CWs), but some preferential flow paths can be distinguished. An open-source repository of R code was provided.

A fuzzy toolbox for the R programming language

2011 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2011) ◽

10.1109/fuzzy.2011.6007743 ◽

2011 ◽

Cited By ~ 11

Author(s):

Christian Wagner ◽

Simon Miller ◽

Jonathan M. Garibaldi

Keyword(s):

Programming Language ◽

R Programming Language ◽

R Programming

On the use of R programming language in the analyses of spatial data

Acta Silvae et Ligni ◽

10.20315/asetl.102.5 ◽

2013 ◽

Vol 102 ◽

pp. 55-62

Author(s):

Milan Kobal ◽

Andrej Ceglar ◽

Klemen Eler ◽

Barbara Medved-Cvikl ◽

Luka Honzak ◽

...

Keyword(s):

Programming Language ◽

Spatial Data ◽

R Programming Language ◽

R Programming

NEW LITHOSTRATIGRAPHIC UNITS IN THE CROATIAN OFFSHORE AND THEIR DEFINITION IN THE «R» PROGRAMMING LANGUAGE

Rudarsko-geološko-naftni zbornik ◽

10.17794/rgn.2015.2.4 ◽

2015 ◽

Vol 30 (2) ◽

pp. 13-24

Author(s):

Marijan Šapina ◽

Marko Vekić

Keyword(s):

Programming Language ◽

R Programming Language ◽

Lithostratigraphic Units ◽

R Programming

Bioinformatic approaches for analysis of coral-associated bacteria using R programming language

Vietnam Journal of Biotechnology ◽

10.15625/1811-4989/18/4/15320 ◽

2021 ◽

Vol 18 (4) ◽

pp. 733-743

Author(s):

Doan Thi Nhung ◽

Bui Van Ngoc

Keyword(s):

Programming Language ◽

Community Analysis ◽

Microbial Community Analysis ◽

Metagenomic Data ◽

rRNA Gene ◽

Marine Microorganisms ◽

Taxonomic Assignment ◽

Associated Bacteria ◽

R Programming Language ◽

R Programming

Recent advances in metagenomics and bioinformatics allow robust analysis of the composition and abundance of microbial communities, functional genes, and their metabolic pathways. A variety of computational and statistical tools exist for analyzing the microbiome; a common problem in using them, however, is the lack of synchronization and compatibility of input and output data formats between such software. To overcome these challenges, in this study we apply the DADA2 pipeline (written in the R programming language), instead of a set of different bioinformatics tools, to create our own workflow for microbial community analysis in a continuous and synchronous manner. As a first effort, we investigated the composition and abundance of coral-associated bacteria using their 16S rRNA gene amplicon sequences. The workflow includes the following steps: data processing, sequence clustering, taxonomic assignment, and data visualization. We also wish to draw readers’ attention to bacterial communities living in the ocean, since most marine microorganisms are unculturable, especially those residing in coral reefs, in this case the bacteria associated with the coral Acropora tenuis. The outcomes of this study suggest that the DADA2 pipeline is a promising bioinformatics approach to microbiome analysis, as an alternative to using various separate software packages. In addition, our modifications to the workflow execution help researchers to illustrate metagenomic data more easily and systematically, to elucidate the composition, abundance, diversity, and relationships of microbial communities, and to develop other bioinformatic tools more effectively.
