Overview of R and RStudio

2021 ◽  
pp. 31-47
Author(s):  
Joseph F. Hair ◽  
G. Tomas M. Hult ◽  
Christian M. Ringle ◽  
Marko Sarstedt ◽  
Nicholas P. Danks ◽  
...  

Computational statistics is now an increasingly popular method of analysis for researchers, combining a vast array of algorithms, statistical methods, and the power of functional coding. The R programming language, in particular, has benefitted from this development alongside traditional graphical user interface (GUI) software. Today, it has become the language of choice for empirical researchers. In this chapter, we introduce the R programming language as well as its popular development environment, RStudio. We walk the reader through downloading both the R language and the RStudio integrated development environment (IDE). Then, we discuss the software layout and demonstrate how to interact with the software. Finally, we address creating and managing R projects and scripts, and gaining access to documentation and help via various sources. This chapter is not intended as a tutorial on writing code in the R programming language. We do, however, provide useful open-source resources for learning R, which can be accessed from the R console in the RStudio environment.

2009 ◽  
Vol 2009 ◽  
pp. 1-3 ◽  
Author(s):  
Kyongryun Lee ◽  
Florian Hahne ◽  
Deepayan Sarkar ◽  
Robert Gentleman

Flow cytometry (FCM) has become an important analysis technology in health care and medical research, but the large volume of data produced by modern high-throughput experiments has presented significant new challenges for computational analysis tools. The development of an FCM software suite in Bioconductor represents one approach to overcome these challenges. In the spirit of the R programming language (Tree Star Inc., “FlowJo”), these tools are predominantly console-driven, allowing for programmatic access and rapid development of novel algorithms. Using this software requires a solid understanding of programming concepts and of the R language. However, some of these tools, in particular the statistical graphics and novel analytical methods, are also useful for nonprogrammers. To this end, we have developed an open source, extensible graphical user interface (GUI), iFlow, which sits on top of the Bioconductor backbone, enabling basic analyses by means of convenient graphical menus and wizards. We envision iFlow to be easily extensible in order to quickly integrate novel methodological developments.


2021 ◽  
Vol 9 (09) ◽  
pp. 703-705
Author(s):  
Victor Sitnic ◽  
Valentina Stratan ◽  
Valeri Tutuianu ◽  
Cristina Popa ◽  
Veronica Balan

R is a freely licensed programming language of great interest as a tool for bioinformatics data analysis. It is essential in research activities related to the analysis of molecular-biological data and the identification of molecular markers. In this article we describe two simple techniques for using FASTA-type sequences and genomic data in the search for genetic markers. To apply the functions described below, it is necessary to have installed the R language, the seqRFLP and Maftools packages, and, optionally, the RStudio integrated development environment.


Author(s):  
Ramin Nabizadeh ◽  
Mostafa Hadei

Introduction: The wide range of studies on air pollution requires accurate and reliable datasets. However, for many reasons, the measured concentrations may be incomplete or biased. Researchers therefore need an easy-to-use and reproducible exposure assessment method. In this article, we describe and present a series of codes written in the R programming language for handling, validating, and averaging PM10, PM2.5, and O3 datasets. Findings: These codes can be used in any type of air pollution study that needs PM and ozone concentrations representative of actual concentrations. We combined criteria from several guidelines proposed by the US EPA and the APHEKOM project to obtain an acceptable methodology. Separate .csv files for PM10, PM2.5, and O3 should be prepared as input files. After a file is imported into R, negative and zero concentration values are first removed from the whole dataset. Then, only monitors that have at least 75% of hourly concentrations are selected. Next, 24-h averages and daily maxima of 8-h moving averages are calculated for PM and ozone, respectively. For output, the codes create two different sets of data. One contains the hourly concentrations of the pollutant of interest (PM10, PM2.5, or O3) at valid stations and their average at city level. The other contains the final city-level 24-h averages for PM10 and PM2.5, or the final city-level daily maximum 8-h averages for O3. Conclusion: These validated codes use a reliable and valid methodology and eliminate the possibility of incorrect data handling and averaging. The codes are free to use without limitation, provided that this article is cited.
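
The validation and averaging steps described in this abstract can be sketched as follows. This is a minimal illustration written in Python rather than the authors' R code, and it is not their implementation: the function names, the sample readings, and the `min_hours` threshold for an 8-h window are all assumptions introduced here. Only the removal of zero and negative values, the 75% completeness rule, the 24-h average, and the daily maximum of 8-h moving averages come from the abstract.

```python
from statistics import mean


def clean_hourly(values):
    """Replace missing, zero, and negative readings with None."""
    return [v if (v is not None and v > 0) else None for v in values]


def is_valid_monitor(values, min_fraction=0.75):
    """True when at least min_fraction of the hourly slots hold a valid reading."""
    if not values:
        return False
    valid = sum(v is not None for v in values)
    return valid / len(values) >= min_fraction


def daily_mean(day_values):
    """24-h average over the valid hours of one day (None if no hour is valid)."""
    valid = [v for v in day_values if v is not None]
    return mean(valid) if valid else None


def daily_max_8h(day_values, window=8, min_hours=6):
    """Daily maximum of 8-h moving averages.

    min_hours (valid readings required per window) is an assumed threshold,
    not a value taken from the abstract.
    """
    best = None
    for start in range(len(day_values) - window + 1):
        valid = [v for v in day_values[start:start + window] if v is not None]
        if len(valid) >= min_hours:
            avg = mean(valid)
            if best is None or avg > best:
                best = avg
    return best


# Hypothetical hourly readings for one monitor and one day.
pm10 = clean_hourly([20.0, -1.0, 0.0, 30.0] + [25.0] * 20)
o3 = clean_hourly([10.0] * 8 + [50.0] * 8 + [10.0] * 8)

if is_valid_monitor(pm10):
    print("PM10 24-h average:", daily_mean(pm10))
if is_valid_monitor(o3):
    print("O3 daily max 8-h average:", daily_max_8h(o3))
```

In this sketch, invalid readings are kept as `None` placeholders rather than dropped, so that the completeness check still counts them against the 75% requirement.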


2021 ◽  
Vol 13 (1) ◽  
pp. 15
Author(s):  
Junior Pastor Pérez-Molina ◽  
Carola Scholz ◽  
Roy Pérez-Salazar ◽  
Carolina Alfaro-Chinchilla ◽  
Ana Abarca Méndez ◽  
...  

Introduction: The implementation of wastewater treatment systems such as constructed wetlands has attracted growing interest in the last decade due to their low cost and high effectiveness in treating industrial and residential wastewater. Objective: To evaluate the spatial variation of physicochemical parameters in a subsurface-flow constructed wetland system planted with Pennisetum alopecuroides (Pennisetum) and in a Control (unplanted), providing an analysis of the spatial dynamics of physicochemical parameters using the R programming language. Methods: Each of the cells (Pennisetum and Control) had 12 piezometers, organized in three columns and four rows with separation distances of 3.25 m and 4.35 m, respectively. Turbidity, biochemical oxygen demand (BOD), chemical oxygen demand (COD), total Kjeldahl nitrogen (TKN), ammoniacal nitrogen (N-NH4), organic nitrogen (N-org.), and phosphorus (P-PO4-3) were measured in water at the inflow and outflow of both conditions, Control and Pennisetum (n = 8). Additionally, the oxidation-reduction potential (ORP), dissolved oxygen (DO), conductivity, pH, and water temperature were measured (n = 167) in the piezometers. Results: No statistically significant differences between cells were found for TKN, N-NH4, conductivity, turbidity, BOD, and COD, but both Control and Pennisetum cells showed a significant reduction in these parameters (p < 0.05). Overall, TKN and N-NH4 removal ranged from 65.8 to 84.1% and from 67.5 to 90.8%, respectively, and the decreases in turbidity, conductivity, BOD, and COD were 95.1-95.4%, 15-22.4%, 65.2-77.9%, and 57.4-60.3%, respectively. Both cells showed an increasing ORP gradient along the water-flow direction, contrary to conductivity (p < 0.05). However, DO, pH, and temperature showed no consistent trend along the direction of the water flow in either cell.
Conclusions: Pennisetum demonstrated pollutant removal efficiency but presented results similar to the control cells; it therefore remains unclear whether it is a superior option. Spatial variation analysis did not reflect any obstruction of flow along the CWs, but some preferential flow paths can be distinguished. An open-source repository of R code was provided.


Author(s):  
Diego Reforgiato Recupero ◽  
Valentino Artizzu ◽  
Francesca Cella ◽  
Alessandro Cotza ◽  
Davide Curcio ◽  
...  

Arduino is a well-known board that incorporates serial communication interfaces, including universal serial bus (USB), and an integrated development environment (IDE) based on Processing, a programming language that supports C and C++. It consists of a microcontroller with several other components that provide easy interconnections with other devices. Arduino and its components were studied during the Computer Architecture class for the degree in Computer Science at the University of Cagliari in 2016. At the end of the class, seven groups of students were selected to build a device prototype on top of Arduino and to present their methodology, the sensors they embedded, and how data could be extracted, collected, and stored in a database for further processing and analytics. The development followed open source best practices; the documentation and code of these projects have been made available online for free downloading and sharing in order to further contribute to the advancement and widespread usage of the Arduino platform.


2020 ◽  
Vol 4 (s1) ◽  
pp. 63-63
Author(s):  
Jeffrey Robinson ◽  
Annica Wayman

OBJECTIVES/GOALS: Introduce students to programming and software development practices in the life sciences by analyzing standard clinical diagnostic bloodwork for differential immune responses. The content includes lectures and a semester project, with the goal of enhancing undergraduate students’ education to prepare them for careers in translational science. METHODS/STUDY POPULATION: The educational content was taught for the first time as a component of the newly developed course BTEC 330 “Software Applications in the Life Sciences” in UMBC’s Translational Life Science Technology (TLST) Bachelor’s degree program at the Universities at Shady Grove campus. Eleven students took the course. All were beginners with no programming background. Lectures provided background on the diagnostic components of the CBC, criteria for differential diagnosis in the clinical setting, and an introduction to hematology and flow cytometry, forming the underpinnings for interpretation of the CBC results. Weekly computer lab practical sessions provided training in the fundamentals of the R programming language, the RStudio integrated development environment (IDE), and the GitHub.com open-source software development platform. RESULTS/ANTICIPATED RESULTS: The graded assignment consisted of a coding project in which students were each assigned an individual parameter from the CBC results, for example, relative lymphocyte count or hemoglobin readouts. Students each created their own R-language script using RStudio, with functional code which: 1) Read in data from a file provided, 2) Performed statistical testing, 3) Read out statistical results as text, and charts as image files, 4) “Diagnosed” individuals in the dataset as being inside or outside the clinical normal range for that parameter. Each student also registered their own GitHub account and published their open-source code. Grading was performed on code functionality by downloading each student repository and running the code, with the instructor acting as an outside developer using the resource. DISCUSSION/SIGNIFICANCE OF IMPACT: In this curriculum, students with no background in programming learned to code a basic R-language script and use GitHub to automate interpretation of CBC results. With advanced automation now becoming commonplace in translational science, such course content can provide an introductory level of literacy in the development of clinical informatics software.
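
The script steps listed in this abstract (read data, summarize, report, and flag values inside or outside a normal range) can be illustrated with a short sketch. It is written in Python rather than R, and it is not the course's code: the sample data, the column names, the `read_values` and `classify` helpers, and the reference range are all hypothetical and are not clinical standards. Simple summary statistics stand in for the statistical testing step, and chart output is omitted.

```python
import csv
import io
from statistics import mean, stdev

# Hypothetical data standing in for the file provided to students.
CSV_TEXT = """id,hemoglobin
A,13.5
B,10.2
C,18.1
D,15.0
"""

# Illustrative reference range only (assumed); not a clinical standard.
NORMAL_RANGE = (12.0, 17.5)


def read_values(text, column):
    """Step 1: read one numeric column from CSV text, keyed by the id column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["id"]: float(row[column]) for row in reader}


def classify(value, low, high):
    """Step 4: label a value as inside or outside the reference range."""
    return "inside" if low <= value <= high else "outside"


values = read_values(CSV_TEXT, "hemoglobin")
readings = list(values.values())

# Steps 2 and 3: compute summary statistics and report them as text.
summary = {"n": len(readings), "mean": mean(readings), "sd": stdev(readings)}
print(f"n={summary['n']} mean={summary['mean']:.2f} sd={summary['sd']:.2f}")

flags = {pid: classify(v, *NORMAL_RANGE) for pid, v in values.items()}
for pid, flag in flags.items():
    print(pid, values[pid], flag)
```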

