Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing

Descriptive Analysis

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch004 ◽

2017 ◽

pp. 83-113

Keyword(s):

Computational Methods ◽

Descriptive Analysis ◽

Descriptive Statistics ◽

High Quality ◽

Initial Stage

Descriptive statistics is the initial stage of analysis, used to describe and summarize data. The availability of large amounts of data and of very efficient computational methods has strengthened this area of statistics. In this chapter, we introduce the main concepts related to descriptive analysis, which provide the background needed to perform a high-quality descriptive analysis.
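
As an illustration of what "describe and summarize" means in practice, the following minimal sketch computes a few common summary measures with Python's standard library. The `describe` helper and the sample values are hypothetical and are not taken from the chapter.

```python
import statistics


def describe(values):
    """Return common descriptive statistics for a numeric sample.

    Hypothetical helper for illustration; not the chapter's own code.
    """
    ordered = sorted(values)
    # Quartiles using the statistics module's default ("exclusive") method.
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "stdev": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "max": ordered[-1],
        "iqr": q3 - q1,  # interquartile range
    }


# Made-up sample, used only to exercise the helper.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The measures of location (mean, median) and of spread (standard deviation, interquartile range) are reported together because neither kind alone characterizes a sample.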

Clusters

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch008 ◽

2017 ◽

pp. 179-190

Keyword(s):

Cluster Analysis ◽

Data Analysis ◽

Statistical Data ◽

Similarity Measures ◽

Trial And Error ◽

Multi Objective Optimization ◽

Statistical Data Analysis ◽

Multi Objective ◽

Common Technique

Cluster analysis, which we approach in this chapter, is the task of grouping a set of objects in such a way that objects in the same group, or cluster, are more similar to each other than to those in other clusters. It is a common technique for statistical data analysis. Cluster analysis can be carried out by various algorithms that may differ significantly, so it is not a trivial task: it is an interactive multi-objective optimization that involves trial and error. The clustering of subjects or variables is based on similarity or dissimilarity (distance) measures, computed initially between pairs of subjects and later between pairs of clusters. The grouping can be done using hierarchical or non-hierarchical techniques.
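
The idea of measuring distance first between subjects and later between clusters can be sketched with a small agglomerative (hierarchical) procedure. This is only one possible reading: Euclidean distance and single linkage are assumptions made here for illustration, and the function names and points are hypothetical.

```python
import math
from itertools import combinations


def single_linkage(cluster_a, cluster_b, points):
    """Distance between two clusters: the smallest pairwise point distance."""
    return min(math.dist(points[i], points[j])
               for i in cluster_a for j in cluster_b)


def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    Clusters are lists of indices into `points`. Assumes Euclidean
    distance and single linkage; other choices give other groupings.
    """
    clusters = [[i] for i in range(len(points))]  # start: one subject each
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: single_linkage(
                clusters[pair[0]], clusters[pair[1]], points),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return [sorted(cluster) for cluster in clusters]


# Made-up points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
groups = agglomerate(points, k=2)
print(groups)
```

A non-hierarchical technique such as k-means would instead fix the number of clusters up front and refine cluster centres iteratively.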

Factor Analysis

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch007 ◽

2017 ◽

pp. 148-178

Keyword(s):

Factor Analysis ◽

Statistical Method ◽

Latent Variables ◽

Lower Amount ◽

Linear Combinations ◽

Correlated Variables ◽

Error Quantification ◽

Unobserved Variables

Factor analysis is a statistical method used to describe variability among observed, correlated variables. The goal of factor analysis is to search for unobserved variables called factors. The observed variables are modelled as linear combinations of the possible factors plus error terms that quantify the error of this approximation. The resulting information about the interaction of the observed variables can be used to further analyze the importance of each variable in the context of the dataset. In this way, the observed variables are replaced by a smaller set of latent variables that represents the data in a summarized fashion.
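
To make the "linear combinations plus error" model concrete, the sketch below computes the covariance matrix implied by a set of loadings and error variances. It assumes uncorrelated factors with unit variance and uncorrelated errors, which is a common textbook setting but is not stated in the abstract. The function name and numbers are hypothetical.

```python
def implied_covariance(loadings, uniquenesses):
    """Covariance implied by the model x = L f + e.

    Assumes uncorrelated unit-variance factors and uncorrelated errors,
    so the result is L L^T + diag(uniquenesses).
    `loadings` has one row per observed variable, one column per factor.
    """
    p = len(loadings)
    cov = [
        [sum(l_i * l_j for l_i, l_j in zip(loadings[i], loadings[j]))
         for j in range(p)]
        for i in range(p)
    ]
    for i in range(p):
        cov[i][i] += uniquenesses[i]  # error variance sits on the diagonal
    return cov


# Made-up example: three observed variables driven by one latent factor.
loadings = [[0.9], [0.8], [0.7]]
uniquenesses = [0.19, 0.36, 0.51]
cov = implied_covariance(loadings, uniquenesses)
for row in cov:
    print([round(value, 2) for value in row])
```

Here a single factor accounts for all of the correlation between the three variables, which is the sense in which a smaller set of latent variables summarizes the data.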

Statistical Inference

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch005 ◽

2017 ◽

pp. 114-139

Keyword(s):

Statistical Inference ◽

Random Sample ◽

T Test ◽

Entire Population ◽

Inferential Statistics ◽

Chi Square ◽

Whitney Test ◽

Chi Square Test ◽

Student’s T ◽

Student’s T Test

Statistical inference allows conclusions to be drawn from data. These analyses use a random sample of data taken from a population to describe and make inferences about that population. Inferential statistics are valuable when it is not convenient or possible to examine each member of an entire population. In this chapter, concepts such as ANOVA, Student's t-test, the Chi-Square test, the Mann-Whitney test and the Kruskal-Wallis test are presented. After reading this chapter, an analyst with insight into a particular phenomenon will be able to infer possible new results from that knowledge.
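
As one example of the tests listed above, the following sketch computes the statistic of a two-sample Student's t-test with pooled variance, which assumes independent samples and equal population variances. The function name and the sample values are hypothetical. The p-value is omitted because it requires the t distribution, which is not in the standard library.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic and degrees of freedom.

    Assumes independent samples with equal population variances.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variances (n - 1)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_error
    return t_stat, df


# Made-up samples standing in for two random samples from a population.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t_stat, df = student_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

In practice, a library routine such as `scipy.stats.ttest_ind` returns both the statistic and the p-value.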

Dataset

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch003 ◽

2017 ◽

pp. 78-82

In this chapter, we present the dataset used throughout this book. Our case study is built upon fictional data. This process implied “collecting” data, and we explain each of the chosen variables in more detail.

Statistics

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch001 ◽

2017 ◽

pp. 1-31

Keyword(s):

Decision Making ◽

Data Analysis ◽

Decision Making Process ◽

Learning From Data ◽

Analyze Data

Statistics is a set of methods used to analyze data. This chapter presents the main concepts used in statistics, where learning from data is one of the most critical challenges. In general, we can say that statistics, based on the theory of probability, provides techniques and methods for data analysis that help the decision-making process in various problems where there is uncertainty.

Discussion and Conclusion

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch009 ◽

2017 ◽

pp. 191-194

Keyword(s):

Statistical Analysis

After all of the material covered throughout this book, this chapter closes with a discussion and conclusion about the book's purpose. We try to state clearly the reasons why we chose the tools used for the statistical analysis tasks, and we conclude the comparison between them.

Introduction to Linear Regression

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch006 ◽

2017 ◽

pp. 140-147

Keyword(s):

Regression Analysis ◽

Linear Regression ◽

Regression Function ◽

Statistical Modelling ◽

Fundamental Analysis ◽

Statistical Process ◽

Analysis Techniques ◽

Independent Variables ◽

Average Value ◽

Dependent Variables

In statistical modelling, regression analysis is a statistical process for estimating the relationships among variables. More specifically, regression analysis helps the reader understand how the dependent variable changes when any one of the independent variables is varied. Regression analysis estimates the average value of the dependent variable when the independent variables are fixed, so the estimation target is a function of the independent variables called the regression function. In limited circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. Nonetheless, caution is needed, since correlation does not necessarily imply causation. Regression analysis techniques are varied; in this chapter, we present only the fundamental analysis.
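
The notion of a regression function can be sketched for the simplest case, one independent variable fitted by ordinary least squares. The abstract does not name the estimation method, so least squares is an assumption here, and the helper name and data are hypothetical.

```python
import statistics


def fit_line(x, y):
    """Least-squares slope and intercept for y ≈ intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # slope = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
slope, intercept = fit_line(x, y)


def regression_function(value):
    """Estimated average of the dependent variable at a given x."""
    return intercept + slope * value


print(slope, intercept, regression_function(6))
```

The slope estimates how the average of the dependent variable changes per unit change in the independent variable, which is a statement about association and not, by itself, about causation.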

Introduction to Programming R and Python Languages

Comparative Approaches to Using R and Python for Statistical Data Analysis - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-68318-016-6.ch002 ◽

2017 ◽

pp. 32-77

Keyword(s):

Data Analysis ◽

Basic Concepts

This chapter introduces the basic concepts of the languages we propose to use in the data analysis tasks. Thus, we first introduce some features of R and Python.


Published By IGI Global
