Exploratory Data Analysis using Python

Data must be analyzed to produce useful results, and those results inform decisions. Examples include recommendation systems, page ranking, demand forecasting, and prediction of product purchases. At some leading companies, customer reviews play a major role, so it is important to analyze the factors that influence review ratings. We use exploratory data analysis (EDA), in which data are interpreted in row-and-column format. We use Python for the analysis: it is an object-oriented, interpreted, and interactive programming language, and it is open source with rich sets of libraries such as pandas, Matplotlib, and seaborn. We use different types of charts and various parameters to analyze Amazon review datasets containing reviews of electronic items.

Exploratory data analysis is a method for summarizing the main characteristics of data and for understanding data more deeply using visualization techniques. This paper focuses on defining a systematic approach, in the form of a well-defined sequence of steps, to explore data from various aspects. Every organization produces a lot of data and needs to analyze it carefully to extract hidden patterns. Task-centric EDA [2] produces actionable insights as an outcome to improve business processes. This work uses the Python programming language and Jupyter Notebook for data analysis. Python is an object-oriented and interactive programming language with rich sets of libraries such as pandas, Matplotlib, and seaborn [10]. We have used different types of charts and various parameters to analyze a retail dataset and to improve sales using precision marketing.
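As a minimal sketch of the kind of first-pass summary such a step sequence usually starts with, the snippet below computes descriptive statistics and grouped totals using only the Python standard library. The records, the column names (`region`, `category`, `amount`), and the helper functions `summarize` and `total_by` are hypothetical and are not taken from the paper's retail dataset; in pandas, `DataFrame.describe()` and `groupby(...).sum()` would play the same roles.

```python
import statistics
from collections import defaultdict

# Hypothetical retail records; the column names and values are invented
# for illustration and do not come from the paper's dataset.
sales = [
    {"region": "North", "category": "Grocery", "amount": 120.0},
    {"region": "North", "category": "Apparel", "amount": 80.0},
    {"region": "South", "category": "Grocery", "amount": 200.0},
    {"region": "South", "category": "Apparel", "amount": 40.0},
    {"region": "South", "category": "Grocery", "amount": 160.0},
]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def total_by(records, key, value):
    """Sum the `value` field for each distinct value of `key`."""
    totals = defaultdict(float)
    for row in records:
        totals[row[key]] += row[value]
    return dict(totals)


amount_summary = summarize([row["amount"] for row in sales])
region_totals = total_by(sales, "region", "amount")
category_totals = total_by(sales, "category", "amount")

print(amount_summary)
print(region_totals)
print(category_totals)
```

Grouped totals like these are typically what gets charted next (for example as bar charts per region or category) before any deeper pattern search.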


Antibiotics, 2019, Vol 8 (4), pp. 225
Author(s):  
Antonio Gnoni ◽  
Emanuele De Nitto ◽  
Salvatore Scacco ◽  
Luigi Santacroce ◽  
Luigi Leonardo Palese

Sepsis is a life-threatening condition that accounts for numerous deaths worldwide, usually arising as a complication of common community-acquired infections (e.g., pneumonia) or of infections acquired during a hospital stay. Sepsis and septic shock, its most severe evolution, involve the whole organism, recruiting and producing many molecules, mostly proteins. Proteins are dynamic entities, and a large number of techniques and studies have been devoted to elucidating the relationship between the conformations adopted by proteins and their function. Although molecular dynamics has a key role in understanding these relationships, the number of protein structures available in the databases is now so high that data sets can be built from experimentally determined structures. Techniques for dimensionality reduction and clustering can be applied in exploratory data analysis to obtain information on the function of these molecules, which may be very useful in immunology for better understanding the structure-activity relationship of the numerous proteins involved in host defense, particularly in septic patients. The large number of degrees of freedom that characterize biomolecules requires special techniques able to analyze this kind of data set (with a small number of entries relative to the number of degrees of freedom). In this work we analyzed the ability of two different types of algorithms to provide information on the structures present in three data sets built using the experimental structures of allosteric proteins involved in sepsis. The results obtained by means of a principal component analysis algorithm and those obtained by a random projection algorithm are largely comparable, demonstrating the effectiveness of random projection methods in structural bioinformatics. The usefulness of random projection in exploratory data analysis is discussed, including validation of the obtained clusters.
We chose these proteins because of their involvement in sepsis and septic shock, with the aim of highlighting the potential of bioinformatics to identify new diagnostic and prognostic tools for patients.
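To illustrate why random projection can work on data with few entries and many degrees of freedom, here is a minimal sketch of a Gaussian random projection in plain Python. The abstract does not state which random projection variant the authors used, so the Gaussian matrix, the dimensions (1000 features reduced to 200), and the synthetic points standing in for flattened structure coordinates are all assumptions made for illustration.

```python
import math
import random


def random_projection_matrix(n_features, n_components, rng):
    """Gaussian random matrix, scaled so squared lengths are preserved on average."""
    scale = 1.0 / math.sqrt(n_components)
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_features)]
        for _ in range(n_components)
    ]


def project(matrix, point):
    """Map one high-dimensional point into the low-dimensional space."""
    return [sum(w * x for w, x in zip(row, point)) for row in matrix]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)
n_features, n_components, n_points = 1000, 200, 6

# Synthetic stand-in for a small set of structures with many coordinates each.
points = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_points)]

matrix = random_projection_matrix(n_features, n_components, rng)
projected = [project(matrix, p) for p in points]

# Ratio of projected to original distance for every pair of points.
ratios = [
    distance(projected[i], projected[j]) / distance(points[i], points[j])
    for i in range(n_points)
    for j in range(i + 1, n_points)
]
print(min(ratios), max(ratios))
```

The ratios stay close to 1, meaning pairwise distances (and therefore distance-based clusters) are approximately preserved without computing any covariance matrix, which is the property that makes the method comparable to principal component analysis for exploratory clustering.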


Author(s):  
Dharmendra Trikamlal Patel

Exploratory data analysis is a technique for analyzing data sets in order to summarize their main characteristics using quantitative and visual methods. The chapter starts with an introduction to exploratory data analysis. It discusses the conventional view of EDA and describes its main limitations. It explores the features of quantitative and visual exploratory data analysis in detail, deals with the statistical techniques relevant to EDA, and emphasizes the main visual techniques for representing data efficiently. R has extensive capabilities for handling the quantitative and visual aspects of summarizing the main characteristics of a data set. The chapter provides practical exposure to various plotting systems in R. Finally, the chapter deals with current research and future trends in EDA.


2013
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Author(s):  
Jayesh S

The COVID-19 outbreak was first reported in Wuhan, China. The deadly virus spread not just disease but also fear around the globe. In January 2020, the WHO declared COVID-19 a Public Health Emergency of International Concern (PHEIC). The first case of COVID-19 in India was reported on January 30, 2020. By that time, India was prepared to fight the virus, and it has taken various measures to tackle the situation. In this paper, an exploratory data analysis of COVID-19 cases in India is carried out. Data on the number of cases, testing done, case fatality ratio, number of deaths, change in visits, stringency index, and measures taken by the government are used for modelling and visual exploratory data analysis.
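One of the listed quantities, the case fatality ratio, is simply deaths as a percentage of confirmed cases. The sketch below shows that calculation together with new cases per period derived from cumulative counts. The function name `case_fatality_ratio` and all of the counts are invented for illustration; they are not figures from the paper or from any Indian data source.

```python
def case_fatality_ratio(deaths, confirmed):
    """Return deaths as a percentage of confirmed cases (0.0 if there are no cases)."""
    if confirmed == 0:
        return 0.0
    return 100.0 * deaths / confirmed


# (label, cumulative confirmed cases, cumulative deaths); all values are invented.
cumulative = [
    ("week 1", 1000, 20),
    ("week 2", 1500, 33),
    ("week 3", 2000, 50),
]

# Case fatality ratio at each reporting point.
cfr_series = [
    (label, case_fatality_ratio(deaths, confirmed))
    for label, confirmed, deaths in cumulative
]

# New cases per period are the differences between consecutive cumulative counts.
new_cases = [
    later[1] - earlier[1] for earlier, later in zip(cumulative, cumulative[1:])
]

for label, cfr in cfr_series:
    print(f"{label}: CFR = {cfr:.1f}%")
print("new cases per period:", new_cases)
```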


Molecules, 2021, Vol 26 (5), pp. 1393
Author(s):  
Ralitsa Robeva ◽  
Miroslava Nedyalkova ◽  
Georgi Kirilov ◽  
Atanaska Elenkova ◽  
Sabina Zacharieva ◽  
...  

Catecholamines are physiological regulators of carbohydrate and lipid metabolism during stress, but their chronic influence on metabolic changes in obese patients has not yet been clarified. The present study aimed to establish the associations between catecholamine metabolites and metabolic syndrome (MS) components in obese women, and to reveal possible hidden subgroups of patients through hierarchical cluster analysis and principal component analysis. The 24-h urine excretion of metanephrine and normetanephrine was investigated in 150 obese women (54 non-diabetic without MS, 70 non-diabetic with MS, and 26 with type 2 diabetes). The interrelations between carbohydrate disturbances, metabolic syndrome components, and stress response hormones were studied. Exploratory data analysis was used to determine different patterns of similarity among the patients. Normetanephrine concentrations were significantly increased in postmenopausal patients and in women with morbid obesity, type 2 diabetes, and hypertension, but not in those with prediabetes. Both metanephrine and normetanephrine levels were positively associated with glucose concentrations one hour after glucose load, irrespective of insulin levels. The exploratory data analysis showed different risk subgroups among the investigated obese women. The development of predictive tools that include not only traditional metabolic risk factors but also markers of stress response systems might help with specific risk estimation in obese patients.
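The subgroup search described above rests on hierarchical cluster analysis. As a rough sketch of how agglomerative clustering finds such groups, the snippet below merges the two closest clusters repeatedly using average linkage. The abstract does not say which linkage or distance the authors used, so average linkage and Euclidean distance are assumptions, and the six two-feature "patients" are invented standardized values, not study data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, data):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(data[i], data[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative(data, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], data)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Invented standardized values for two features per patient
# (e.g. a catecholamine metabolite level and a glucose measurement).
patients = [
    (-1.0, -0.9),
    (-0.8, -1.1),
    (-1.2, -1.0),
    (1.0, 0.9),
    (0.9, 1.2),
    (1.1, 1.0),
]

groups = agglomerative(patients, n_clusters=2)
print(groups)
```

With real data the features would first be standardized, and the number of clusters would be chosen from the merge distances (for example by inspecting a dendrogram) instead of being fixed in advance.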

