Analysis of psychometric data using statistical and machine learning methods

2021 ◽  
Author(s):  
Krishnapriya Subramanian

The objective of this thesis is to analyse psychometric data using statistical and machine learning methods. Psychological data are analysed to predict illness and injury in athletes. Regression, a statistical process for estimating the relationships among variables, is the basis of this thesis. We apply linear regression, time series analysis and logistic regression to predict illness and well-being. The linear regression simulation results are used mainly to understand the data. Based on a review of those results, a time series model is developed that predicts sickness one day ahead. The predicted values of this time series model are continuous. Logistic regression, by contrast, provides a probabilistic approach to predicting future levels as a categorical value. Hence we have developed a binomial logistic regression model for the case where the observed variable is dichotomous. Our simulation results show that this prediction model performs well. Our empirical studies also show that our method can act as an early warning system for athletes.
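As an illustration of the binomial logistic regression step described in this abstract, the following is a minimal sketch only, not the thesis's model. The predictor (a daily fatigue score), the data values and the function names are all hypothetical, and the fit uses plain batch gradient descent on the log-loss.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y   # gradient of the log-loss w.r.t. the linear term
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Hypothetical data: fatigue score today (0-10) and whether sickness was reported the next day.
fatigue = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9]
sick_next_day = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(fatigue, sick_next_day)
p_low = sigmoid(b0 + b1 * 2)    # estimated probability of sickness after a low fatigue score
p_high = sigmoid(b0 + b1 * 8)   # estimated probability of sickness after a high fatigue score
```

Thresholding the estimated probability (for example at 0.5) turns the continuous output into the dichotomous sick / not sick prediction that an early warning system would report.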


2021 ◽  
Vol 13 (5) ◽  
pp. 974
Author(s):  
Lorena Alves Santos ◽  
Karine Ferreira ◽  
Michelle Picoli ◽  
Gilberto Camara ◽  
Raul Zurita-Milla ◽  
...  

The use of satellite image time series analysis and machine learning methods brings new opportunities and challenges for land use and cover change (LUCC) mapping over large areas. One of these challenges is the need for samples that properly represent the high variability of land use and cover classes over large areas, both to train supervised machine learning methods and to produce accurate LUCC maps. This paper addresses this challenge and presents a method to identify spatiotemporal patterns in land use and cover samples and to infer subclasses from the phenological and spectral information provided by satellite image time series. The proposed method uses self-organizing maps (SOMs) to reduce the data dimensionality, creating primary clusters. From these primary clusters, it uses hierarchical clustering to create subclusters that capture the intra-class variability intrinsic to different regions and periods, mainly in large areas and over multiple years. To show how the method works, we use MODIS image time series associated with samples of cropland and pasture classes over the Cerrado biome in Brazil. The results show that the proposed method is suitable for identifying spatiotemporal patterns in land use and cover samples that can be used to infer subclasses, mainly for crop types.
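The two-stage idea in this abstract (SOM units as primary clusters, then hierarchical clustering of those units into subclasses) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the six-step vegetation-index profiles are invented, the SOM is one-dimensional with four units, and average linkage is an arbitrary choice.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_unit(units, sample):
    # Index of the best-matching unit (BMU) for a sample.
    return min(range(len(units)), key=lambda u: dist(units[u], sample))

def train_som(samples, n_units=4, epochs=200):
    # Deterministic start: copy evenly spaced samples as the initial unit weights.
    units = [list(samples[i * len(samples) // n_units]) for i in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs              # decays from 1 towards 0
        lr, radius = 0.5 * frac, max(frac, 0.05)
        for s in samples:
            bmu = best_unit(units, s)
            for u in range(n_units):
                # Gaussian neighbourhood on the 1-D grid of units.
                h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                units[u] = [w + lr * h * (x - w) for w, x in zip(units[u], s)]
    return units

def agglomerate(points, k):
    # Average-linkage agglomerative clustering; returns k lists of point indices.
    clusters = [[i] for i in range(len(points))]
    def linkage(a, b):
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)
        clusters[i].extend(merged)
    return clusters

# Hypothetical cropland samples: three single-peak and three double-peak profiles.
samples = [
    [0.2, 0.4, 0.8, 0.7, 0.3, 0.2], [0.2, 0.5, 0.8, 0.6, 0.3, 0.2], [0.3, 0.4, 0.7, 0.7, 0.3, 0.2],
    [0.7, 0.3, 0.2, 0.7, 0.8, 0.3], [0.8, 0.3, 0.3, 0.6, 0.8, 0.3], [0.7, 0.2, 0.2, 0.7, 0.7, 0.3],
]

units = train_som(samples)                       # primary clusters
clusters = agglomerate(units, k=2)               # subclusters of SOM units
unit_to_subclass = {u: c for c, members in enumerate(clusters) for u in members}
subclasses = [unit_to_subclass[best_unit(units, s)] for s in samples]
```

Each sample inherits the subclass of its best-matching unit, so profiles with a similar temporal shape end up in the same inferred subclass.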


Water ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 1342 ◽  
Author(s):  
Yong Fan ◽  
Litang Hu ◽  
Hongliang Wang ◽  
Xin Liu

Pumping tests are a very important means of investigating aquifer properties; however, interpreting the data with common analytical solutions becomes invalid in complex aquifer systems. This paper explores the potential of machine learning methods for retrieving pumping test information at a field site in the Democratic Republic of Congo. A newly planned mining site with a pumping test of three pumping wells and 28 observation wells over one month was chosen to analyze the significance of machine learning methods in pumping test analysis. Widely used machine learning methods, including correlation, cluster and time-series analysis, artificial neural networks (ANN), support vector regression (SVR), random forest (RF), and linear regression, are all used in this study. Correlation and cluster analyses among wells provide visual pictures of possible hydraulic connections. The pathway with the best permeability ranges from a depth of 250 m to 350 m. Time-series analysis captured the changes in drawdown within the three pumping wells. The RF method is found to have higher accuracy and lower sensitivity to model parameters than the ANN and SVR methods. The coupling of the linear regression model and analytical solutions is applied to estimate hydraulic conductivities. The results show that ML methods can significantly improve our understanding of pumping tests by revealing information hidden in those tests.
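The correlation analysis among wells mentioned in this abstract can be illustrated with a small sketch. The drawdown series and well names below are hypothetical, and Pearson correlation is assumed as the measure; the paper's actual data and procedure are not reproduced here.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical drawdowns (m) at successive times during a pumping test.
pumping_well = [0.0, 1.2, 2.1, 2.8, 3.3, 3.6]
observation_wells = {
    "OBS-A": [0.0, 0.5, 0.9, 1.2, 1.4, 1.5],         # responds to pumping
    "OBS-B": [0.10, 0.08, 0.11, 0.09, 0.10, 0.09],   # nearly flat, only noise
}

correlations = {name: pearson(pumping_well, series)
                for name, series in observation_wells.items()}
# Wells ranked from most to least correlated with the pumping well.
ranked = sorted(correlations, key=correlations.get, reverse=True)
```

A high correlation with the pumping well is only a hint of a possible hydraulic connection; it does not by itself establish one.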


2021 ◽  
Author(s):  
Dhairya Vyas

In machine learning, the majority of data can be grouped into four categories: numerical data, categorical data, time-series data, and text. Different data properties call for different classifiers, drawn from supervised, unsupervised, and reinforcement learning. Each category has its own classifiers; we have tested almost all machine learning methods and compare them.


Nafta-Gaz ◽  
2019 ◽  
Vol 75 (2) ◽  
pp. 111-117
Author(s):  
Andrzej Paliński ◽  

The paper presents contemporary trends in artificial intelligence and machine learning methods, which include artificial neural networks, decision trees and fuzzy logic systems, among others. Computational intelligence methods are part of the field of research on artificial intelligence. Selected methods of computational intelligence were used to build medium-term monthly forecasts of natural gas demand for Poland. The accuracy of forecasts obtained using an artificial neural network and a decision tree was compared with that of classical linear regression, based on historical data from a ten-year period. The explanatory variables were gas consumption in other EU countries, average monthly temperature, industrial production, wages in the economy and the price of natural gas. Forecasting was carried out in five stages that differed in the selection of the learning and testing samples, the use of data preprocessing and the elimination of some variables. For raw data and a random training set, the highest accuracy was achieved by linear regression. For preprocessed data and a random learning set, the decision tree was the most accurate. The forecast built on the first eight years and tested on the last two was most accurate with regression, but only slightly better than with the decision tree or the neural network, regardless of data normalization and elimination of collinear variables. Machine learning methods showed good accuracy for monthly gas consumption forecasts, but were slightly outperformed by classical linear regression because the set of explanatory variables was too narrow. Machine learning methods should show higher effectiveness as the amount of data increases and the set of potential explanatory variables is expanded. In a sea of data, machine learning methods can create prognostic models more effectively, without the analyst's laborious involvement in data preparation and multi-stage analysis. They will also allow the form of prognostic models to be updated frequently, even after each addition of new data to the database.
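The chronological evaluation described in this abstract (fit on the first eight years, test on the last two) can be sketched with a one-variable linear regression. Everything numeric here is synthetic and hypothetical: the temperature cycle, the demand formula and the use of mean absolute percentage error (MAPE) as the accuracy measure are assumptions made only for illustration.

```python
import math

def fit_ols(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Synthetic ten-year monthly series (120 months): a seasonal temperature cycle
# and a demand that falls as temperature rises, plus a small deterministic wiggle.
months = range(120)
temperature = [8.0 - 10.0 * math.cos(2 * math.pi * m / 12) for m in months]
demand = [1500.0 - 40.0 * t + 15.0 * math.sin(m) for m, t in zip(months, temperature)]

split = 96  # first eight years for fitting, last two years for testing
intercept, slope = fit_ols(temperature[:split], demand[:split])
forecast = [intercept + slope * t for t in temperature[split:]]
error = mape(demand[split:], forecast)
```

Splitting by time rather than at random keeps future observations out of the fitting set, which is the stricter of the evaluation setups the abstract compares.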


2020 ◽  
Vol 12 (8) ◽  
pp. 3269
Author(s):  
Shinyoung Kwag ◽  
Daegi Hahm ◽  
Minkyu Kim ◽  
Seunghyun Eem

The objective of this study is to propose a model that uses machine learning methods to predict the seismic performance of a slope relatively accurately and efficiently. Probabilistic seismic fragility analyses of the slope had been carried out in other studies, and a closed-form equation for slope seismic performance was proposed through a multiple linear regression analysis. However, the traditional statistical linear regression analysis could not accurately represent such nonlinear slope seismic performance. To overcome this limitation, we used three machine learning methods, namely support vector machine (SVM), artificial neural network (ANN) and Gaussian process regression (GPR), to generate prediction models of the slope seismic performance. The models obtained through the machine learning methods generally performed better than the models from the traditional statistical methods. The results of the SVM showed no significant performance difference compared with the results of the nonlinear regression analysis method, but the results based on the ANN and GPR showed a remarkable improvement in prediction performance over the other models. Furthermore, this study confirmed that the GPR-based model predicted seismic performance values more accurately than the ANN-based model.
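To show what Gaussian process regression computes, here is a minimal sketch of the GPR posterior mean with a squared-exponential (RBF) kernel. It is not the study's model: the sine curve is only a stand-in for a nonlinear response, and the kernel length scale, the noise level and all names are assumptions.

```python
import math

def rbf(a, b, length=1.0):
    # Squared-exponential (RBF) kernel between two scalar inputs.
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gpr_fit(xs, ys, noise=1e-4, length=1.0):
    """Return a function giving the GPR posterior mean (zero prior mean)."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, ys)  # alpha = (K + noise * I)^-1 y
    return lambda x: sum(w * rbf(x, xi, length) for w, xi in zip(alpha, xs))

# Stand-in nonlinear response sampled at seven input levels.
xs = [0.5 * i for i in range(7)]
ys = [math.sin(x) for x in xs]

predict = gpr_fit(xs, ys)
estimate = predict(1.25)  # prediction at an input that was not in the training set
```

A straight line fitted to the same points cannot follow the curvature, which is the kind of gap between linear regression and GPR that the abstract reports.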

