Depression Analysis using Machine Learning Based on Musical Habits

Depression has been a main cause of mental illness and results in significant impairment in daily life. Depression is also observed to be a significant cause of suicidal ideation. Music varies the intensity of emotional experience by engaging neurotransmitters and brain anatomy, including the brain’s dopaminergic projections. The popularity of regression models in data analysis, in both research and industry, has driven the development of an array of prediction models. A regression model relies on independent variables to predict a dependent variable. This paper outlines the development of a regression model that estimates a person's depression score from the music that person listens to. The model predicts the depression score from data obtained from a wide range of individuals and the genres of music they have listened to. We generate a report based on the depression score, which a doctor can then use to give the necessary treatment to the depressed patient. In our research, we obtained variance and R² scores of over 0.95.
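As a rough illustration of the scores mentioned in the abstract, the sketch below fits a one-predictor least-squares line and computes R² and an explained-variance score. The predictor (weekly listening hours for one genre), the data values, and the reading of "variance" as explained variance are all assumptions made for illustration; none of them come from the paper.

```python
from statistics import mean, pvariance


def fit_line(x, y):
    """Ordinary least squares for y ≈ intercept + slope * x (one predictor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - pvariance(residuals) / pvariance(y_true)


# Hypothetical, made-up data: listening hours vs. a depression score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(hours, scores)
predicted = [intercept + slope * h for h in hours]
r2 = r2_score(scores, predicted)
ev = explained_variance(scores, predicted)
print(f"slope={slope:.3f} R2={r2:.4f} explained_variance={ev:.4f}")
```

For a least-squares fit with an intercept, evaluated on its own training data, the residuals average to zero, so the two scores coincide. They can differ on held-out data.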

2021 ◽  
Vol 11 (4) ◽  
pp. 1776
Author(s):  
Young Seo Kim ◽  
Han Young Joo ◽  
Jae Wook Kim ◽  
So Yun Jeong ◽  
Joo Hyun Moon

This study identified the meteorological variables that significantly impact the power generation of a solar power plant in Samcheonpo, Korea. To this end, multiple regression models were developed to estimate the power generation of the solar power plant under changing weather conditions. The meteorological data for the regression models were daily data from January 2011 to December 2019. The dependent variable was the daily power generation of the solar power plant in kWh, and the independent variables were the insolation intensity during daylight hours (MJ/m²), daylight time (h), average relative humidity (%), minimum relative humidity (%), and quantity of evaporation (mm). One regression model for the entire data set and 12 monthly regression models for the monthly data were constructed using R, a software environment for data analysis. The 12 monthly regression models estimated the solar power generation better than the model for the entire data set. The variables with the highest influence on solar power generation were the insolation intensity during daylight hours and the daylight time.
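The comparison between one pooled model and separate monthly models can be sketched as follows. This is a simplified illustration in Python rather than the authors' R analysis: it uses a single predictor (insolation) instead of the five variables listed above, and the records are made-up values.

```python
from collections import defaultdict
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y ≈ intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def sse(x, y, intercept, slope):
    """Sum of squared errors of the fitted line on (x, y)."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical records: (month, insolation in MJ/m², generation in kWh).
records = [
    (1, 5.0, 510.0), (1, 8.0, 790.0), (1, 10.0, 1020.0),
    (7, 15.0, 1210.0), (7, 20.0, 1590.0), (7, 25.0, 2010.0),
]

# One model fitted to all records.
xs = [r[1] for r in records]
ys = [r[2] for r in records]
pooled_sse = sse(xs, ys, *fit_line(xs, ys))

# One model per month; errors are summed across months.
by_month = defaultdict(list)
for month, insolation, kwh in records:
    by_month[month].append((insolation, kwh))

monthly_sse = 0.0
for rows in by_month.values():
    mx = [r[0] for r in rows]
    my = [r[1] for r in rows]
    monthly_sse += sse(mx, my, *fit_line(mx, my))

print(f"pooled SSE={pooled_sse:.1f} monthly SSE={monthly_sse:.1f}")
```

Measured on the training data, per-month fits can never be worse than the pooled fit, because the pooled line is one of the options available to each month. A fair comparison therefore needs held-out data or an adjusted criterion.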


2021 ◽  
Vol 42 (Supplement_1) ◽  
pp. S33-S34
Author(s):  
Morgan A Taylor ◽  
Randy D Kearns ◽  
Jeffrey E Carter ◽  
Mark H Ebell ◽  
Curt A Harris

Abstract Introduction A nuclear disaster would generate an unprecedented volume of thermal burn patients from the explosion and subsequent mass fires (Figure 1). Prediction models characterizing outcomes for these patients may better equip healthcare providers and other responders to manage large scale nuclear events. Logistic regression models have traditionally been employed to develop prediction scores for mortality of all burn patients. However, other healthcare disciplines have increasingly transitioned to machine learning (ML) models, which are automatically generated and continually improved, potentially increasing predictive accuracy. Preliminary research suggests ML models can predict burn patient mortality more accurately than commonly used prediction scores. The purpose of this study is to examine the efficacy of various ML methods in assessing thermal burn patient mortality and length of stay in burn centers. Methods This retrospective study identified patients with fire/flame burn etiologies in the National Burn Repository between the years 2009 – 2018. Patients were randomly partitioned into a 67%/33% split for training and validation. A random forest model (RF) and an artificial neural network (ANN) were then constructed for each outcome, mortality and length of stay. These models were then compared to logistic regression models and previously developed prediction tools with similar outcomes using a combination of classification and regression metrics. Results During the study period, 82,404 burn patients with a thermal etiology were identified in the analysis. The ANN models will likely tend to overfit the data, which can be resolved by ending the model training early or adding additional regularization parameters. Further exploration of the advantages and limitations of these models is forthcoming as metric analyses become available. 
Conclusions In this proof-of-concept study, we anticipate that at least one ML model will predict the targeted outcomes of thermal burn patient mortality and length of stay as judged by the fidelity with which it matches the logistic regression analysis. These advancements can then help disaster preparedness programs consider resource limitations during catastrophic incidents resulting in burn injuries.


2016 ◽  
Author(s):  
Geoffrey Fouad ◽  
André Skupin ◽  
Christina L. Tague

Abstract. Percentile flows are statistics derived from the flow duration curve (FDC) that describe the flow equaled or exceeded for a given percent of time. These statistics provide important information for managing rivers, but are often unavailable since most basins are ungauged. A common approach for predicting percentile flows is to deploy regional regression models based on gauged percentile flows and related independent variables derived from physical and climatic data. The first step of this process identifies groups of basins through a cluster analysis of the independent variables, followed by the development of a regression model for each group. This entire process hinges on the independent variables selected to summarize the physical and climatic state of basins. Distributed physical and climatic datasets now exist for the contiguous United States (US). However, it remains unclear how to best represent these data for the development of regional regression models. The study presented here developed regional regression models for the contiguous US, and evaluated the effect of different approaches for selecting the initial set of independent variables on the predictive performance of the regional regression models. An expert assessment of the dominant controls on the FDC was used to identify a small set of independent variables likely related to percentile flows. A data-driven approach was also applied to evaluate two larger sets of variables that consist of either (1) the averages of data for each basin or (2) both the averages and statistical distribution of basin data distributed in space and time. The small set of variables from the expert assessment of the FDC and two larger sets of variables for the data-driven approach were each applied for a regional regression procedure. Differences in predictive performance were evaluated using 184 validation basins withheld from regression model development. 
The small set of independent variables selected through expert assessment produced similar, if not better, performance than the two larger sets of variables. A parsimonious set of variables only consisted of mean annual precipitation, potential evapotranspiration, and baseflow index. Additional variables in the two larger sets of variables added little to no predictive information. Regional regression models based on the parsimonious set of variables were developed using 734 calibration basins, and were converted into a tool for predicting 13 percentile flows in the contiguous US. Supplementary Material for this paper includes an R graphical user interface for predicting the percentile flows of basins within the range of conditions used to calibrate the regression models. The equations and performance statistics of the models are also supplied in tabular form.
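For readers unfamiliar with percentile flows, the sketch below computes them from a gauged daily-flow record, reading the flow exceeded p percent of the time as the (100 - p)th percentile of the sorted flows. The linear-interpolation convention and the sample values are assumptions for illustration; the paper may use a different plotting position.

```python
def percentile_flow(flows, exceed_pct):
    """Flow equaled or exceeded `exceed_pct` percent of the time.

    Uses linear interpolation between order statistics of the sorted record.
    """
    if not flows:
        raise ValueError("flows must not be empty")
    if not 0 <= exceed_pct <= 100:
        raise ValueError("exceed_pct must be between 0 and 100")
    ordered = sorted(flows)
    # Exceeded p% of the time corresponds to the (100 - p)th percentile.
    rank = (100 - exceed_pct) * (len(ordered) - 1) / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical daily flows (arbitrary units) for a gauged basin.
daily_flows = [12, 3, 7, 30, 5, 9, 18, 4, 6, 11, 2]

for p in (10, 50, 90):
    print(f"Q{p} = {percentile_flow(daily_flows, p):.2f}")
```

Low exceedance percentages such as Q10 describe high flows, and high percentages such as Q90 describe low flows.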


2010 ◽  
Vol 121-122 ◽  
pp. 346-349
Author(s):  
Yu Qin Sun ◽  
Yuan Ttao Jiang ◽  
Yong Ge Tian

One century ago (1910), the Hungarian mathematician Alfred Haar introduced the simplest wavelets in approximation theory, which are now known as the Haar wavelets. This type of wavelet can be used effectively to fit data in statistical applications. It is well known that, for a general regression model, it is not easy to write estimates of its parameters in analytical form. However, regression models generated from the Haar wavelets are easy to compute. In this article, we introduce how to use the Haar wavelets to formulate regression models and to fit data. In addition, we mention some variations of the Haar wavelets and their possible applications.
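One way to see why Haar-based regression is easy to compute is sketched below. It assumes the data sit on an equispaced dyadic grid on [0, 1), where the Haar basis columns are mutually orthogonal, so each least-squares coefficient is a simple ratio of sums. The function names and sample data are illustrative and are not taken from the article.

```python
def haar_psi(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def haar_basis(max_level):
    """Constant function plus psi_{j,k} for levels j < max_level."""
    funcs = [lambda x: 1.0]
    for j in range(max_level):
        for k in range(2 ** j):
            funcs.append(
                lambda x, j=j, k=k: 2 ** (j / 2) * haar_psi(2 ** j * x - k)
            )
    return funcs


def fit_haar(x, y, max_level):
    """Least-squares coefficients, one basis function at a time.

    The per-column formula is valid only because the columns are orthogonal
    on the assumed equispaced dyadic grid.
    """
    basis = haar_basis(max_level)
    coefs = []
    for f in basis:
        column = [f(xi) for xi in x]
        numerator = sum(c * yi for c, yi in zip(column, y))
        coefs.append(numerator / sum(c * c for c in column))
    return basis, coefs


def predict(basis, coefs, x):
    """Evaluate the fitted Haar expansion at a point x in [0, 1)."""
    return sum(c * f(x) for f, c in zip(basis, coefs))


# Hypothetical data on the midpoint grid x_i = (i + 0.5) / 8.
n = 8
x = [(i + 0.5) / n for i in range(n)]
y = [1.0, 3.0, 2.0, 2.0, 5.0, 7.0, 6.0, 8.0]

basis, coefs = fit_haar(x, y, max_level=2)
print([round(predict(basis, coefs, xi), 3) for xi in x])
```

With `max_level=2` the fit is piecewise constant on four intervals, each value being the mean of the two observations in that interval. With `max_level=3` the eight basis functions reproduce the eight observations exactly.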


2020 ◽  
Vol 9 (3) ◽  
pp. 164-172
Author(s):  
Changsheng Jiang ◽  
Piaopiao Zhao ◽  
Weihua Li ◽  
Yun Tang ◽  
Guixia Liu

Abstract Neurotoxicity is one of the main causes of drug withdrawal, and the biological experimental methods for detecting neurotoxicity are time-consuming and laborious. In addition, the existing computational prediction models of neurotoxicity still have some shortcomings. In response to these shortcomings, we collected a large neurotoxicity data set and used PyBioMed molecular descriptors and eight machine learning algorithms to construct regression prediction models of chemical neurotoxicity. Cross-validation and test set validation of the models showed that the extra-trees regressor model had the best predictive performance for neurotoxicity (${q}_{\mathrm{test}}^2$ = 0.784). In addition, we obtained the applicability domain of the models by calculating the standard deviation distance and the lever distance of the training set. By calculating the contribution of the molecular descriptors to the models, we also found that some molecular descriptors are closely related to neurotoxicity. Considering the accuracy of the regression models, we recommend using the extra-trees regressor model to predict the chemical autonomic neurotoxicity.


2021 ◽  
Vol 2 (2) ◽  
pp. 40-47
Author(s):  
Sunil Kumar ◽  
Vaibhav Bhatnagar

Machine learning is one of the active fields and technologies for realizing artificial intelligence (AI). The complexity of machine learning algorithms makes it difficult to predict which algorithm is best. Machine learning (ML) offers many complex algorithms for finding regression trends, so establishing the correlation between variables is very difficult; this paper therefore reviews the different types of regression used in machine learning. There are mainly six types of regression model: linear, logistic, polynomial, ridge, Bayesian linear, and lasso. This paper gives an overview of these regression models and compares them and their suitability for machine learning. Data analysis requires establishing associations among the many variables in a data set, and such associations are essential for forecasting and exploring data. Regression analysis is one procedure for establishing associations within data sets. This paper focuses predominantly on the different regression analysis models and how they are used in the context of different data sets in machine learning. Selecting the right model for analysis is the most challenging task; hence, these models are considered thoroughly in this study. When these models are applied appropriately and with accurate data sets, data exploration and forecasting can provide the most accurate results.


2020 ◽  
Author(s):  
Yue Ruan ◽  
Alexis Bellot ◽  
Zuzana Moysova ◽  
Garry D. Tan ◽  
Alistair Lumb ◽  
...  

Objective We analyzed data from inpatients with diabetes admitted to a large university hospital to predict the risk of hypoglycemia through the use of machine learning algorithms. Research Design and Methods Four years of data were extracted from a hospital electronic health record system. This included laboratory and point-of-care blood glucose (BG) values to identify biochemical and clinically significant hypoglycaemic episodes (BG ≤ 3.9 and ≤ 2.9 mmol/L, respectively). We used patient demographics, administered medications, vital signs, laboratory results and procedures performed during the hospital stays to inform the model. Two iterations of the dataset included the doses of insulin administered and the past history of inpatient hypoglycaemia. Eighteen different prediction models were compared using the area under the curve of the receiver operating characteristic (AUC_ROC) through ten-fold cross-validation. Results We analyzed data obtained from 17,658 inpatients with diabetes who underwent 32,758 admissions between July 2014 and August 2018. The predictive factors from the logistic regression model included people undergoing procedures, weight, type of diabetes, oxygen saturation level, use of medications (insulin, sulfonylurea, metformin) and albumin levels. The machine learning model with the best performance was the XGBoost model (AUC_ROC 0.96). This outperformed the logistic regression model, which had an AUC_ROC of 0.75 for the estimation of the risk of clinically significant hypoglycaemia. Conclusions Advanced machine learning models are superior to logistic regression models in predicting the risk of hypoglycemia in inpatients with diabetes. Trials of such models should be conducted in real time to evaluate their utility to reduce inpatient hypoglycaemia.
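The AUC_ROC metric used to compare the models can be read as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition. The labels and scores are made-up, and this is not the study's evaluation code, which additionally used ten-fold cross-validation.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. a hypoglycaemic episode), 0 otherwise.
    scores: predicted risk, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four admissions.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(labels, scores))
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based formulation is preferable for records on the scale of tens of thousands of admissions.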


2020 ◽  
Vol 9 (3) ◽  
pp. 678-695
Author(s):  
Zuhur Alatawi

A business committed to CSR activities can establish a favourable reputation in the market; this reputation can, in turn, be used to mislead the market by encouraging it to rely on the organisation's financial reporting. This study aimed to investigate the relationship between CSR and earnings quality for firms listed on the FTSE 350. It also aimed to explore the impact of CSR on management's motivation to improve earnings quality or to manage earnings. The research applied LSDV regression and OLS regression to data collected from 217 firms listed on the FTSE 350. The regression models were applied with earnings quality as the dependent variable and a range of independent variables: CSR, SIZE, GROWTH, LEVERAGE and ROA. The correlation coefficient was also calculated; however, it could not reveal the nature of the relationship between the variables, hence the regression models were applied. The results revealed no relationship between earnings quality and CSR under the LSDV regression model. The same was observed for the OLS model; however, there is a relatively significant relationship between earnings quality and LEVERAGE. Similar findings were recorded for earnings quality and GROWTH.

