A Comparison of GLOWER and Other Machine Learning Methods for Investment Decision Making

Author(s):  
Vasant Dhar
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Abstract. Background: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.


Author(s):  
Mojtaba Montazery ◽  
Nic Wilson

Support Vector Machines (SVM) are among the most well-known machine learning methods, with broad use in different scientific areas. However, one necessary pre-processing phase for SVM is normalization (scaling) of features, since SVM is not invariant to the scales of the features’ spaces, i.e., different ways of scaling may lead to different results. We define a more robust decision-making approach for binary classification, in which one sample strongly belongs to a class if it belongs to that class for all possible rescalings of features. We derive a way of characterising the approach for binary SVM that allows determining when an instance strongly belongs to a class and when the classification is invariant to rescaling. The characterisation leads to a computation method to determine whether one sample is strongly positive, strongly negative or neither. Our experimental results back up the intuition that being strongly positive suggests stronger confidence that an instance really is positive.
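The idea of a sample "strongly belonging" to a class can be illustrated with a toy case. The sketch below is not the characterisation derived by the authors. It assumes a hard-margin linear SVM trained on just one positive and one negative point, where the boundary is the perpendicular bisector of the two points, and it checks the label of a test point over a small grid of sampled scale factors. The function names, training points, and grid are all hypothetical.

```python
from itertools import product


def two_point_svm_score(pos, neg, x, scale):
    """Signed score of x under a hard-margin linear SVM trained on one
    positive and one negative point, after multiplying feature j by scale[j].

    With only two training points, the max-margin boundary is the
    perpendicular bisector of the segment between them, so the score below
    is a positive multiple of the SVM decision value (same sign).
    """
    score = 0.0
    for p, n, xi, s in zip(pos, neg, x, scale):
        p, n, xi = s * p, s * n, s * xi          # rescale feature j
        score += (p - n) * (xi - (p + n) / 2.0)  # w_j * (x_j - midpoint_j)
    return score


def label_over_rescalings(pos, neg, x, factors=(0.1, 1.0, 10.0)):
    """Classify x under every combination of the sampled scale factors.

    A finite grid can only suggest, not prove, that the label holds for
    all possible rescalings.
    """
    labels = {
        two_point_svm_score(pos, neg, x, scale) > 0
        for scale in product(factors, repeat=len(x))
    }
    if labels == {True}:
        return "strongly positive"
    if labels == {False}:
        return "strongly negative"
    return "neither"


pos, neg = (2.0, 2.0), (0.0, 0.0)  # hypothetical training points
for x in [(3.0, 3.0), (3.0, 0.5), (0.0, 0.5)]:
    print(x, label_over_rescalings(pos, neg, x))
```

In this toy setting the score is a sum of per-feature terms, each multiplied by the square of its scale factor, so the label can flip only when those terms disagree in sign. A real SVM with many training points does not decompose this simply, which is why an exact characterisation is needed.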


Author(s):  
Yusuf S. Türkan ◽  
Hacer Yumurtacı Aydoğmuş ◽  
Hamit Erdal

In Turkey, many entrepreneurs began investing in renewable energy systems after new legal regulations and stimulus packages for renewable energy production were introduced. Among the many alternatives, electricity production via wind farms is one of the leading systems. For these systems, the wind speed values measured prior to the establishment of the farms are extremely important both in decision making and in projecting the investment. However, measuring wind speed at different heights is a time-consuming and expensive process. For this reason, the success of techniques for predicting wind speed is important for fast and reliable decision making on wind farm investments. In this study, annual wind speed values measured at two different heights in Kutahya, one of the regions in Turkey with potential for wind energy, were used; wind speed values at a height of 30 m were predicted from the values at 10 m by seven different machine learning methods. The results of the analysis were compared with each other. The results show that support vector machines are a successful technique for predicting wind speed at different heights.
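The prediction task described here, estimating the 30 m wind speed from the 10 m reading, can be sketched with the simplest possible learner. The example below uses ordinary least squares on a single feature, which is not claimed to be one of the seven methods in the study, and the readings are invented for illustration only. It shows the general pattern of fitting on one part of the data and scoring on held-out readings.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(predicted, actual):
    """Root mean squared error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


# Hypothetical paired readings in m/s (NOT data from the study).
speed_10m = [3.1, 4.0, 5.2, 6.1, 7.3, 8.0]
speed_30m = [4.0, 5.1, 6.4, 7.6, 8.9, 9.8]

# Fit on the first four pairs, evaluate on the last two.
slope, intercept = fit_line(speed_10m[:4], speed_30m[:4])
predictions = [slope * v + intercept for v in speed_10m[4:]]
print("slope:", slope, "intercept:", intercept)
print("hold-out RMSE:", rmse(predictions, speed_30m[4:]))
```

Comparing several methods, as the study does, amounts to repeating the fit and the hold-out error calculation for each candidate model on the same split.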


Author(s):  
Stacey Fisher ◽  
Lief Pagalan ◽  
Mack Hurst ◽  
Meghan O’Neill ◽  
Lori Diemert ◽  
...  

Introduction: Data from population health surveys, administrative health records and environmental monitoring are increasingly being linked at the individual level. As these data become available to health researchers, there is an increasing need for methods which can make sense of large, noisy and heterogeneous data and can model complex relationships. Using these data, machine learning methods have the potential to produce population health risk algorithms with better performance than those developed with traditional statistical approaches. Objectives and Approach: The objective of this work is to explore the use of machine learning methods for the development, validation and implementation of predictive risk algorithms designed specifically for population health planning purposes. Algorithms to predict risk of dementia and avoidable hospitalizations are in development using the Canadian Community Health Survey, geographic sociodemographic information, administrative health care utilization data and vital statistics. Methods being explored include naïve Bayes, gradient boosting, support vector machines and neural networks. Results: Risk algorithms for population health should generally prioritize calibration over discrimination due to implications for resource allocation decisions. Approaches to minimize the risk of overfitting should be used and reweighting of unbalanced data avoided as it distorts the population-level nature of the data. It is important to be aware of propagating underlying bias in the data or exacerbating existing health inequities, which can be evaluated in part through assessment of calibration across relevant population subgroups. Approaches that consider multi-level data structures are needed to appropriately incorporate neighbourhood-level measures with individual-level information. To maximize population health impact and acceptability, model transparency and interpretability should be prioritized.
Conclusion: There is tremendous potential for machine learning approaches to leverage large volumes of linked population data to produce predictive risk algorithms that will inform population health decision-making. Future work will explore use of complex environmental remote sensing and built environment data.
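The recommendation to assess calibration across population subgroups can be made concrete with a small sketch. The abstract does not say which calibration measure the authors use; the example below assumes the observed-to-expected ratio, one common summary, and the subgroup labels, risks, and outcomes are invented for illustration.

```python
from collections import defaultdict


def calibration_by_group(records):
    """Summarise calibration per subgroup.

    records: iterable of (group, predicted_risk, outcome), outcome in {0, 1}.
    Returns {group: (mean_predicted_risk, observed_rate, observed/expected)}.
    A ratio near 1 suggests the model is well calibrated for that group.
    """
    totals = defaultdict(lambda: [0.0, 0, 0])  # risk sum, events, count
    for group, risk, outcome in records:
        entry = totals[group]
        entry[0] += risk
        entry[1] += outcome
        entry[2] += 1

    summary = {}
    for group, (risk_sum, events, count) in totals.items():
        expected = risk_sum / count
        observed = events / count
        ratio = observed / expected if expected > 0 else float("nan")
        summary[group] = (expected, observed, ratio)
    return summary


# Hypothetical predictions for two subgroups (NOT data from the study).
records = [
    ("urban", 0.2, 0), ("urban", 0.2, 1), ("urban", 0.2, 0), ("urban", 0.2, 0),
    ("rural", 0.1, 1), ("rural", 0.1, 0),
]
for group, (expected, observed, ratio) in calibration_by_group(records).items():
    print(group, "expected:", expected, "observed:", observed, "O/E:", ratio)
```

A ratio far from 1 in one subgroup but not another is the kind of signal that could point to the model propagating bias, even when overall calibration looks acceptable.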


2020 ◽  
Vol 10 (3) ◽  
pp. 544-551 ◽  
Author(s):  
Xiongtao Zhang ◽  
Yunliang Jiang ◽  
Wenjun Hu ◽  
Shitong Wang

Diabetes is one of the deadliest diseases in the world. It is not only a disease in itself but also a cause of other conditions such as heart attack, blurred vision, nephropathy and dyspnea. When traditional machine learning methods are used to make decisions for a patient, they often face the following challenges: (1) uncertain factors in the patient or in the decision-making process often result in misdiagnosis; (2) the decision-making process of traditional machine learning methods is a black box and is not interpretable. In this paper, a parallel-based fuzzy partition and fuzzy weighted ensemble TSK (Takagi-Sugeno-Kang) fuzzy classifier called FP-TSK-FW is proposed for diabetes diagnosis, utilizing its strong uncertainty-handling capability and interpretability to achieve promising classification performance. In FP-TSK-FW, the training dataset is first partitioned into several subsets by the fuzzy clustering algorithm FCM on certain attributes; an interpretable TSK fuzzy subclassifier can then be built quickly on each training subset, in parallel and with different structures. Finally, the prediction of FP-TSK-FW is obtained by fuzzy weighting of the results of the subclassifiers. The experimental results on the Pima Indians Diabetes dataset indicate the effectiveness of the proposed method in terms of both enhanced classification performance and interpretability.
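The partition-then-combine structure described above can be sketched in a few lines. The abstract does not give the exact weighting rule, so the example assumes that each subclassifier's output is weighted by the standard fuzzy c-means (FCM) membership of the sample in that subclassifier's cluster. The cluster centers are hypothetical, and the subclassifiers are trivial stand-in functions rather than TSK fuzzy classifiers.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of point x in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    dists = [math.dist(x, c) for c in centers]
    zeros = sum(1 for d in dists if d == 0.0)
    if zeros:  # x coincides with one or more centers
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


def fuzzy_weighted_score(x, centers, subclassifiers, m=2.0):
    """Combine subclassifier outputs using FCM memberships as weights.

    This weighting rule is an assumption made for illustration.
    """
    weights = fcm_memberships(x, centers, m)
    return sum(w * clf(x) for w, clf in zip(weights, subclassifiers))


# Hypothetical cluster centers and stand-in subclassifiers (scores in [0, 1]).
centers = [(0.0, 0.0), (2.0, 0.0)]
subclassifiers = [lambda x: 1.0, lambda x: 0.0]

sample = (0.5, 0.0)
print("memberships:", fcm_memberships(sample, centers))
print("ensemble score:", fuzzy_weighted_score(sample, centers, subclassifiers))
```

Because the memberships sum to one, the ensemble score stays in the same range as the individual subclassifier outputs, and a sample close to one cluster center is dominated by that cluster's subclassifier.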

