Forecasting of Tourism Companies Before and During Covid-19

2021 ◽  
Vol 14 (2) ◽  
pp. 99-106
Author(s):  
Asima Sarkar ◽  

For about the last two years, the whole world has been suffering from a novel disease, Covid-19. When it was first diagnosed in China, even the major health agencies could not predict the severity and spread of the disease. As the novel coronavirus outbreak grew, countries halted all travel, both interstate and international, and tourism companies began facing huge losses due to lockdowns in every country. In this paper, the stock prices of the multinational tourism companies that operate in India have been forecasted using an online learning algorithm known as the Gated Recurrent Unit (GRU). Predicting stock prices is not an easy task; it requires extensive study of the stock market and the intervention of statistical and machine learning models. We try to determine whether forecasting before the pandemic is better than forecasting during the pandemic for each of the six leading multinational tourism companies.
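The GRU mentioned above maintains a hidden state through update and reset gates. A minimal single-unit sketch of those gate equations in NumPy (toy weights and a toy sequence of price changes, not the paper's trained model):

```python
import numpy as np

# Single-unit GRU cell; W, U, b each hold (update, reset, candidate) parameters.
def gru_step(x, h, W, U, b):
    z = 1 / (1 + np.exp(-(W[0] * x + U[0] * h + b[0])))  # update gate
    r = 1 / (1 + np.exp(-(W[1] * x + U[1] * h + b[1])))  # reset gate
    n = np.tanh(W[2] * x + U[2] * (r * h) + b[2])        # candidate state
    return (1 - z) * n + z * h                           # blended new state

# Toy weights and a toy sequence of daily price changes.
W, U, b = np.array([0.5, 0.5, 1.0]), np.array([0.1, 0.1, 0.2]), np.zeros(3)
h = 0.0
for price_change in [0.02, -0.01, 0.005]:
    h = gru_step(price_change, h, W, U, b)
print(h)  # final hidden state summarising the sequence
```

In an online-learning setting, the same cell would be applied to each newly arriving price and its parameters updated incrementally.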

2020 ◽  
Vol 22 (10) ◽  
pp. 694-704 ◽  
Author(s):  
Wanben Zhong ◽  
Bineng Zhong ◽  
Hongbo Zhang ◽  
Ziyi Chen ◽  
Yan Chen

Aim and Objective: Cancer is one of the deadliest diseases, taking the lives of millions every year. Traditional methods of treating cancer are expensive and toxic to normal cells. Fortunately, anti-cancer peptides (ACPs) can eliminate these side effects. However, identifying and developing new anti-cancer peptides remains challenging. Materials and Methods: In our study, a multi-classifier system combining multiple machine learning models was used to predict anti-cancer peptides. The individual learners are built from different feature information and algorithms, and form a multi-classifier system by voting. Results and Conclusion: The experiments show that the overall prediction rate of each individual learner is above 80%, and the overall accuracy of the multi-classifier system for anti-cancer peptide prediction reaches 95.93%, which is better than existing prediction models.
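A voting multi-classifier of the kind described can be sketched with scikit-learn; the base learners and synthetic data below are placeholders, not the study's actual peptide encodings or algorithms:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder data standing in for peptide feature vectors.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Individual learners built from different algorithms, combined by majority vote.
ensemble = VotingClassifier([
    ("rf", RandomForestClassifier(random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC(random_state=0)),
], voting="hard")
ensemble.fit(X_tr, y_tr)
print(f"ensemble accuracy: {ensemble.score(X_te, y_te):.3f}")
```

Hard voting returns the majority class; giving each learner different feature views, as the study does, would mean fitting them on different column subsets before voting.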


ADMET & DMPK ◽  
2020 ◽  
Author(s):  
John Mitchell

We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally to one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.
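The consensus construction described (Random Forest and Extra Trees combined with a Bagging classifier) can be sketched with scikit-learn; the synthetic features below stand in for molecular descriptors, which are not given here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for molecular descriptor features.
X, y = make_classification(n_samples=400, n_features=30, random_state=1)

# Consensus of the three tree-based classifiers, averaging predicted probabilities.
consensus = VotingClassifier([
    ("rf", RandomForestClassifier(random_state=1)),
    ("et", ExtraTreesClassifier(random_state=1)),
    ("bag", BaggingClassifier(random_state=1)),
], voting="soft")
scores = cross_val_score(consensus, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Soft voting averages the three members' class probabilities, which is one way a consensus can exploit the Wisdom of Crowds: uncorrelated errors tend to cancel.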


2021 ◽  
Author(s):  
Jan Wolff ◽  
Ansgar Klimke ◽  
Michael Marschollek ◽  
Tim Kacprowski

Introduction The COVID-19 pandemic has strong effects on most health care systems and individual service providers. Forecasting admissions can support the efficient organisation of hospital care. We aimed to forecast the number of admissions to psychiatric hospitals before and during the COVID-19 pandemic, and we compared the performance of machine learning models and time series models. This would eventually support timely resource allocation for the optimal treatment of patients. Methods We used admission data from 9 psychiatric hospitals in Germany between 2017 and 2020. We compared machine learning models with time series models for weekly, monthly and yearly forecasting before and during the COVID-19 pandemic. Our models were trained and validated with data from the first two years and tested in prospectively sliding time windows over the last two years. Results A total of 90,686 admissions were analysed. The models explained up to 90% of the variance in hospital admissions in 2019 and 75% in 2020, under the effects of the COVID-19 pandemic. The best models substantially outperformed a one-step seasonal naive forecast (seasonal mean absolute scaled error (sMASE) 2019: 0.59, 2020: 0.76). The best model in 2019 was a machine learning model (elastic net, mean absolute error (MAE): 7.25). The best model in 2020 was a time series model (exponential smoothing state space model with Box-Cox transformation, ARMA errors, and trend and seasonal components, MAE: 10.44), which adjusted more quickly to the shock effects of the COVID-19 pandemic. Models forecasting admissions one week in advance did not perform better than monthly and yearly models in 2019, but they did in 2020. The most important features for the machine learning models were calendar variables. Conclusion Model performance did not vary much between modelling approaches before the COVID-19 pandemic, and established forecasts were substantially better than one-step seasonal naive forecasts. However, weekly time series models adjusted more quickly to the COVID-19 related shock effects. In practice, different forecast horizons could be used simultaneously to allow both early planning and quick adjustment to external effects.
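The sMASE reported above scales a forecast's MAE by the in-sample MAE of a one-step seasonal naive forecast, so values below 1 beat that baseline. A minimal sketch, assuming a weekly season of 52 and synthetic admission counts:

```python
import numpy as np

def smase(y_train, y_true, y_pred, season=52):
    """Forecast MAE scaled by the in-sample MAE of a seasonal naive forecast."""
    naive_mae = np.mean(np.abs(y_train[season:] - y_train[:-season]))
    return np.mean(np.abs(y_true - y_pred)) / naive_mae

# Two years of weekly training data plus an 8-week test window, synthetic.
rng = np.random.default_rng(0)
weeks = np.arange(112)
series = 100 + 10 * np.sin(weeks * 2 * np.pi / 52) + rng.normal(0, 1, 112)
y_train, y_true = series[:104], series[104:]

# Score a forecast that is off by one admission per week.
print(round(smase(y_train, y_true, y_true + 1.0), 3))
```

A perfect forecast scores 0; a forecast no better than repeating last season's value scores about 1.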


2021 ◽  
Vol 3 ◽  
Author(s):  
Ali Alim-Marvasti ◽  
Fernando Pérez-García ◽  
Karan Dahele ◽  
Gloria Romagnoli ◽  
Beate Diehl ◽  
...  

Background: Epilepsy affects 50 million people worldwide, and a third are refractory to medication. If a discrete cerebral focus or network can be identified, neurosurgical resection can be curative. Most excisions are in the temporal lobe and are more likely to result in seizure freedom than extra-temporal resections. However, less than half of patients undergoing surgery become entirely seizure-free. Localizing the epileptogenic zone and making individualized outcome predictions are difficult, requiring detailed evaluations at specialist centers. Methods: We used bespoke natural language processing to text-mine 3,800 electronic health records from 309 epilepsy surgery patients, evaluated over a decade, of whom 126 remained entirely seizure-free. We investigated the diagnostic performance of machine learning models using set-of-semiology (SoS) with and without hippocampal sclerosis (HS) on MRI as features, using STARD criteria. Findings: Support Vector Classifiers (SVC) and Gradient Boosted (GB) decision trees were the best performing algorithms for temporal-lobe epileptogenic zone localization (cross-validated Matthews correlation coefficient (MCC) SVC 0.73 ± 0.25, balanced accuracy 0.81 ± 0.14, AUC 0.95 ± 0.05). Models that only used seizure semiology were not always better than internal benchmarks. The combination of multimodal features, however, enhanced performance metrics including MCC and normalized mutual information (NMI) compared to either alone (p < 0.0001). This combination of semiology and HS on MRI increased both cross-validated MCC and NMI by over 25% (NMI, SVC SoS: 0.35 ± 0.28 vs. SVC SoS+HS: 0.61 ± 0.27). Interpretation: Machine learning models using only the set of seizure semiology (SoS) cannot unequivocally perform better than benchmarks in temporal epileptogenic-zone localization. However, the combination of SoS with an imaging feature (HS) enhances epileptogenic lobe localization. We quantified this added NMI value to be 25% in absolute terms. Despite good performance in localization, no model was able to predict seizure freedom better than benchmarks. The methods used are widely applicable, and the performance enhancements from combining other clinical, imaging and neurophysiological features could be similarly quantified. Multicenter studies are required to confirm generalizability. Funding: Wellcome/EPSRC Center for Interventional and Surgical Sciences (WEISS) (203145Z/16/Z).
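The two headline metrics above, MCC and NMI, can both be computed with scikit-learn; the labels below are toy values, not the study's data:

```python
from sklearn.metrics import matthews_corrcoef, normalized_mutual_info_score

# Toy predicted vs. true lobe labels (3 TP, 1 FN, 1 FP, 3 TN).
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(f"MCC: {matthews_corrcoef(y_true, y_pred):.2f}")
print(f"NMI: {normalized_mutual_info_score(y_true, y_pred):.2f}")
```

MCC ranges from -1 to 1 and is robust to class imbalance; NMI ranges from 0 to 1 and measures how much information the predictions share with the true labels, which is why the study can express a gain in absolute NMI terms.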


2021 ◽  
Author(s):  
Scott Kulm ◽  
Lior Kofman ◽  
Jason Mezey ◽  
Olivier Elemento

Abstract A patient's risk for cancer is usually estimated through simple linear models that sum the effect sizes of proven risk factors. In theory, more advanced machine learning models can be used for the same task. Using data from the UK Biobank, a large prospective health study, we have developed linear and machine learning models for the prediction of 12 different cancer diagnoses within a 10-year time span. We find that the top machine learning algorithm, XGBoost (XGB), trained on 707 features, generated an average area under the receiver operator curve of 0.736 (with a range of 0.65-0.85). Linear models trained with only 10 features were found to be statistically indistinguishable from the machine learning performance. The linear models were significantly more accurate than the prominent QCancer models (p = 0.0019), which are trained on 45 million patient records and available to over 4,000 United Kingdom general practices. The increase in accuracy may be caused by the consideration of often omitted feature types, including survey answers, census records, and genetic information. This approach led to the discovery of significant novel risk features, including self-reported happiness with own health (relevant to 12 cancers), measured testosterone (relevant to 8 cancers), and ICD codes for rehabilitation procedures (relevant to 3 cancers). These ten-feature models can be easily implemented in the clinic, allowing for personalized screening schedules that may increase cancer survival within a population.
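The comparison above, a boosted-tree model on many features versus a small linear model, can be sketched with scikit-learn; GradientBoostingClassifier stands in for XGBoost here, and the synthetic data and feature split are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic risk data; with shuffle=False the 10 informative columns come first.
X, y = make_classification(n_samples=1000, n_features=50, n_informative=10,
                           shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gb = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)   # all features
lr = LogisticRegression(max_iter=1000).fit(X_tr[:, :10], y_tr)    # 10 features

auc_gb = roc_auc_score(y_te, gb.predict_proba(X_te)[:, 1])
auc_lr = roc_auc_score(y_te, lr.predict_proba(X_te[:, :10])[:, 1])
print(f"boosted AUC: {auc_gb:.3f}, 10-feature linear AUC: {auc_lr:.3f}")
```

When a handful of features carries most of the signal, the small linear model can approach the full model's AUC, which mirrors the study's finding.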


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
I Korsakov ◽  
A Gusev ◽  
T Kuznetsova ◽  
D Gavrilov ◽  
R Novitskiy

Abstract Background Advances in precision medicine will require an increasingly individualized prognostic evaluation of patients in order to provide each patient with appropriate therapy. The traditional statistical methods of predictive modeling, such as SCORE, PROCAM, and Framingham, used in the European guidelines for the prevention of cardiovascular disease, are not adapted to all patients and require significant human involvement in the selection of predictive variables and the transformation and imputation of variables. In ROC analysis for the prediction of significant cardiovascular disease (CVD), the areas under the curve are 0.62–0.72 for Framingham, 0.66–0.73 for SCORE and 0.60–0.69 for PROCAM. To improve on this, we applied machine learning and deep learning models to 10-year CVD event prediction, relying on conventional risk factors and longitudinal electronic health records (EHR). Methods For machine learning, we applied logistic regression (LR), and as a deep learning algorithm we used recurrent neural networks with long short-term memory (LSTM) units. From the longitudinal EHR we extracted the following features: demographics, vital signs, diagnoses (ICD-10-cm: I21-I22.9, I61-I63.9) and medication. The problem at this step is that nearly 80 percent of the clinical information in the EHR is "unstructured" and contains errors and typos. Handling missing data is important for correctly training deep learning and machine learning algorithms. The study cohort included patients between the ages of 21 and 75 with a dynamic observation window. In total, the dataset contained 31,517 individuals, but only 3,652 individuals had all features present or missing values that could easily be imputed. Among these 3,652 individuals, 29.4% had CVD; the mean age was 49.4 years; 68.2% were female. Evaluation We randomly divided the dataset into a training and a test set with an 80/20 split. The LR was implemented with Python Scikit-Learn and the LSTM model was implemented with Keras using TensorFlow as the backend. Results We applied the machine learning and deep learning models using the same features as the traditional risk scales and using longitudinal EHR features for CVD prediction, respectively. The machine learning model (LR) achieved an AUROC of 0.74–0.76 and the deep learning model (LSTM) 0.75–0.76. Using EHR features, the logistic regression and deep learning models improved the AUROC to 0.78–0.79. Conclusion The machine learning models outperformed traditional clinically-used predictive models for CVD risk prediction (i.e. the SCORE, PROCAM, and Framingham equations). This approach was used to create a clinical decision support system (CDSS). It uses both traditional risk scales and models based on neural networks. Especially important is the fact that the system can calculate the risks of cardiovascular disease automatically and recalculate them immediately after new information is added to the EHR. The results are delivered to the user's personal account.
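The logistic regression arm of this pipeline (80/20 split, AUROC evaluation with scikit-learn) can be sketched as follows; the synthetic features stand in for the demographic, vital-sign, diagnosis and medication variables, with a class balance loosely matching the cohort's 29.4% CVD rate:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic cohort: ~29.4% positive class, mirroring the CVD rate reported above.
X, y = make_classification(n_samples=3652, n_features=20, weights=[0.706],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auroc = roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1])
print(f"AUROC: {auroc:.3f}")
```

The LSTM arm would instead consume each patient's visit history as a sequence; that part is omitted here to keep the sketch dependency-free.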


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Sejoong Kim ◽  
Yeonhee Lee ◽  
Seung Seok Han

Abstract Background and Aims The precise prediction of acute kidney injury (AKI) after nephrectomy for renal cell carcinoma (RCC) is an important issue because of its relationship with subsequent kidney dysfunction and high mortality. Herein we addressed whether machine learning algorithms could predict postoperative AKI risk better than conventional logistic regression (LR) models. Method A total of 4,104 RCC patients who had undergone unilateral nephrectomy from January 2003 to December 2017 were reviewed. Machine learning models such as support vector machine, random forest, extreme gradient boosting, and light gradient boosting machine (LightGBM) were developed, and their performance based on the area under the receiver operating characteristic curve, accuracy, and F1 score was compared with that of the LR-based scoring model. Results Postoperative AKI developed in 1,167 patients (28.4%). All the machine learning models had higher performance index values than the LR-based scoring model. Among them, the LightGBM model had the highest value of 0.810 (0.783–0.837). The decision curve analysis demonstrated a greater net benefit of the machine learning models than the LR-based scoring model over all the ranges of threshold probabilities. The LightGBM and random forest models, but not others, were well calibrated. Conclusion The application of machine learning algorithms improves the predictability of AKI after nephrectomy for RCC, and these models perform better than conventional LR-based models.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hyung Woo Kim ◽  
Seok-Jae Heo ◽  
Jae Young Kim ◽  
Annie Kim ◽  
Chung-Mo Nam ◽  
...  

Abstract Dialysis adequacy is an important survival indicator in patients on chronic hemodialysis. However, measuring dialysis adequacy from blood samples is inconvenient and has disadvantages. This study used machine learning models to predict dialysis adequacy in chronic hemodialysis patients using repeatedly measured data collected during hemodialysis. The study included 1333 hemodialysis sessions corresponding to the monthly examination dates of 61 patients. Patient demographics and clinical parameters were continuously measured from the hemodialysis machine; 240 measurements were collected from each hemodialysis session. Machine learning models (random forest and extreme gradient boosting [XGBoost]) and deep learning models (convolutional neural network and gated recurrent unit) were compared with multivariable linear regression models. The mean absolute percentage error (MAPE), root mean square error (RMSE), and Spearman's rank correlation coefficient (Corr) for each model, computed using fivefold cross-validation, served as performance measurements. The XGBoost model had the best performance among all methods (MAPE = 2.500; RMSE = 2.906; Corr = 0.873). The deep learning models with convolutional neural network (MAPE = 2.835; RMSE = 3.125; Corr = 0.833) and gated recurrent unit (MAPE = 2.974; RMSE = 3.230; Corr = 0.824) had similar performance. The linear regression models had the lowest performance (MAPE = 3.284; RMSE = 3.586; Corr = 0.770). Machine learning methods can accurately infer hemodialysis adequacy using continuously measured data from hemodialysis machines.
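The fivefold cross-validation with MAPE, RMSE and Spearman correlation described above can be sketched as follows; the random-forest regressor and synthetic data are stand-ins for the study's models and intradialytic measurements:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

# Toy regression target standing in for measured dialysis adequacy.
X, y = make_regression(n_samples=300, n_features=10, noise=5.0, random_state=0)
y = y - y.min() + 100  # shift positive so MAPE is well defined

mape, rmse, corr = [], [], []
for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = RandomForestRegressor(random_state=0).fit(X[tr], y[tr])
    pred = model.predict(X[te])
    mape.append(np.mean(np.abs((y[te] - pred) / y[te])) * 100)
    rmse.append(np.sqrt(np.mean((y[te] - pred) ** 2)))
    rho, _ = spearmanr(y[te], pred)
    corr.append(rho)
print(f"MAPE {np.mean(mape):.2f}%, RMSE {np.mean(rmse):.2f}, "
      f"Corr {np.mean(corr):.3f}")
```

Averaging the three metrics over the five folds, as done here, matches how the study reports its performance measurements.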


2019 ◽  
Author(s):  
Mohammed Moreb ◽  
Oguz Ata

Abstract Background We propose a novel framework for health informatics: a framework and methodology of Software Engineering for Machine Learning in Health Informatics (SEMLHI). The framework's features allow users to study and analyze the requirements, determine the functions of the objects related to the system, and determine the machine learning algorithms to be used on the dataset. Methods Based on original data collected from a government hospital in Palestine over the past three years, the data were first validated and all outliers removed, then analyzed using the developed framework in order to compare machine learning models that provide patients with real-time predictions. Our proposed module was compared with three systems engineering methods: Vee, Agile and SEMLHI. The results were used to implement a prototype system requiring a machine learning algorithm; after the development phase, a questionnaire was delivered to developers to assess the results of using the three methodologies. The SEMLHI framework is composed of four components: software, machine learning model, machine learning algorithms, and health informatics data. The machine learning algorithm component used five algorithms to evaluate the accuracy of the machine learning models. Results We compared our approach with previously published systems in terms of performance, evaluating the accuracy of the machine learning models. With the different algorithms applied to 750 cases, linear SVC achieved an accuracy of about 0.57, compared with the KNeighbors classifier, logistic regression, multinomial NB and random forest classifier. This research investigates the interaction between SE and ML within the context of health informatics. Our proposed framework defines a methodology for developers to analyze and develop software for health informatics models, and creates a space in which software engineering and ML experts can work on the ML model lifecycle, at the disease level and the subtype level. Conclusions This article is an ongoing effort towards defining and translating an existing research pipeline into four integrated modules, as a framework system using healthcare datasets to reduce cost estimation through the newly suggested methodology. The framework is available as open source software, licensed under the GNU General Public License Version 3, to encourage others to contribute to its future development.


2021 ◽  
Author(s):  
Liying Mo ◽  
Yuangang Su ◽  
Jianhui Yuan ◽  
Zhiwei Xiao ◽  
Ziyan Zhang ◽  
...  

Abstract Background: Machine learning methods have shown excellent predictive ability in a wide range of fields. For the survival of head and neck squamous cell carcinoma (HNSC), multi-omics influences are crucial. This study attempts to establish a variety of machine learning multi-omics models to predict the survival of HNSC and to find the most suitable machine learning prediction method. Results: For HNSC omics, the results of all six models showed that the performance of multi-omics was better than that of each single-omic alone. The Bayesian network (BN) model showed good prediction performance (area under the curve [AUC] 0.8250) on the HNSC multi-omics data. The other machine learning models RF (AUC = 0.8002), NN (AUC = 0.7200), and GLM (AUC = 0.7145) also showed high predictive performance, except for DT (AUC = 0.5149) and SVM (AUC = 0.6981). The results of in vitro qPCR were consistent with those of the random forest algorithm. Conclusion: Machine learning methods can better forecast the survival outcome of HNSC. Meanwhile, this study found the Bayesian network to be the most suitable method, and the multi-omics forecasts were better than those from any single-omic alone in HNSC.
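A multi-model AUC comparison of the kind reported can be sketched with scikit-learn; Gaussian naive Bayes loosely stands in for the Bayesian network and logistic regression for the GLM, and the synthetic features are not real multi-omics data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for concatenated multi-omics feature blocks.
X, y = make_classification(n_samples=500, n_features=40, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {"NB": GaussianNB(),
          "RF": RandomForestClassifier(random_state=0),
          "DT": DecisionTreeClassifier(random_state=0),
          "SVM": SVC(probability=True, random_state=0),
          "GLM": LogisticRegression(max_iter=1000)}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.4f}")
```

In a genuine multi-omics version, each omic layer would contribute its own feature block, and the single-omic baselines would be fitted on one block at a time.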

