Random forest and long short-term memory based machine learning models for classification of ion mobility spectrometry spectra

Author(s):  
Patrick C. Riley ◽  
Samir V. Deshpande ◽  
Brian S. Ince ◽  
Brian C. Hauck ◽  
Kyle P. O'Donnell ◽  
...  

2019 ◽  
Author(s):  
Suleka Helmini ◽  
Nadheesh Jihan ◽  
Malith Jayasinghe ◽  
Srinath Perera

In the retail domain, estimating sales before the actual figures become known plays a key role in maintaining a successful business, because most crucial decisions are based on these forecasts. Statistical sales forecasting models such as ARIMA (Auto-Regressive Integrated Moving Average) are among the most traditional and commonly used forecasting methodologies. Although these models can produce satisfactory forecasts for linear time series data, they are not suitable for analyzing non-linear data. Therefore, machine learning models (such as Random Forest Regression and XGBoost) have been employed frequently, as they achieve better results on non-linear data. Recent research shows that deep learning models (e.g. recurrent neural networks) can provide higher prediction accuracy than machine learning models due to their ability to persist information and identify temporal relationships. In this paper, we adopt a special variant of the Long Short-Term Memory (LSTM) network, the LSTM model with peephole connections, for sales prediction. We first build our model using historical features for sales forecasting. We compare the results of this initial LSTM model with multiple machine learning models, namely the Extreme Gradient Boosting model (XGB) and the Random Forest Regressor model (RFR). We further improve the prediction accuracy of the initial model by incorporating features that describe the future that is already known at the current moment, an approach that has not been explored in previous state-of-the-art LSTM-based forecasting models. The initial LSTM model outperforms the machine learning models by 12%-14%, whereas the improved LSTM model achieves an 11%-13% improvement over the improved machine learning models. Furthermore, we show that our improved LSTM model obtains a 20%-21% improvement over the initial LSTM model.
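The idea of adding features that describe an already-known future can be illustrated with a small windowing sketch. This is not the authors' pipeline: the function name `make_windows`, the `lookback` parameter, and the promotion flag are hypothetical, chosen only to show how a known-in-advance value for the target day can be appended to the historical features.

```python
from typing import List, Sequence, Tuple


def make_windows(
    sales: Sequence[float],
    known_future: Sequence[float],
    lookback: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, target) pairs for one-step-ahead forecasting.

    Each feature row holds the previous `lookback` sales values followed
    by the known-in-advance value for the day being predicted.
    """
    if len(sales) != len(known_future):
        raise ValueError("sales and known_future must have equal length")
    rows: List[List[float]] = []
    targets: List[float] = []
    for t in range(lookback, len(sales)):
        history = list(sales[t - lookback:t])     # historical features only
        rows.append(history + [known_future[t]])  # plus the known future value
        targets.append(sales[t])
    return rows, targets


# Hypothetical data: daily sales and a promotion flag planned in advance.
sales = [10.0, 12.0, 11.0, 15.0, 30.0, 14.0]
promo = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
X, y = make_windows(sales, promo, lookback=3)
```

Rows built this way could be fed to any regressor; dropping the last column of each row gives the history-only setting for comparison.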


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3678
Author(s):  
Dongwon Lee ◽  
Minji Choi ◽  
Joohyun Lee

In this paper, we propose a prediction algorithm that combines Long Short-Term Memory (LSTM) with an attention model to predict the vision coordinates of a viewer watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting the vision coordinates during video streaming is important when the network condition is degraded. However, traditional prediction models such as Moving Average (MA) and Autoregressive Moving Average (ARMA) are linear, so they cannot capture nonlinear relationships. Therefore, machine learning models based on deep learning have recently been used for nonlinear prediction. We use the LSTM and Gated Recurrent Unit (GRU) neural network methods, which originated in Recurrent Neural Networks (RNN), to predict the head position in 360-degree videos, and we add the attention model to the LSTM to obtain more accurate results. We also compare the performance of the proposed model with other machine learning models such as Multi-Layer Perceptron (MLP) and RNN using the root mean squared error (RMSE) between predicted and real coordinates. We demonstrate that our model predicts the vision coordinates more accurately than the other models across various videos.
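As a rough illustration of the evaluation described above, the sketch below scores a simple moving-average baseline with RMSE over two-dimensional coordinates. It makes several assumptions that are not from the paper: coordinates are plain `(x, y)` pairs, the squared error of a sample is its squared Euclidean distance, and angular wrap-around is ignored. The names `rmse` and `moving_average_forecast` are illustrative only.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def rmse(predicted: Sequence[Point], actual: Sequence[Point]) -> float:
    """Root mean squared error, using squared Euclidean distance per sample."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    )
    return math.sqrt(total / len(actual))


def moving_average_forecast(track: Sequence[Point], window: int) -> List[Point]:
    """Predict each point as the mean of the previous `window` points."""
    preds: List[Point] = []
    for t in range(window, len(track)):
        recent = track[t - window:t]
        preds.append(
            (sum(p[0] for p in recent) / window,
             sum(p[1] for p in recent) / window)
        )
    return preds


# Hypothetical track moving steadily along x; the linear baseline lags behind.
track = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0)]
preds = moving_average_forecast(track, window=2)
error = rmse(preds, track[2:])
```

Any other predictor that returns a list of points of the same length can be scored with the same `rmse` function, which is how a comparison between models would be set up.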


2018 ◽  
Author(s):  
Yu-Wei Lin ◽  
Yuqian Zhou ◽  
Faraz Faghri ◽  
Michael J. Shaw ◽  
Roy H. Campbell

Background: Unplanned readmission of a hospitalized patient is an extremely undesirable outcome, as the patient may have been exposed to additional risks. The rates of unplanned readmission are, therefore, regarded as an important performance indicator for the medical quality of a hospital and healthcare system. Identifying high-risk patients likely to suffer from readmission before release benefits both the patients and the medical providers. The emergence of machine learning to detect hidden patterns in complex, multi-dimensional datasets provides unparalleled opportunities to develop an efficient discharge decision-making support system for physicians. Methods and Findings: We used supervised machine learning approaches for ICU readmission prediction, applying them to comprehensive, longitudinal clinical data from MIMIC-III to predict the ICU readmission of patients within 30 days of their discharge. We utilized recent machine learning techniques such as Recurrent Neural Networks (RNN) with Long Short-Term Memory (LSTM), which allowed us to incorporate the multivariate features of EHRs and capture sudden fluctuations in chart event features (e.g. glucose and heart rate) that are significant in time series with temporal dependencies. Traditional static models cannot properly capture these fluctuations, but our proposed deep neural network based model can. We incorporate multiple types of features, including chart events, demographics, and ICD9 embeddings. Our machine learning models identify ICU readmissions at a higher sensitivity rate (0.742) and an improved Area Under the Curve (0.791) compared with traditional methods. We also illustrate the importance of each portion of the features and of different combinations of the models to verify the effectiveness of the proposed model. Conclusion: Our manuscript highlights the ability of machine learning models to improve ICU decision-making accuracy, and is a real-world example of precision medicine in hospitals. These data-driven results enable clinicians to make assisted decisions within their patient cohorts. This knowledge could have immediate implications for hospitals by improving the detection of possible readmission. We anticipate that machine learning models will improve patient counseling, hospital administration, allocation of healthcare resources and ultimately individualized clinical care.
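For readers unfamiliar with the two metrics reported above, the following sketch shows how sensitivity and the area under the ROC curve can be computed for binary labels. The labels, scores, and the 0.5 threshold are made up for illustration and have no connection to the MIMIC-III results; the AUC uses the pairwise-ranking formulation, with ties counted as half.

```python
from typing import Sequence


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive labels")
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly by score."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = readmitted) and model scores.
y_true = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens = sensitivity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity depends on the chosen threshold, while the AUC summarizes ranking quality across all thresholds, which is why the two are often reported together.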


Photonics ◽  
2021 ◽  
Vol 8 (12) ◽  
pp. 535
Author(s):  
Thomas Adler ◽  
Manuel Erhard ◽  
Mario Krenn ◽  
Johannes Brandstetter ◽  
Johannes Kofler ◽  
...  

We demonstrate how machine learning is able to model experiments in quantum physics. Quantum entanglement is a cornerstone for upcoming quantum technologies, such as quantum computation and quantum cryptography. Of particular interest are complex quantum states with more than two particles and a large number of entangled quantum levels. Given such a multiparticle high-dimensional quantum state, it is usually impossible to reconstruct an experimental setup that produces it. To search for interesting experiments, one thus has to randomly create millions of setups on a computer and calculate the respective output states. In this work, we show that machine learning models can provide significant improvement over random search. We demonstrate that a long short-term memory (LSTM) neural network can successfully learn to model quantum experiments by correctly predicting output state characteristics for given setups without the necessity of computing the states themselves. This approach not only allows for faster search, but is also an essential step towards the automated design of multiparticle high-dimensional quantum experiments using generative machine learning models.


Aerospace ◽  
2021 ◽  
Vol 8 (9) ◽  
pp. 236
Author(s):  
Junghyun Kim ◽  
Kyuman Lee

Obtaining reliable wind information is critical for efficiently managing air traffic and airport operations. Wind forecasting has been considered one of the most challenging tasks in the aviation industry. Recently, with the advent of artificial intelligence, many machine learning techniques have been widely used to address a variety of complex phenomena in wind predictions. In this paper, we propose a hybrid framework that combines a machine learning model with Kalman filtering for a wind nowcasting problem in the aviation industry. More specifically, this study has three objectives as follows: (1) compare the performance of the machine learning models (i.e., Gaussian process, multi-layer perceptron, and long short-term memory (LSTM) network) to identify the most appropriate model for wind predictions, (2) combine the machine learning model selected in step (1) with an unscented Kalman filter (UKF) to improve the fidelity of the model, and (3) perform Monte Carlo simulations to quantify uncertainties arising from the modeling process. Results show that short-term time-series wind datasets are best predicted by the LSTM network compared to the other machine learning models and the UKF-aided LSTM (UKF-LSTM) approach outperforms the LSTM network only, especially when long-term wind forecasting needs to be considered.


2021 ◽  
pp. 016555152110077
Author(s):  
Şura Genç ◽  
Elif Surer

Clickbait is a strategy that aims to attract people’s attention and direct them to specific content. Clickbait titles, created by the information that is not included in the main content or using intriguing expressions with various text-related features, have become very popular, especially in social media. This study expands the Turkish clickbait dataset that we had constructed for clickbait detection in our proof-of-concept study, written in Turkish. We achieve a 48,060 sample size by adding 8859 tweets and release a publicly available dataset – ClickbaitTR – with its open-source data analysis library. We apply machine learning algorithms such as Artificial Neural Network (ANN), Logistic Regression, Random Forest, Long Short-Term Memory Network (LSTM), Bidirectional Long Short-Term Memory (BiLSTM) and Ensemble Classifier on 48,060 news headlines extracted from Twitter. The results show that the Logistic Regression algorithm has 85% accuracy; the Random Forest algorithm has a performance of 86% accuracy; the LSTM has 93% accuracy; the ANN has 93% accuracy; the Ensemble Classifier has 93% accuracy; and finally, the BiLSTM has 97% accuracy. A thorough discussion is provided for the psychological aspects of clickbait strategy focusing on curiosity and interest arousal. In addition to a successful clickbait detection performance and the detailed analysis of clickbait sentences in terms of language and psychological aspects, this study also contributes to clickbait detection studies with the largest clickbait dataset in Turkish.


Author(s):  
Salma P. Z ◽  
Maya Mohan

Email is one of today's most important means of communication, but its extensive use has led to many problems, with spam being the most serious among them. It is one of the major issues in today's internet world. Spam emails mostly contain advertisements and offensive content, are often sent without the recipient's request, and are generally annoying, time-consuming, and wasteful of the communication medium's resources. They cause inconvenience and financial loss to the recipients. Hence, there is always a need to filter spam emails and separate them from legitimate emails. Many content-based machine learning techniques have proven effective in detecting and filtering spam emails. Due to a large increase in email spamming, emails are studied and classified as spam or not spam. In this chapter, three machine learning models, Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BLSTM), are used to classify emails as spam or benign.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012055
Author(s):  
H O Lekshmy ◽  
Dhanyalaxmi Panickar ◽  
Sandhya Harikumar

Epilepsy is a common neurological disease that affects more than 2 percent of the population globally. An imbalance in the brain's electrical activity causes unpredictable seizures, which eventually leads to epilepsy. Neurostimulators can intervene in advance and prevent seizures from occurring. Their efficiency can be increased with the help of heuristics such as advance seizure prediction. Early identification of the preictal state helps activate the neurostimulator in time. This research concentrates on the performance analysis of various machine learning algorithms on recorded EEG data. Through this study, we aim to find the best models, which can be used to create an ensemble model for better learning. This involves modeling and simulation of classical machine learning techniques such as Logistic Regression, Naive Bayes, K-Nearest Neighbors, and Random Forest, and deep learning techniques such as Artificial Neural Networks, Convolutional Neural Networks, Long Short-Term Memory, and Autoencoders. In this analysis, Random Forest and Long Short-Term Memory performed well among all models in terms of sensitivity and specificity.

