Metaheuristic Ensemble Pruning via Greedy-based Optimization Selection

2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Ensemble selection (ES) is a crucial problem in ensemble learning (EL): it speeds up the predictive model, reduces storage requirements, and can further improve prediction accuracy. Diversity among individual predictors is widely recognized as a key factor in successful ES, while the ultimate goal of ES is to improve the predictive accuracy and generalization of the ensemble. Motivated by these problems, we have devised a novel hybrid layered based greedy ensemble reduction (HLGER) architecture that deletes the predictors with the lowest accuracy and diversity, using an evaluation function based on diversity metrics. Experiments are conducted on benchmark time series data sets, with the support vector regression algorithm as the base learner for generating a homogeneous ensemble; HLGER uses locally weighted ensemble (LWE) strategies to provide the final ensemble prediction. The experimental results demonstrate that, in comparison with benchmark ensemble pruning techniques, HLGER achieves significantly superior generalization performance.
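The greedy reduction idea in this abstract can be illustrated with a minimal sketch. It is not the HLGER architecture itself: the scoring rule (negative mean absolute error plus a weighted mean pairwise disagreement), the `alpha` weight, the plain averaging used in place of LWE, and all function names are assumptions made for the example.

```python
from statistics import mean


def mae(pred, target):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(p - t) for p, t in zip(pred, target))


def diversity(i, preds, members):
    """Mean absolute disagreement between member i and the other members."""
    others = [j for j in members if j != i]
    if not others:
        return 0.0
    return mean(mae(preds[i], preds[j]) for j in others)


def greedy_prune(preds, target, keep, alpha=0.5):
    """Backward elimination: repeatedly drop the member with the lowest score.

    score = -error + alpha * diversity (hypothetical evaluation function).
    """
    members = list(range(len(preds)))
    while len(members) > keep:
        scores = {
            i: -mae(preds[i], target) + alpha * diversity(i, preds, members)
            for i in members
        }
        worst = min(members, key=lambda i: scores[i])
        members.remove(worst)
    return members


def ensemble_average(preds, members):
    """Unweighted average of the retained members (stand-in for LWE)."""
    return [mean(preds[i][t] for i in members) for t in range(len(preds[0]))]


# Toy validation data: three base predictors for a four-step series.
target = [1.0, 2.0, 3.0, 4.0]
preds = [
    [1.1, 2.1, 2.9, 4.0],  # accurate
    [0.9, 1.8, 3.2, 4.1],  # accurate, errs in a different direction
    [3.0, 4.0, 5.0, 6.0],  # consistently off by 2
]
kept = greedy_prune(preds, target, keep=2, alpha=0.1)
combined = ensemble_average(preds, kept)
```

In this toy case the third predictor is removed because its large error outweighs its diversity bonus; a larger `alpha` would favor diverse members more strongly.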

2020 ◽  
Author(s):  
Hsiao-Ko Chang ◽  
Hui-Chih Wang ◽  
Chih-Fen Huang ◽  
Feipei Lai

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA). OBJECTIVE We seek to develop a pharmaceutical early warning model to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion. METHODS We propose a new early warning score model for detecting cardiac arrest via pharmaceutical classification and a sliding window; we apply learning-based algorithms to time-series data to build a Pharmaceutical Early Warning Scoring Model (PEWSM). By treating pharmaceutical features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model, we use the area under the receiver operating characteristic curve (AUROC). RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits, and replenishers and regulators of water and electrolytes. The best AUROC of bits is 85%; that of replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves the accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for the drugs significantly affects prediction. (3) We use a better model: In addition to traditional machine learning, this study adds a new AI technology: the long short-term memory (LSTM) model, which has the best time-series accuracy and is comparable to the traditional random forest (RF) model; the two AUROC measures are 85%. 
(4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event. CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, LSTM yields better performance with time-series data. The proposed PEWSM, which offers 4-hour predictions, is better than the National Early Warning Score (NEWS) in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.
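The sliding-window step described in this abstract can be sketched as follows. This is an illustration only: hourly sampling, the window width, the record layout (heart rate plus a drug-given flag), and the function names are assumptions, not details taken from the paper.

```python
def sliding_windows(series, width, step=1):
    """Return every contiguous window of `width` samples, moving by `step`."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def window_before_event(series, event_index, width, lead):
    """Return the window that ends `lead` samples before the event.

    Returns None when the record is too short to supply a full window.
    """
    end = event_index - lead
    start = end - width
    if start < 0:
        return None
    return series[start:end]


# Hypothetical hourly record: (heart_rate, drug_given_flag)
record = [(80, 0), (82, 0), (85, 1), (90, 1), (97, 1), (110, 1), (125, 1), (140, 1)]

# All 3-hour windows, e.g. as training inputs for a classifier.
windows = sliding_windows(record, width=3)

# The 3-hour window that ends 4 hours before an event at index 7.
early = window_before_event(record, event_index=7, width=3, lead=4)
```

Each window would then be flattened or fed as a sequence to a classifier such as RF or LSTM; that step is omitted here.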


2020 ◽  
Author(s):  
Hsiao-Ko Chang ◽  
Hui-Chih Wang ◽  
Chih-Fen Huang ◽  
Feipei Lai

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA). OBJECTIVE We seek to develop a Drug Early Warning System Model (DEWSM) that includes drug injections and vital signs as its key features. We use it to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion. METHODS We propose this new model for detecting cardiac arrest via drug classification and a sliding window; we apply learning-based algorithms to time-series data to build the DEWSM. By treating drug features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model, we use the area under the receiver operating characteristic curve (AUROC). RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits (intravenous therapy), and replenishers and regulators of water and electrolytes (fluid and electrolyte supplement). The best AUROC of bits is 85%; that is, with the drug feature suggested by the medical experts (bits), which affects the vital signs, the model’s AUROC for classifying CPR patients reaches 85%. That of replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves the accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for the drugs significantly affects prediction. 
(3) We use a better model: In addition to traditional machine learning, this study adds a new AI technology: the long short-term memory (LSTM) model, which has the best time-series accuracy and is comparable to the traditional random forest (RF) model; the two AUROC measures are 85%. The new AI technology currently achieves accuracy comparable to that of the commonly used traditional RF, and the LSTM model can be tuned in the future to obtain better results. (4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event. CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. The National Early Warning Score (NEWS) focuses only on the score of vital signs and does not include factors related to drug injections. In this study, the experimental results obtained by adding drug injections are better than those obtained with vital signs alone. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, we use traditional machine learning methods and deep learning (with LSTM as the main method for processing time-series data) as the basis for comparison in this research. The proposed DEWSM, which offers 4-hour predictions, is better than the NEWS in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.
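Since the results above are reported as AUROC, a small sketch of how that metric can be computed may help. It uses the pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as half. The labels and scores below are made-up values, not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for event (e.g. CPR), 0 for no event.
    scores: model outputs where higher means more likely to be an event.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores for six patients.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
value = auroc(labels, scores)  # 8 of the 9 positive/negative pairs are ordered correctly
```

The quadratic loop is fine for illustration; a rank-based implementation would be preferable for large cohorts.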


2017 ◽  
Author(s):  
Anthony Szedlak ◽  
Spencer Sims ◽  
Nicholas Smith ◽  
Giovanni Paternostro ◽  
Carlo Piermarocchi

ABSTRACT Modern time series gene expression and other omics data sets have enabled unprecedented resolution of the dynamics of cellular processes such as cell cycle and response to pharmaceutical compounds. In anticipation of the proliferation of time series data sets in the near future, we use the Hopfield model, a recurrent neural network based on spin glasses, to model the dynamics of cell cycle in HeLa (human cervical cancer) and S. cerevisiae cells. We study some of the rich dynamical properties of these cyclic Hopfield systems, including the ability of populations of simulated cells to recreate experimental expression data and the effects of noise on the dynamics. Next, we use a genetic algorithm to identify sets of genes which, when selectively inhibited by local external fields representing gene silencing compounds such as kinase inhibitors, disrupt the encoded cell cycle. We find, for example, that inhibiting the set of four kinases BRD4, MAPK1, NEK7, and YES1 in HeLa cells causes simulated cells to accumulate in the M phase. Finally, we suggest possible improvements and extensions to our model. AUTHOR SUMMARY Cell cycle – the process in which a parent cell replicates its DNA and divides into two daughter cells – is an upregulated process in many forms of cancer. Identifying gene inhibition targets to regulate cell cycle is important to the development of effective therapies. Although modern high throughput techniques offer unprecedented resolution of the molecular details of biological processes like cell cycle, analyzing the vast quantities of the resulting experimental data and extracting actionable information remains a formidable task. Here, we create a dynamical model of the process of cell cycle using the Hopfield model (a type of recurrent neural network) and gene expression data from human cervical cancer cells and yeast cells. We find that the model recreates the oscillations observed in experimental data. 
Tuning the level of noise (representing the inherent randomness in gene expression and regulation) to the “edge of chaos” is crucial for the proper behavior of the system. We then use this model to identify potential gene targets for disrupting the process of cell cycle. This method could be applied to other time series data sets and used to predict the effects of untested targeted perturbations.
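To make the idea of a cyclic Hopfield system concrete, here is a minimal sketch. It uses a textbook asymmetric Hebbian construction in which each stored pattern is coupled to the next one in the cycle; whether this matches the authors' exact couplings is not stated in the abstract. The three 8-unit patterns, the noise-free synchronous update, and the `field` argument standing in for an inhibiting external field are all assumptions for illustration.

```python
def sign(x):
    """Binarize an input to +1 or -1 (zero maps to +1)."""
    return 1 if x >= 0 else -1


def cyclic_couplings(patterns):
    """Asymmetric Hebbian couplings mapping each pattern to its successor."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for mu, cur in enumerate(patterns):
        nxt = patterns[(mu + 1) % len(patterns)]
        for i in range(n):
            for j in range(n):
                w[i][j] += nxt[i] * cur[j]
    return w


def step(w, state, field=None):
    """One synchronous, noise-free update; `field` is an optional external field."""
    n = len(state)
    if field is None:
        field = [0] * n
    return [
        sign(sum(w[i][j] * state[j] for j in range(n)) + field[i])
        for i in range(n)
    ]


# Three mutually orthogonal +/-1 patterns standing in for cell-cycle phases.
p0 = [1, 1, 1, 1, -1, -1, -1, -1]
p1 = [1, 1, -1, -1, 1, 1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = cyclic_couplings([p0, p1, p2])

# Without a field the state walks through the cycle p0 -> p1 -> p2 -> p0.
trajectory = [p0]
for _ in range(3):
    trajectory.append(step(w, trajectory[-1]))

# A strong negative field on unit 0 pins it to -1, mimicking inhibition of one gene.
inhibited = step(w, p0, field=[-100, 0, 0, 0, 0, 0, 0, 0])
```

The deterministic update omits the noise term discussed in the abstract; adding it would require a stochastic update rule with a temperature-like parameter.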


Author(s):  
Pritpal Singh

Forecasting using fuzzy time series has been applied in several areas including forecasting university enrollments, sales, road accidents, financial forecasting, weather forecasting, etc. Recently, many researchers have paid attention to applying fuzzy time series to time series forecasting problems. In this paper, we present a new model to forecast the enrollments in the University of Alabama and the daily average temperature in Taipei, based on one-factor fuzzy time series. In this model, a new frequency-based clustering technique is employed for partitioning the time series data sets into different intervals. Two new principles are also incorporated in the defuzzification function of this model. For both enrollment and daily temperature forecasting, the proposed model exhibits a very small error rate.


2019 ◽  
Vol 11 (3) ◽  
pp. 933 ◽  
Author(s):  
Yanping Qian ◽  
Zhen Wu

Impervious surface area is a key factor affecting urbanization and urban environmental quality. Timely and accurate analysis of the dynamic changes of impervious surface is of great significance for urban development planning. In this study, we use a comprehensive method to extract the time series data on the impervious surface area (ISA) from multi-temporal Landsat remote sensing images with a high overall accuracy of 90%. The processes and mechanisms of urban expansion at different administrative and directional levels in the Nanjing metropolitan area are investigated by using the comprehensive classification method consisting of minimum noise fraction, linear spectral mixture analysis, spectral index, and decision tree classifiers. The expansion of Nanjing is examined by using various ISA indexes and concentric regression analyses. Results indicate that the overall classification accuracy of ISA is higher than 90%. The ISA in Nanjing has dramatically increased in the past three decades from 427.36 km² to 1780.21 km², with a high expansion rate of 0.48 from 2000 to 2005. The city sprawls from a monocentric form to an urban core with multiple subcenters in a concentric structure, and the geometric gravity center of construction land moves southward annually. The stages of urbanization at different district levels and the dynamic changes at different direction levels are influenced by topographic and economic factors.


SAGE Open ◽  
2020 ◽  
Vol 10 (2) ◽  
pp. 215824402091827
Author(s):  
Oluwabunmi O. Adejumo

In the school of development thought, growth has been identified as a viable alternative to the challenge of poverty and economic backwardness. However, ecologists have continuously challenged the growth position in relation to environmental degradation and depletion. Against this background, this study examined the limits to growth in Nigeria beyond which there will be inimical consequences for the environment. The study employed time series data spanning 1970 to 2014. These data sets were sourced from the World Development Indicators. Based on the assimilation model, threshold estimates were used to identify optimal growth regions, whereas regression estimates were used to measure growth effects. It was discovered that below the identified growth limit, there are currently significant negative impacts on the quality of the environment in Nigeria via economic growth. This study is a single-country case, that is, Nigeria; hence, the study can be expanded to include other sub-Saharan African countries. The study adds to knowledge by establishing the prospects for sustainability in the quality of the environment in the long run; therefore, policies designed in this area have a higher likelihood of attaining sustainability.


2020 ◽  
Vol 496 (1) ◽  
pp. 629-637
Author(s):  
Ce Yu ◽  
Kun Li ◽  
Shanjiang Tang ◽  
Chao Sun ◽  
Bin Ma ◽  
...  

ABSTRACT Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely hard and infeasible for continuously analysing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series data sets. To support the high-performance parallel processing of large-scale data sets, AstroCatR uses the extract-transform-load (ETL) pre-processing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed into other formats as needed. Simultaneously, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3× faster than methods using relational database management systems at matching massive catalogues.
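The matching step described in this abstract can be illustrated with a small zone-based cross-match. This sketch is not AstroCatR's implementation: splitting the sky into declination zones, searching the neighbouring zones as a stand-in for overlapped indexing, the tuple layouts, and every name below are assumptions. It also assumes the match radius does not exceed the zone height and ignores right-ascension wrap-around within a zone, since each zone is scanned in full.

```python
import math
from collections import defaultdict


def separation_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two sky positions (haversine)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def zone_of(dec, zone_height):
    """Index of the declination zone containing `dec` (degrees)."""
    return int(math.floor((dec + 90.0) / zone_height))


def build_zone_index(reference, zone_height):
    """In-memory reference table: zone index -> list of (id, ra, dec)."""
    index = defaultdict(list)
    for obj_id, (ra, dec) in reference.items():
        index[zone_of(dec, zone_height)].append((obj_id, ra, dec))
    return index


def build_time_series(reference, detections, radius_deg, zone_height=1.0):
    """Assign each detection to its nearest reference object within the radius."""
    index = build_zone_index(reference, zone_height)
    series = defaultdict(list)
    for ra, dec, time, mag in detections:
        zone = zone_of(dec, zone_height)
        best_id, best_sep = None, radius_deg
        # Check neighbouring zones too, so objects near a boundary are not missed.
        for z in (zone - 1, zone, zone + 1):
            for obj_id, ref_ra, ref_dec in index.get(z, []):
                sep = separation_deg(ra, dec, ref_ra, ref_dec)
                if sep <= best_sep:
                    best_id, best_sep = obj_id, sep
        if best_id is not None:
            series[best_id].append((time, mag))
    for points in series.values():
        points.sort()  # order each light curve by observation time
    return dict(series)


# Hypothetical reference objects: id -> (ra, dec) in degrees.
reference = {"A": (10.0, 0.9999), "B": (20.0, -30.0)}
# Hypothetical detections: (ra, dec, time, magnitude).
detections = [
    (10.0, 1.0001, 2.0, 15.2),  # falls in the zone next to object A
    (10.0, 0.9999, 1.0, 15.0),
    (20.0, -30.0, 1.0, 12.0),
    (50.0, 50.0, 1.0, 9.0),     # no reference object nearby
]
light_curves = build_time_series(reference, detections, radius_deg=0.001)
```

Because each detection only touches three zones, the zones can be processed independently, which is the property that makes this layout convenient for parallel work.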


2020 ◽  
Vol 7 (11) ◽  
pp. 467-484
Author(s):  
Sunday Osahon Igbinedion

Extant economic literature has acknowledged monetary policy as a key factor influencing infrastructural growth through different channels, such as affordable housing and efficient transportation, among others. However, in recent times, Nigeria’s experience suggests a conflicting position on the above supposition. It is against this backdrop that this study set out to investigate the nexus between monetary policy and infrastructural growth within the Nigerian context, using time series data from 1981 to 2018 and the Fully Modified Least Squares (FMOLS) estimation technique. The results show that both real interest rate and inflation rate exerted a negative and statistically significant impact on infrastructural growth, while federal government capital expenditure and net official development assistance impacted positively on the level of infrastructural growth in the period under assessment. In light of these findings, the study recommends that the monetary authority carefully review the existing lending interest rate downward to a single digit that will be investment driven, particularly in the face of current global economic uncertainties occasioned by the COVID-19 pandemic, which has led to the collapse of many economies across the world.

