IMPROVING PERFORMANCE OF INDUCTIVE MODELS THROUGH AN ALGORITHM AND SAMPLE COMBINATION STRATEGY

2001 ◽  
Vol 10 (04) ◽  
pp. 555-572
Author(s):  
HALEH VAFAIE ◽  
DEAN ABBOTT ◽  
MARK HUTCHINS ◽  
I. PHILIP MATKOVSKY

Multiple approaches have been developed for improving the predictive performance of a system by creating and combining various learned models. There are two main approaches to creating model ensembles. The first creates a set of learned models by applying one algorithm repeatedly to different training samples; the second applies various learning algorithms to the same sample data. The predictions of the models are then combined according to a voting scheme. This paper presents a method for combining models that were developed using numerous samples, modeling algorithms, and modelers, and compares it with the alternative approaches. The presented results are based on findings from an ongoing operational data mining initiative and concern the selection, from among trained models, of the model set that is best able to meet defined goals. The operational goal of this initiative is to deploy one or more data mining models that maximize specificity with minimal negative impact on sensitivity. The results of the model combination methods are evaluated with respect to sensitivity and false alarm rates and are then compared against other approaches.
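As a rough illustration of the voting step and the two evaluation measures named in this abstract, the following minimal sketch combines binary predictions by simple majority vote and computes sensitivity and false alarm rate (the false alarm rate equals one minus specificity). The abstract does not specify the voting scheme, so majority voting, the tie-breaking rule, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine 0/1 predictions (one list per model) by simple majority.

    Ties are resolved toward the negative class (0); this rule is an
    assumption, not something stated in the abstract.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        combined.append(1 if counts[1] > counts[0] else 0)
    return combined


def sensitivity_and_false_alarm(y_true, y_pred):
    """Return (sensitivity, false alarm rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return sensitivity, false_alarm


# Hypothetical labels and predictions from three hypothetical models.
y_true = [1, 1, 0, 0, 1, 0]
model_a = [1, 0, 0, 1, 1, 0]
model_b = [1, 1, 0, 0, 0, 0]
model_c = [0, 1, 1, 0, 1, 0]

ensemble = majority_vote([model_a, model_b, model_c])
print("ensemble:", ensemble)
print("model_a :", sensitivity_and_false_alarm(y_true, model_a))
print("ensemble:", sensitivity_and_false_alarm(y_true, ensemble))
```

In this toy data each model makes errors on different cases, so the vote corrects them; real ensembles gain less when member errors are correlated.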

2017 ◽  
Vol 2017 ◽  
pp. 1-11
Author(s):  
Changhao Fan ◽  
Xuefeng Yan

In modeling, only the deviation between the output of the support vector regression (SVR) model and the training sample is considered, whereas other prior information about the training sample, such as probability distribution information, is ignored. Probability distribution information describes the overall distribution of the data in a training sample that contains different degrees of noise and potential outliers, and it helps in developing a high-accuracy model. To mine and use the probability distribution information of a training sample, a new SVR model weighted by probability distribution information (PDISVR) is proposed. In the PDISVR model, the probability distribution of each sample is treated as a weight and is introduced into the error coefficient and slack variables of SVR. Thus, both the deviation and the probability distribution information of the training sample are used in the PDISVR model to reduce the influence of noise and outliers in the training sample and to improve predictive performance. Examples with different degrees of noise were used to demonstrate the performance of PDISVR, which was then compared with that of three other SVR-based methods. The results showed that PDISVR performs better than the three other methods.
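The weighting idea can be sketched loosely as follows: estimate a density for each training sample, scale it into a weight, and let that weight multiply the sample's epsilon-insensitive error so that low-density points (likely outliers) contribute less. This is not the PDISVR formulation itself, which the abstract does not give. The Gaussian kernel density estimate over target values, the bandwidth, the scaling to a maximum of 1, and all names below are assumptions.

```python
import math


def density_weights(values, bandwidth):
    """Gaussian kernel density at each value, scaled so the peak is 1.0.

    Estimating the density over the target values is an assumption made
    for this sketch.
    """
    n = len(values)
    norm = n * bandwidth * math.sqrt(2.0 * math.pi)
    densities = [
        sum(math.exp(-0.5 * ((v - u) / bandwidth) ** 2) for u in values) / norm
        for v in values
    ]
    peak = max(densities)
    return [d / peak for d in densities]


def weighted_eps_insensitive_loss(y_true, y_pred, weights, epsilon):
    """Sum of per-sample epsilon-insensitive errors, each scaled by its weight."""
    return sum(
        w * max(0.0, abs(t - p) - epsilon)
        for t, p, w in zip(y_true, y_pred, weights)
    )


# Hypothetical targets: four clustered values and one outlier.
targets = [1.0, 1.1, 0.9, 1.05, 5.0]
predictions = [1.0] * len(targets)

weights = density_weights(targets, bandwidth=0.5)
uniform = [1.0] * len(targets)

plain_loss = weighted_eps_insensitive_loss(targets, predictions, uniform, 0.1)
weighted_loss = weighted_eps_insensitive_loss(targets, predictions, weights, 0.1)
print("weights:", [round(w, 3) for w in weights])
print("unweighted loss:", plain_loss, "weighted loss:", weighted_loss)
```

The outlier at 5.0 receives the smallest weight, so its large residual counts for less in the weighted loss than in the unweighted one.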


2021 ◽  
Author(s):  
Saloua Balhane ◽  
Fatima Driouech ◽  
Omar Chafki ◽  
Rodrigo Manzanas ◽  
Abdelghani Chehbouni ◽  
...  

Internal variability, multiple emission scenarios, and different model responses to anthropogenic forcing are ultimately behind a wide range of uncertainties that arise in climate change projections. Model weighting approaches are generally used to reduce the uncertainty related to the choice of the climate model. This study compares three multi-model combination approaches: a simple arithmetic mean and two recently developed weighting-based alternatives. One method takes into account models’ performance only and the other accounts for models’ performance and independence. The effect of these three multi-model approaches is assessed for projected changes of mean precipitation and temperature as well as four extreme indices over northern Morocco. We analyze different widely used high-resolution ensembles issued from statistical (NEXGDDP) and dynamical (Euro-CORDEX and bias-adjusted Euro-CORDEX) downscaling. For the latter, we also investigate the potential added value that bias adjustment may have over the raw dynamical simulations. Results show that model weighting can significantly reduce the spread of the future projections, increasing their reliability. Nearly all model ensembles project a significant warming over the studied region (more intense inland than near the coasts), together with longer and more severe dry periods. In most cases, the different weighting methods lead to almost identical spatial patterns of climate change, indicating that the uncertainty due to the choice of multi-model combination strategy is nearly negligible.
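To make the three combination strategies concrete, here is a minimal sketch of an arithmetic mean, a performance-only weighting, and a performance-and-independence weighting. The abstract does not name the weighting schemes, so the exponential form used here (performance term divided by a redundancy term), the shape parameters `sigma_d` and `sigma_s`, and every number in the example are assumptions chosen only for illustration.

```python
import math


def normalize(raw):
    """Scale non-negative weights so they sum to 1."""
    total = sum(raw)
    return [r / total for r in raw]


def performance_weights(distances, sigma_d):
    """Weight each model by its distance to observations (smaller is better)."""
    return normalize([math.exp(-((d / sigma_d) ** 2)) for d in distances])


def performance_independence_weights(distances, similarity, sigma_d, sigma_s):
    """Also down-weight models that lie close to other models.

    similarity[i][j] is an inter-model distance; small values mean that
    models i and j are near-duplicates.
    """
    raw = []
    for i, d in enumerate(distances):
        performance = math.exp(-((d / sigma_d) ** 2))
        redundancy = 1.0 + sum(
            math.exp(-((s / sigma_s) ** 2))
            for j, s in enumerate(similarity[i])
            if j != i
        )
        raw.append(performance / redundancy)
    return normalize(raw)


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical projected warming (degrees C) from three hypothetical models.
warming = [2.0, 2.1, 3.5]
distances = [0.5, 0.5, 0.5]        # equal performance against observations
similarity = [
    [0.0, 0.1, 2.0],               # models 0 and 1 are near-duplicates
    [0.1, 0.0, 2.0],
    [2.0, 2.0, 0.0],
]

w_perf = performance_weights(distances, sigma_d=0.5)
w_both = performance_independence_weights(distances, similarity, 0.5, 0.5)

print("arithmetic mean      :", sum(warming) / len(warming))
print("performance only     :", weighted_mean(warming, w_perf))
print("performance + indep. :", weighted_mean(warming, w_both))
```

With equal performance scores the performance-only result matches the arithmetic mean, while the independence term shifts weight toward the model that is not duplicated.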


2015 ◽  
Vol 1 (4) ◽  
pp. 270
Author(s):  
Muhammad Syukri Mustafa ◽  
I. Wayan Simpen

This study aims to predict whether new students are likely to complete their studies on time, using data mining to explore accumulated historical data with the K-Nearest Neighbor (KNN) algorithm. The resulting application uses a range of attributes, including national examination (Ujian Nasional, UN) scores, school or region of origin, gender, parents' occupation and income, number of siblings, and others. KNN analysis then predicts, based on the proximity of existing historical records to a new record, whether a student is likely to complete their studies on time. In tests that applied the KNN algorithm with alumni data from graduation years 2004 through 2010 as the old cases and alumni data from graduation year 2011 as the new cases, the accuracy was 83.36%.
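The proximity-based prediction described here can be sketched as a plain KNN classifier: find the k historical records nearest to a new record and take the most common label among them. The two numeric features, their scaling to the range 0 to 1, the value of k, the Euclidean distance, and the records themselves are hypothetical; the abstract does not say how the attributes were encoded.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical old cases: (scaled exam score, scaled number of siblings).
old_cases = [
    (0.90, 0.20), (0.80, 0.10), (0.85, 0.30),
    (0.40, 0.70), (0.30, 0.80), (0.35, 0.60),
]
outcomes = ["on_time", "on_time", "on_time", "late", "late", "late"]

# Hypothetical new cases to classify.
print(knn_predict(old_cases, outcomes, (0.82, 0.25)))
print(knn_predict(old_cases, outcomes, (0.33, 0.75)))
```

Categorical attributes such as gender or region would need a numeric encoding and comparable scaling before a distance like this is meaningful.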


2014 ◽  
Vol 543-547 ◽  
pp. 4698-4701
Author(s):  
Juan Wang

Processing aircraft parts and other high-precision machinery workpieces with traditional machining methods incurs considerable machining costs and requires a long processing cycle. In this context, this paper designs an intelligent robotic processing system for high-precision machinery. The system achieves intelligent online control of the machining process by using intelligent online monitoring technology for high-precision machining together with numerical simulation prediction technology. Finally, the system is applied to data mining for volleyball games: a partial differential variational data mining model is designed, which mines the key parameter data of a volleyball game service system and provides reliable parameters and technical support for the training of volleyball players.

