Trip Generation Modeling Using CHAID, a Criterion-Based Segmentation Modeling Tool

Author(s):  
Orlando Strambi ◽  
Karin-Anne Van De Bilt

Conventional trip generation models are identified, as are the difficulties of model application typical of segmentation problems: identifying and categorizing the explanatory variables and the interactions among them. The use of CHAID (Chi-Squared Automatic Interaction Detection), a criterion-based segmentation modeling tool, is explored to analyze household trip generation rates. CHAID models are presented in the form of a tree, with each final node representing a group of households that is homogeneous with respect to daily trip making. An application to data from an origin-destination survey for São Paulo produced results that agree with theoretical expectations and can be interpreted in terms of the likely activity-travel patterns of each group of households generated by the technique. CHAID can be used as an exploratory technique for aiding model development or as a model by itself. The use of CHAID results as a trip generation model was verified through an evaluation of its predictive capability in a cross comparison of two subsamples and through a comparison of observed versus predicted trips at the zone level; the segmentation of households produced by the technique provided good estimates of trip rates and zone totals. Applying a modeling approach that requires a highly disaggregate projection of the population may become possible given the advances in methods for generating synthetic populations. The use of these methods in conjunction with a segmentation model represents an alternative to conventional trip generation models and an opportunity to introduce new population forecasting techniques into transportation planning practice.
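Using a household segmentation as a trip generation model amounts to applying each final node's mean trip rate to the number of households of that segment in a zone. The following is a minimal Python sketch of that step only. The segment labels, trip rates, and household counts are hypothetical and are not taken from the São Paulo survey.

```python
# Hypothetical mean daily trip rates per household for each final node
# of a segmentation tree (illustrative values, not from the study).
SEGMENT_TRIP_RATES = {
    "no_car_small_household": 3.0,
    "no_car_large_household": 6.5,
    "car_owning_household": 9.0,
}


def predict_zone_trips(households_by_segment, trip_rates=SEGMENT_TRIP_RATES):
    """Sum over segments of (households in segment) x (segment trip rate)."""
    return sum(
        count * trip_rates[segment]
        for segment, count in households_by_segment.items()
    )


# Hypothetical zone: number of households falling in each segment.
zone = {
    "no_car_small_household": 100,
    "no_car_large_household": 40,
    "car_owning_household": 60,
}
print(predict_zone_trips(zone))  # 100*3.0 + 40*6.5 + 60*9.0 = 1100.0
```

Comparing such predicted totals with the observed zone totals is one way to carry out the zone-level check the abstract mentions.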

2020 ◽  
Vol 2019 (1) ◽  
pp. 357-367
Author(s):  
Isti Samrotul Hidayati ◽  
I Made Arcana

Chi-squared Automatic Interaction Detection (CHAID) is a segmentation method based on the relationship between response and explanatory variables, assessed with the chi-square test. In application, attention must be paid to the balance of the data in order to minimize classification errors. One approach that can be used for imbalanced data is the Synthetic Minority Over-sampling Technique (SMOTE). In this study, CHAID with the SMOTE approach is applied to the under-five mortality rate (Angka Kematian Balita, AKBa) in Eastern Indonesia (Kawasan Timur Indonesia, KTI). The aim is to identify the variables that characterize under-five mortality using CHAID analysis and to compare the result with that of the SMOTE approach. The comparison shows that the SMOTE approach performs better, with a sensitivity of 48.3% and a precision of 75.9%. The variables that significantly characterize under-five mortality in KTI are birth weight, type of birth, the mother's employment status, and household wealth; the main characteristics are children with low birth weight and children born as twins.
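The chi-square criterion at the core of CHAID can be illustrated with a small Python sketch. It computes only the Pearson chi-square statistic for each candidate predictor against a binary response and picks the largest. This is a simplification: CHAID as usually described also merges similar categories and compares Bonferroni-adjusted p-values, which are omitted here. The counts below are hypothetical, and the sketch assumes no row or column total is zero.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and response.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are predictor categories,
# columns are (child survived, child died).
candidates = {
    "birth_weight": [[90, 10], [60, 40]],
    "mother_employed": [[76, 24], [74, 26]],
}

# Both tables are 2x2 (one degree of freedom), so ranking by the
# statistic gives the same order as ranking by p-value.
best = max(candidates, key=lambda name: chi_square_statistic(candidates[name]))
print(best)  # birth_weight
```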


2020 ◽  
Vol 2020 ◽  
pp. 1-18 ◽  
Author(s):  
Guozhu Cheng ◽  
Rui Cheng ◽  
Yulong Pei ◽  
Liang Xu

To predict the probability of roadside accidents on curved highway sections, we selected eight risk factors that may contribute to such accidents, ran simulation tests, and collected a total of 12,800 data records from the PC-crash software. The chi-squared automatic interaction detection (CHAID) decision tree technique was employed to identify significant risk factors and, through the generated decision rules, to explore how different combinations of significant risk factors influence roadside accidents, so as to propose specific countermeasures as a reference for revising the Design Specification for Highway Alignment (JTG D20-2017) of China. To account for the interactions among risk factors, path analysis was applied to investigate the importance of the significant risk factors. The results showed that the significant risk factors were, in decreasing order of importance, vehicle speed, horizontal curve radius, vehicle type, adhesion coefficient, hard shoulder width, and longitudinal slope. The five most important factors were chosen as predictors in a Bayesian network analysis to establish a probability prediction model for roadside accidents. Finally, thresholds of the various factors for identifying roadside accident blackspots were derived from the probabilistic prediction results.


Author(s):  
Zeying Huang ◽  
Haijun Li ◽  
Jiazhang Huang

The nutrition facts table is a nutrition labeling tool designed to inform consumers of the nutritional content of food and to enable them to make healthier choices by comparing the nutritional values of similar foods. However, its adoption level is considerably low in China. This study employed the Chi-squared Automatic Interaction Detection (CHAID) algorithm to explore the factors associated with respondents' use of the nutrition facts table to compare the nutritional values of similar foods. Data were gathered through a nationally representative online survey of 1,500 respondents. Results suggested that consumers' comprehension of the nutrition facts table was a direct explanatory factor for its use. Usage was also indirectly explained by people's nutrition knowledge, the use of the nutrition facts table by their relatives and friends, and their focus on a healthy diet. Therefore, to increase the use of the nutrition facts table by Chinese consumers, the first consideration should be given to enhancing consumers' comprehension of the labeling.


2006 ◽  
Vol 86 (1) ◽  
pp. 191-202
Author(s):  
Ivan Ratkaj

Trip generation models aim to predict the number of trips (or the number of potential trip makers) leaving a territorial unit from the attributes of that unit. Two basic approaches are used to model trip generation: linear regression and category analysis. This article explains trip generation modeling based on linear regression analysis, using grammar schools in Belgrade as an example.
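The linear regression approach can be sketched in a few lines of Python. The example fits a one-variable least-squares line relating trips to a single attribute of the unit. The choice of attribute (pupils enrolled per school) and all data values are hypothetical illustrations, not figures from the article.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations: pupils enrolled and daily trips per school.
pupils = [200, 400, 600, 800]
trips = [250, 450, 650, 850]

intercept, slope = fit_simple_regression(pupils, trips)
# Predicted trips for a hypothetical school with 500 pupils.
print(intercept + slope * 500)
```

Category analysis, the other approach named above, would instead assign a mean trip rate to each class of unit rather than fit a line.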


2014 ◽  
Vol 4 (2) ◽  
Author(s):  
Heri Susanto ◽  
Sudiyatno Sudiyatno

Data Mining to Predict Student's Achievement Based on Socio-Economic, Motivation, Discipline and Achievement of the Past

This study aims to predict student achievement from the socio-economic status of parents, motivation, student discipline, and past achievement using a data mining method with the J48 algorithm. For comparison, the data were also analyzed with CHAID (Chi Squared Automatic Interaction Detection) and multiple regression. The research approach is quantitative. The subjects of the study were 416 first-level (grade X) students at SMK Negeri 4 Surakarta. Data were collected through documentation and questionnaires. The results showed that predictive analysis using the J48 decision tree algorithm had an accuracy of 95.7%, predictive analysis using CHAID had an accuracy of 82.1%, and multiple regression analysis resulted in a significance level of 90.6%. Based on these results, it can be concluded that the J48 method is better than the CHAID and multiple regression methods.


2020 ◽  
Vol 3 (1) ◽  
pp. 410-416
Author(s):  
Murdani Murdani ◽  
Renni Anggraini ◽  
Muhammad Isya

Johan Pahlawan is one of the subdistricts of West Aceh Regency. It is the center of community activity in the regency because it contains many government offices, schools, and trade centers, so travel tends to concentrate there. Trip generation modeling studies the relationships between the characteristics of trips and the land use environment of an area. This research aimed to model activity-based trip generation for the Caritas, Islamic Relief, and IOM housing developments in Johan Pahlawan subdistrict by identifying the factors that influence residents' trips to the workplace. The data were collected through surveys and questionnaires, and the models were estimated with multiple linear regression analysis in SPSS 21 to obtain the best trip generation model. The study covers five types of activity, two main and three additional: school activity (mandatory), work activity (mandatory), shuttling children (maintenance), shuttling for household affairs (maintenance), and social activity (maintenance). Of the variables tested, five meet the model criteria: number of family members (X1), family income (X2), age (X8), travel distance (X10), and gender (X11). The best models are: Work Activity (Y1) = 0.988 + 0.169 X1 + 0.582 X2; School Activity (Y2) = 1.684 + 0.865 X2 + 0.387 X8; Social Activity (Y3) = 0.885 + 0.564 X2 + 0.334 X10; Shuttle of Children Activity (Y4) = 1.028 + 0.902 X8 + 0.557 X11; and Shuttle Household Affairs Activity (Y5) = 2.367 + 0.931 X1 + 0.858 X2.
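The five reported regression equations can be evaluated directly, as in the Python sketch below. The coefficients are those given in the abstract. The abstract does not state the units or coding of the variables, so the input household is a set of hypothetical coded values used only to show the arithmetic.

```python
# Coefficients as reported in the abstract: (intercept, {variable: coefficient}).
MODELS = {
    "work": (0.988, {"X1": 0.169, "X2": 0.582}),
    "school": (1.684, {"X2": 0.865, "X8": 0.387}),
    "social": (0.885, {"X2": 0.564, "X10": 0.334}),
    "shuttle_children": (1.028, {"X8": 0.902, "X11": 0.557}),
    "shuttle_household": (2.367, {"X1": 0.931, "X2": 0.858}),
}


def predict_trips(activity, household):
    """Evaluate one reported linear model for a dict of variable values."""
    intercept, coefficients = MODELS[activity]
    return intercept + sum(
        coef * household[var] for var, coef in coefficients.items()
    )


# Hypothetical coded inputs; the real coding scheme is not given in the abstract.
household = {"X1": 4, "X2": 2, "X8": 1, "X10": 3, "X11": 1}
for activity in MODELS:
    print(activity, round(predict_trips(activity, household), 3))
```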


Author(s):  
Miloslava Kašparová ◽  
Jirí Krupka

This chapter deals with modeling and metamodeling of air quality in the Pardubice region of the Czech Republic. From a regional point of view, the Pardubice district is the most problematic area with regard to air pollution. The concentration of traffic, industry, and power stations (Opatovice and Chvaletice) is the cause of this situation, although emissions of all pollutants have decreased markedly within the last ten years. The decrease in air pollution was achieved particularly through the restriction and restructuring of industrial production, the use of emission standards, changes in air protection legislation, and similar measures. Air quality modeling is treated here as a classification task: the authors create classification models (classifiers) and focus on metamodeling (combining classifiers). For modeling and metamodeling, the authors use selected decision tree algorithms (C5.0, chi-squared automatic interaction detection, and classification and regression trees), which are useful explanatory techniques.


Author(s):  
Pradeep Sarvareddy ◽  
Haitham Al-Deek ◽  
Jack Klodzinski ◽  
Georgios Anagnostopoulos

A methodology for building a truck trip generation model by use of artificial neural networks from vessel freight data has been developed and successfully applied to five Florida seaports. The backpropagation neural network (BPNN) algorithm was used in the design. Although the methodology was sound, a new model had to be developed for each of these intermodal facilities. Lead and lag variables were necessary input variables for most models to account for commodities stored on port property before export or pickup after import. Other modeling techniques were researched, and a fully recurrent neural network (FRNN) trained by the real-time recurrent learning algorithm was selected to develop a model for Port Canaveral and compare it with a BPNN model. FRNN is dynamic in nature and was found to relate the storage time of the commodities to truck trip generation. A developed Port Canaveral BPNN model was successfully validated at the 95% confidence level with collected field data. It was applied to conduct a short-term forecast of the port's truck traffic for 5 years. The average annual growth of trucks based on the estimated freight activity under the BPNN model was 5.07%. The Port Canaveral FRNN model adequately estimated the current conditions but failed to forecast truck growth. The FRNN model required more data for forecasting than backpropagation. However, when more consecutive data are available for training, FRNN may produce more accurate results.
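The role of lead and lag variables can be illustrated with a short Python sketch that turns a daily freight series into model inputs. A lag value is the freight of an earlier day (for example, imports picked up some days after arrival) and a lead value is the freight of a later day (for example, exports delivered before the vessel calls). The function name, the shift lengths, and the tonnage series are hypothetical, and the abstract does not describe how the authors built their inputs.

```python
def add_lag_lead_features(series, lags=(1,), leads=(1,)):
    """Build one input row per day for which every requested shift exists."""
    max_lag = max(lags, default=0)
    max_lead = max(leads, default=0)
    rows = []
    for t in range(max_lag, len(series) - max_lead):
        row = {"t": t, "current": series[t]}
        for k in lags:
            row[f"lag_{k}"] = series[t - k]    # freight k days earlier
        for k in leads:
            row[f"lead_{k}"] = series[t + k]   # freight k days later
        rows.append(row)
    return rows


# Hypothetical daily vessel freight tonnage.
daily_tonnage = [10, 20, 30, 40, 50]
rows = add_lag_lead_features(daily_tonnage, lags=(1, 2), leads=(1,))
for row in rows:
    print(row)
```

Each resulting row could serve as the input vector for one day, with the observed truck trips of that day as the target.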

