Alighting Stop Determination of Unlinked Trips Based on a Two-Layer Stacking Framework

2021
Vol 2021
pp. 1-15
Author(s):  
Ziwei Cui
Cheng Wang
Yueer Gao
Dingkang Yang
Wei Wei
...  

Smart card data from conventional bus passengers are an important basis for many studies, such as bus network optimization. Because most cities record only boarding information, alighting stops must be inferred. The classical trip chain method can detect destinations only for passengers who have trip cycles; the remaining unlinked trips, which have no destinations, are hard to analyze. To improve the accuracy of existing methods for determining the alighting stops of unlinked trips, this work proposes a method based on a two-layer stacking framework. The first layer uses five methods: the high-frequency stop method, the stop attraction method, the transfer convenience method, the land-use type attraction method, and the improved group historical set method (I-GHSM). The last of these is introduced here to cluster records with similar behavior patterns into groups more accurately. In the second layer, a logistic regression model assigns an appropriate weight to each first-layer method for different datasets, which gives the framework its generalization ability. Using data from Xiamen BRT Line Kuai 1 as an example, the I-GHSM in the first layer proves necessary and effective. Moreover, the two-layer stacking method detects the destinations of all unlinked trips with an accuracy of 51.88%, which is higher than that of the comparison methods, namely two-step algorithms with KNN (k-nearest neighbor), Decision Tree, or Random Forest, and a step-by-step method. The results indicate that the proposed framework-based method identifies the alighting stops of all unlinked trips more accurately than these comparison methods.
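
The second-layer idea can be illustrated with a minimal sketch. It assumes that each first-layer method returns a score for every candidate alighting stop and that a logistic regression, trained here by plain gradient descent, learns one weight per method. The toy scores, the two-method layout, and the function names are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit one weight per first-layer method (plus a bias) by batch gradient descent."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b

def pick_stop(candidates, w, b):
    """candidates maps a stop name to its list of first-layer scores."""
    def probability(stop):
        return sigmoid(sum(wi * xi for wi, xi in zip(w, candidates[stop])) + b)
    return max(candidates, key=probability)

# Hypothetical training rows: [score from method 1, score from method 2]
# for one (trip, candidate stop) pair; label 1 marks the true alighting stop.
rows = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.7], [0.2, 0.3], [0.7, 0.1], [0.3, 0.9]]
labels = [1, 0, 1, 0, 1, 0]
w, b = train_logistic(rows, labels)

# Score the candidate stops of a new unlinked trip and keep the most probable one.
best = pick_stop({"stop A": [0.85, 0.4], "stop B": [0.15, 0.5]}, w, b)
```

Because the weights are learned from data, a method that is informative on one dataset can receive a large weight there and a small one elsewhere, which is the generalization argument made in the abstract.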

2019
Vol 1280
pp. 022025
Author(s):  
W Uriawan
A Kodir
A R Atmadja
F Fathurrahman
M A Ramdhani

2008 ◽  
Author(s):  
Petronella Anbeek ◽  
Koen L. Vincken ◽  
Max A. Viergever

This paper proposes a new method for fully automated multiple sclerosis (MS) lesion segmentation in cranial magnetic resonance (MR) imaging. The algorithm uses the T1-weighted and the fluid attenuation inversion recovery scans. It is based on the K-Nearest Neighbor (KNN) classification technique. The data were acquired at the Children's Hospital Boston (CHB) and the University of North Carolina (UNC). Manual segmentations, composed by a human expert of the CHB, were used to train the KNN classifier. The method uses voxel location and signal intensity information to determine, for each voxel, the probability of being a lesion, thus generating probabilistic segmentation images. Binary segmentations are derived by applying a threshold to the probabilistic images. Automatic segmentations were performed on a set of testing images and compared with manual segmentations from a CHB and a UNC expert rater. Furthermore, a combined segmentation was composed from the segmentations of different algorithms and used for evaluation. The proposed method shows good resemblance with the segmentations of the CHB rater. High specificity and lower sensitivity have been observed in comparison with the combined segmentations. Over- and undersegmentation can easily be corrected in this procedure by varying the threshold on the probabilistic segmentation image. The proposed method offers an automated and fully reproducible approach that is accurate and applicable to standard clinical MR images.
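
The probabilistic step described above can be sketched as follows, assuming that the lesion probability of a voxel is the fraction of its k nearest training voxels labelled as lesion. The three-value feature vectors (a normalised position, a T1 intensity, a FLAIR intensity) and all numbers are hypothetical toy data, not the paper's features.

```python
import math

def knn_lesion_probability(voxel, training, k=3):
    """Fraction of the k nearest training voxels that are labelled lesion (1)."""
    nearest = sorted(training, key=lambda item: math.dist(voxel, item[0]))[:k]
    return sum(label for _, label in nearest) / k

def segment(voxels, training, k=3, threshold=0.5):
    """Return the probabilistic image and the binary mask derived from it."""
    probs = [knn_lesion_probability(v, training, k) for v in voxels]
    mask = [1 if p >= threshold else 0 for p in probs]
    return probs, mask

# Hypothetical training voxels: ((position, T1, FLAIR), label)
training = [
    ((0.20, 0.30, 0.90), 1), ((0.25, 0.35, 0.85), 1), ((0.30, 0.30, 0.80), 1),
    ((0.70, 0.60, 0.20), 0), ((0.80, 0.70, 0.30), 0), ((0.75, 0.65, 0.25), 0),
    ((0.50, 0.50, 0.55), 0),
]
voxels = [(0.22, 0.30, 0.88), (0.45, 0.45, 0.60)]

probs, mask_default = segment(voxels, training, threshold=0.5)
_, mask_strict = segment(voxels, training, threshold=0.7)
```

Raising the threshold turns the borderline second voxel from lesion to non-lesion, which is how over- and undersegmentation can be traded off without retraining.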


2019
Vol 8 (3)
pp. 366-376
Author(s):  
Annisa Sugesti ◽  
Moch. Abdul Mukid ◽  
Tarno Tarno

Credit feasibility analysis is important for lenders to limit risk as the number of credit applications increases. This analysis can be carried out with a classification technique. The classification technique used in this research is instance-based classification. Such techniques tend to be simple but depend strongly on the choice of K, the number of nearest neighbors considered when classifying new data. A small value of K is very sensitive to outliers. This weakness can be overcome by an algorithm that is able to handle outliers, such as Mutual K-Nearest Neighbor (MKNN). MKNN first removes outliers, then predicts the class of a new observation from the majority class of its mutual nearest neighbors. The algorithm is compared with KNN without outliers. The models are evaluated by 10-fold cross-validation, and classification performance is measured by the geometric mean (G-Mean) of sensitivity and specificity. Based on the analysis, the optimal value of K is 9 for MKNN and 3 for KNN; the highest G-Mean produced by KNN is 0.718, while that produced by MKNN is 0.702. The best alternative for classifying credit feasibility in this study is the K-Nearest Neighbor (KNN) algorithm with K = 3.
Keywords: Classification, Credit, MKNN, KNN, G-Mean.
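
Two pieces of this pipeline can be sketched briefly: the outlier-removal step and the G-Mean score. The sketch assumes that a training point counts as an outlier when none of its k nearest neighbors also lists it among their own k nearest neighbors. That reading of "mutual", the toy points, and the function names are assumptions for illustration, and the G-Mean function assumes both classes are present.

```python
import math

def knn_indices(query, points, k, skip=None):
    """Indices of the k points closest to query, optionally skipping one index."""
    order = sorted((i for i in range(len(points)) if i != skip),
                   key=lambda i: math.dist(query, points[i]))
    return order[:k]

def remove_outliers(points, labels, k):
    """Keep only points that have at least one mutual k-nearest neighbor."""
    neigh = [set(knn_indices(p, points, k, skip=i)) for i, p in enumerate(points)]
    keep = [i for i in range(len(points)) if any(i in neigh[j] for j in neigh[i])]
    return [points[i] for i in keep], [labels[i] for i in keep]

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return math.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

# Two tight clusters plus one isolated point at (20, 20).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (20, 20)]
labels = [0, 0, 0, 1, 1, 1, 0]
clean_points, clean_labels = remove_outliers(points, labels, k=2)

score = g_mean([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
```

The isolated point names two cluster members as its neighbors, but neither of them names it back, so it is dropped before any voting takes place.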


Author(s):  
Ahmed Wasif Reza ◽  
Abdullah Al Rifat ◽  
Tanvir Ahmed

Indoor network optimization is not a simple task because of obstacles, interference, and signal attenuation in the environment. Intense noise can affect the intelligibility of the signal and significantly reduce coverage strength, which results in a poor user experience. Most existing work addresses finding the location of devices through various mathematical and generic algorithmic approaches, but very few studies focus on applying machine learning algorithms. The purpose of this research is to introduce an integrated machine learning model that finds maximum indoor coverage with a minimum number of transmitters. Users in the indoor environment are allocated based on the most reliable signal strength, and the system is also capable of allocating new users. K-means clustering, K-nearest neighbor (KNN), support vector machine (SVM), and Gaussian Naïve Bayes (GNB) have been used to provide an optimized solution. KNN, SVM, and GNB obtained a maximum accuracy of 100% in some cases. Among all the algorithms, however, KNN performed best, with an average accuracy of 93.33%. The K-fold cross-validation (Kf-CV) technique has been added to validate the experimental simulations and re-evaluate the outcomes of the machine learning models.
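
The K-fold cross-validation step can be sketched as below. This is a generic illustration that splits sample indices into contiguous folds without shuffling; the function names and the dummy scoring callback are hypothetical, and the abstract does not state how the folds were built.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def cross_validate(n_samples, n_folds, fit_and_score):
    """Average the score returned by fit_and_score(train_idx, test_idx) over all folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_indices(n_samples, n_folds)]
    return sum(scores) / len(scores)

folds = list(k_fold_indices(10, 3))

# Dummy callback standing in for "train a model on train_idx, score it on test_idx".
mean_score = cross_validate(10, 3, lambda train, test: len(test) / 10)
```

Every sample is used for testing exactly once, so the averaged score is less dependent on one particular train/test split than a single hold-out run.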


Author(s):  
Raemon S Saljumairi ◽  
Sarjon Defit ◽  
S Sumijan ◽  
Yusma Elda

Current wireless technology can be used to find out where a user is in a room. The strength of the WiFi signal from an Access Point (AP) can provide information on the user's position in a room; the alternative considered here determines the user's position using the WiFi Received Signal Strength (RSS). This research was conducted by comparing the distance between the user and two or more APs using the Euclidean distance technique. The Euclidean distance between two points in a plane or in 3-dimensional space is the length of the line segment connecting them. This technique is best for representing the distance between the users and the AP. The RSS data were collected using the fingerprinting technique: 20 APs were detected with the WiFi analyzer application, and the scan yielded 709 RSS records. The RSS values are used as training data. K-Nearest Neighbor (K-NN) predicts the class of new test data from its neighborhood, that is, from the training records closest to the new test data. The tests yielded an accuracy of 95% with K = 3. The K-NN method therefore gave excellent results, with a highest accuracy of 95% and a minimum error of 5%.
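
A minimal sketch of fingerprint matching with Euclidean distance and K = 3 follows. The three-AP fingerprints, the dBm values, and the room labels are hypothetical toy data; the study itself used 20 APs and 709 records.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Length of the segment connecting two RSS vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def locate(rss, fingerprints, k=3):
    """Majority label among the k fingerprints closest to the measured RSS vector."""
    nearest = sorted(fingerprints, key=lambda fp: euclidean(rss, fp[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical fingerprints: (RSS in dBm from three APs, location label)
fingerprints = [
    ((-40, -70, -80), "room A"), ((-42, -68, -82), "room A"), ((-45, -72, -79), "room A"),
    ((-75, -45, -60), "room B"), ((-78, -43, -62), "room B"), ((-73, -48, -58), "room B"),
]

position_1 = locate((-44, -69, -81), fingerprints)
position_2 = locate((-76, -44, -61), fingerprints)
```

An odd K such as 3 avoids tied votes when there are only two candidate locations.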


2017
Vol 11 (1)
pp. 42
Author(s):  
Yampi R Kaesmitan ◽  
Jusrianto A Johannis

Nutritional status is the state of the body that results from food consumption patterns and the use of nutrients. Determining the nutritional status of children is useful for assessing infant nutrition based on BB/U (weight for age), TB/U (height for age), and BB/TB (weight for height). The system designed here determines the nutritional status of children using K-NN (K-Nearest Neighbor), a method that classifies test data of unknown class according to its nearest neighbors, which are found with a distance formula. The variables used in this system are anthropometric data, that is, measurements of the human body: U (age), BB (weight), TB (height), and LK (head circumference).


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

In science and technology development, brain-computer interfaces (BCI) play a vital role in research. Classification is a data mining technique used to predict group membership for data instances. Analyses of BCI data are challenging because feature extraction and classification of these data are more difficult than for raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on the BCI dataset using different techniques: Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, we propose the SVM classification algorithm for the BCI dataset.
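
The normalization and binning steps can be sketched as below. The abstract does not say which variants were used, so this sketch assumes min-max scaling followed by equal-width binning; both choices and the function names are assumptions for illustration.

```python
def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bins(values, n_bins):
    """Map values in [0, 1] to integer bin ids 0 .. n_bins - 1."""
    return [min(int(v * n_bins), n_bins - 1) for v in values]

# Hypothetical values of one extracted feature across four EEG segments.
feature = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_normalize(feature)
binned = equal_width_bins(scaled, n_bins=4)
```

Binning replaces small numeric fluctuations with a coarse bin id, which is one way the noise reduction mentioned above can come about.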


2020
Vol 17 (1)
pp. 319-328
Author(s):  
Ade Muchlis Maulana Anwar ◽  
Prihastuti Harsani ◽  
Aries Maesya

Population data are individual or aggregate data structured as a result of Population Registration and Civil Registration activities. A birth certificate is a Civil Registration deed that records the birth of a baby; the reported birth is registered on the Family Card and given a Population Identification Number (NIK) as a basis for obtaining other community services. Of the 570,637 integrated birth certificate reports recorded in the Population Administration Information System (SIAK) for 2018, 503,946 were reported late and only 66,691 were reported publicly. Clustering is a method that places similar data in one group and dissimilar data in other groups. K-Nearest Neighbor is a method for classifying objects based on the learning data closest to the test data. K-means is a method that divides a number of objects into groups by looking at the midpoint (centroid) of each group. In data mining preprocessing, the data are cleaned by filling blank values with the most frequent value, and attributes are selected using the information gain method. The k-nearest neighbor method was used to predict delays in reporting and the k-means method to group priority service areas, using 10,000 birth certificate records from 2019. Performance is good enough: the predictions reach an accuracy of 74.00%, and k-means with K = 2 produces a Davies-Bouldin index of 1.179.
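
The two preprocessing steps named above, filling blanks with the most frequent value and ranking attributes by information gain, can be sketched as follows. The attribute values and the "late" / "on_time" labels are hypothetical toy data, and the function names are illustrative.

```python
import math
from collections import Counter

def fill_with_mode(column, missing=None):
    """Replace missing entries with the most frequent observed value."""
    mode = Counter(v for v in column if v != missing).most_common(1)[0][0]
    return [mode if v == missing else v for v in column]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(attribute, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(attribute):
        subset = [l for a, l in zip(attribute, labels) if a == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

labels = ["late", "late", "on_time", "on_time"]
distance = fill_with_mode(["far", None, "near", "near"])  # blank filled with "near"
gain_informative = information_gain(["far", "far", "near", "near"], labels)
gain_useless = information_gain(["a", "b", "a", "b"], labels)
```

An attribute that separates the classes perfectly keeps the full label entropy as gain, while one that is unrelated to the labels scores zero and would be dropped during attribute selection.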

