Rancang Bangun Sistem Informasi Untuk Menentukan Kapabilitas Konsumen Dalam Mengambil Pinjaman KPR

Indonesia has a relatively large population. Over a five-year period, the government runs an annual program to provide 1 million FLPP housing units, in an effort to give low-income people access to decent homes. FLPP housing development requires precision and speed on the developer's part, but it is often held up by the bank process, because the outcome and speed of the bank's data processing are difficult to predict. Knowing in advance whether a consumer is able to obtain subsidized credit has several advantages: developers can plan cash flow better, and they can replace consumers who would be rejected before those consumers enter the bank process. For that reason, a system was built to help developers. Many methods could be used to create such an application; one of them is data mining with a classification tree. Under 10-fold cross-validation, the application achieves an accuracy of 92%. Index Terms: Data Mining, Classification Tree, Housing, FLPP, 10-fold Cross-Validation, Consumer Capability
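As a rough illustration of the 10-fold cross-validation named in the abstract, the sketch below splits sample indices into ten folds and averages a score over them. It is not the authors' implementation: the function names, the label data, and the majority-class baseline standing in for the classification tree are all hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Average the score returned by train_and_score(train_idx, test_idx)."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except fold i is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k

# Hypothetical labels: 1 = credit approved, 0 = credit rejected.
labels = [1] * 70 + [0] * 30

def majority_baseline(train_idx, test_idx):
    """Stand-in model: always predict the most common training label."""
    train = [labels[i] for i in train_idx]
    guess = max(set(train), key=train.count)
    return sum(labels[i] == guess for i in test_idx) / len(test_idx)

mean_accuracy = cross_validate(len(labels), majority_baseline)
```

Each record is used for testing exactly once, which is why the averaged accuracy is usually treated as a less optimistic estimate than accuracy measured on the training data.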

Rule Extraction from Privacy Preserving Neural Network: Application to Banking

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.403-408.920 ◽ 2011 ◽ Vol 403-408 ◽ pp. 920-928 ◽ Cited By ~ 1

Author(s): Nekuri Naveen ◽ V. Ravi ◽ C. Raghavendra Rao

Keyword(s): Neural Network ◽ Data Mining ◽ Privacy Preservation ◽ Cross Validation ◽ Hybrid Approach ◽ Rule Extraction ◽ Privacy Preserving ◽ Preservation Method ◽ Network Application ◽ Fold Cross Validation

Over the last two decades, privacy policies in areas such as banking, finance, and medical research have restricted data owners from sharing data for data mining purposes. This issue has opened up a new area of research, namely privacy-preserving data mining. In this paper, we propose a privacy preservation method that employs a Particle Swarm Optimization (PSO) trained Auto Associative Neural Network (PSOAANN). The modified (privacy-preserved) input values are fed to a decision tree (DT) and to a rule induction algorithm, Ripper, for rule extraction. The performance of the hybrid is tested on four benchmark and bankruptcy datasets using 10-fold cross-validation. The results are compared with those obtained using the original datasets, where privacy is not preserved. The proposed hybrid approach achieved good results on all datasets.

A new GIS-based data mining technique using an adaptive neuro-fuzzy inference system (ANFIS) and k-fold cross-validation approach for land subsidence susceptibility mapping

Natural Hazards ◽ 10.1007/s11069-018-3449-y ◽ 2018 ◽ Vol 94 (2) ◽ pp. 497-517 ◽ Cited By ~ 36

Author(s): Omid Ghorbanzadeh ◽ Hashem Rostamzadeh ◽ Thomas Blaschke ◽ Khalil Gholaminia ◽ Jagannath Aryal

Keyword(s): Data Mining ◽ Land Subsidence ◽ Fuzzy Inference System ◽ Cross Validation ◽ Fuzzy Inference ◽ Data Mining Technique ◽ Inference System ◽ Mining Technique ◽ Neuro Fuzzy ◽ Fold Cross Validation

Analisis Sentimen Twitter terhadap Tokoh Publik dengan Algoritma Naive Bayes dan Support Vector Machine

Simetris Jurnal Teknik Mesin Elektro dan Ilmu Komputer ◽ 10.24176/simet.v11i2.4568 ◽ 2021 ◽ Vol 11 (2) ◽ pp. 626-636

Author(s): Tanthy Tawaqalia Widowati ◽ Mujiono Sadikin

Keyword(s): Data Mining ◽ Support Vector Machine ◽ Cross Validation ◽ Naive Bayes ◽ Naïve Bayes ◽ Support Vector ◽ Fold Cross Validation

Twitter is one of the growing social media platforms. It makes it easy for people to express opinions freely through posts, commonly called tweets. Netizens freely share their personal opinions on any topic, including their perceptions of public figures. This article presents the results of research and analysis of public (netizen) sentiment toward a public figure, Nadiem Makariem, as the new Minister of Education and Culture. The study uses data mining techniques to compare classification results for public opinion written on Twitter. The dataset comes from tweets matching the keywords "nadiem makariem", "kemendikbud", and "pak nadiem". RapidMiner was used to support pre-processing and classification with two methods, Naive Bayes and Support Vector Machine, evaluated with k-fold cross-validation. The experiments show that, for the case studied, Naive Bayes performs better, with an accuracy of 91.48%, precision of 89.28%, and recall of 91.58%.
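The three figures reported above (accuracy, precision, recall) all derive from the counts of a confusion matrix. The following is a minimal sketch under stated assumptions: the function name, the label strings, and the eight example tweets are hypothetical, and it does not reproduce how RapidMiner computes these values.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Return (accuracy, precision, recall) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical sentiment labels for eight tweets.
y_true = ["positive"] * 4 + ["negative"] * 4
y_pred = ["positive", "positive", "positive", "negative",
          "positive", "negative", "negative", "negative"]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
```

Precision asks how many tweets predicted positive really were positive, while recall asks how many truly positive tweets were found, so the two can diverge even when accuracy looks high.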

Uji Performa Algoritma Naïve Bayes untuk Prediksi Masa Studi Mahasiswa

Creative Information Technology Journal ◽ 10.24076/citec.2019v6i1.178 ◽ 2020 ◽ Vol 6 (1) ◽ pp. 1

Author(s): Irkham Widhi Saputro ◽ Bety Wulan Sari

Keyword(s): Data Mining ◽ Cross Validation ◽ Naive Bayes ◽ Confusion Matrix ◽ Naïve Bayes ◽ Study Program ◽ New Students ◽ Using Data ◽ The Many ◽ Fold Cross Validation

AMIKOM Yogyakarta University is one of the universities that enrolls thousands of new students, especially in the Informatics study program. In 2012 there were 1009 new students, and in 2013 there were 859. Unfortunately, only around 50% of these students graduate on time. These data are used to build a classification system using data mining with the Naïve Bayes method. The dataset consists of 300 records drawn from alumni of the 2012 and 2013 cohorts, 150 records from each. It contains 144 students labeled as graduating on time and 156 labeled as not graduating on time. Testing is carried out using 10-fold cross-validation and a confusion matrix. The test results show that the average performance of the Naïve Bayes model is an accuracy of 68%, a precision of 61.3%, a recall of 65.3%, and an f1-score of 61%. The model's performance can be influenced by the dataset used to build it. Keywords: data mining, classification, Naïve Bayes, K-Fold Cross Validation, Confusion Matrix, graduation time

Komparasi Algoritma Klasifikasi Data Mining untuk Memprediksi Tingkat Kematian Dini Kanker dengan Dataset Early Death Cancer

JOINTECS (Journal of Information Technology and Computer Science) ◽ 10.31328/jointecs.v4i2.1008 ◽ 2019 ◽ Vol 4 (2) ◽ pp. 63

Author(s): Panny Agustia Rahayuningsih

Keyword(s): Neural Network ◽ Data Mining ◽ Random Forest ◽ Cross Validation ◽ Naive Bayes ◽ Early Death ◽ Naïve Bayes ◽ T Test ◽ Fold Cross Validation

Cancer is among the ten deadliest diseases in the world. It is malignant and difficult to cure once it has spread too widely, but detecting cancer cells as early as possible can reduce the risk of death. This study aims to predict the rate of early cancer death in the European population using five classification algorithms, Decision Tree, Naïve Bayes, k-Nearest Neighbour, Random Forest, and Neural Network, and to determine which of them performs best for this task. Testing proceeds through several research stages: dataset (data collection), initial data processing, the proposed method, method testing using 10-fold cross-validation, evaluation of results, and a t-test for differences. The alpha value used is 0.05. If the probability is > 0.05, H0 is accepted; if the probability is < 0.05, H0 is rejected. The best performance, an accuracy of 98.35%, was obtained by the Neural Network algorithm. According to the t-test, the best models are Random Forest and Neural Network; Naïve Bayes is fairly good, Decision Tree is adequate, and the weakest algorithm is K-Nearest Neighbour (K-NN).
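The decision rule with alpha = 0.05 can be sketched as a paired t-test over per-fold accuracies. This is an illustration only: the abstract does not say which t-test variant was used, so the paired form is an assumption, and the fold scores below are invented. Comparing |t| with the two-tailed critical value for 9 degrees of freedom (2.262) is equivalent to comparing the p-value with 0.05.

```python
import math
from statistics import mean, stdev

# Two-tailed critical value of Student's t for alpha = 0.05 and df = 9
# (ten folds give nine degrees of freedom).
T_CRITICAL_DF9 = 2.262

def paired_t_statistic(scores_a, scores_b):
    """t statistic for the mean per-fold difference between two models.

    Raises ZeroDivisionError if every fold has exactly the same difference.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error

# Hypothetical accuracies of two classifiers on the same ten folds.
model_a = [0.98, 0.97, 0.99, 0.98, 0.99, 0.97, 0.98, 0.99, 0.98, 0.98]
model_b = [0.90, 0.91, 0.89, 0.92, 0.90, 0.88, 0.91, 0.90, 0.89, 0.92]

t_value = paired_t_statistic(model_a, model_b)
# |t| above the critical value corresponds to p < 0.05, so H0 is rejected.
reject_h0 = abs(t_value) > T_CRITICAL_DF9
```

Pairing the scores fold by fold removes the variation that comes from some folds being harder than others, which is why it is a common choice when both models are evaluated on identical splits.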

Analisis Komparatif Evaluasi Performa Algoritma Klasifikasi pada Readmisi Pasien Diabetes

Jurnal Buana Informatika ◽ 10.24002/jbi.v7i4.770 ◽ 2016 ◽ Vol 7 (4)

Author(s): Mochammad Yusa ◽ Ema Utami ◽ Emha T. Luthfi

Keyword(s): Data Mining ◽ Decision Tree ◽ Cross Validation ◽ Nearest Neighbor ◽ Naive Bayes ◽ Kappa Statistic ◽ Naïve Bayes ◽ Validation Dataset ◽ K Nearest Neighbor ◽ Fold Cross Validation

Abstract. Readmission is associated with measures of the quality of patient care in hospitals. The many attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, age, and others, tend to make the calculation of care quality complicated. Data mining classification techniques, which have developed considerably, can address this problem. In this paper, three classifiers, Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated under various parameter settings using 10-fold cross-validation. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and the Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE. The Indonesian-language version of the abstract instead reports that the best accuracy, MAE, and Kappa Statistic were all obtained with the Naive Bayes model. Keywords: k-NN, naive bayes, diabetes, readmission
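To show why the Kappa Statistic is reported alongside accuracy, here is a small sketch that assumes the abstract means Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The function name and the label data are hypothetical.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected) / (1 - expected) agreement.

    Raises ZeroDivisionError when expected agreement is exactly 1,
    for example when only one class occurs in both lists.
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(predicted = c).
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical balanced case: 80% accuracy gives kappa of about 0.6.
balanced_true = ["yes"] * 50 + ["no"] * 50
balanced_pred = ["yes"] * 40 + ["no"] * 10 + ["yes"] * 10 + ["no"] * 40
balanced_kappa = cohen_kappa(balanced_true, balanced_pred)

# Hypothetical imbalanced case: always predicting "no" is 90% accurate,
# yet kappa is 0 because that agreement is what chance alone would give.
skewed_true = ["no"] * 90 + ["yes"] * 10
skewed_pred = ["no"] * 100
skewed_kappa = cohen_kappa(skewed_true, skewed_pred)
```

The second case is the reason kappa is useful on imbalanced data such as readmission records, where a classifier can score high accuracy by favoring the majority class.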

Access to Affordable Houses for the Low-Income Urban Dwellers in Kigali: Analysis Based on Sale Prices

Land ◽ 10.3390/land9030085 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 85 ◽ Cited By ~ 1

Author(s): Ernest Uwayezu ◽ Walter T. de Vries

Keyword(s): Real Estate ◽ Low Income ◽ Housing Prices ◽ Construction Materials ◽ Housing Affordability ◽ Housing Development ◽ Bank Loan ◽ Real Estate Developers ◽ The Government ◽ Housing Units

The government of Rwanda recently passed housing development regulations and funding schemes that aim to promote access to affordable houses for low- and middle-income inhabitants of Kigali city. Existing studies on housing affordability in this city have not yet discussed whether this government-supported programme is likely to promote access to housing for these target beneficiaries. This study applies the price-to-income ratio (PIR) approach and the 30-percent-of-household-income standard, applied through a bank loan, to assess whether housing units developed in the framework of affordable housing schemes are affordable at all for the target recipients. It relies mainly on housing price schemes held by real estate developers, data on household incomes collected through a household survey, and a review of existing studies and socio-economic census reports. Findings reveal that the developed housing units are seriously and severely unaffordable for most of the target beneficiaries, especially the lowest-income urban dwellers, due to the high costs of housing development combined with the high profits expected by real estate developers. The study suggests policy and practical options for promoting inclusive urban (re)development and housing affordability for various categories of Kigali city inhabitants. These options include upgrading the existing informal settlements and converting them into shared apartments through collaboration between property owners and real estate developers, developing affordable rental housing for low-income tenants, tax exemption on construction materials, progressive housing ownership through a rent-to-own approach, and incremental self-help housing development using low-cost local materials.
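The two affordability measures named in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the study's procedure: the function names are hypothetical, the whole price is assumed to be financed with a standard fixed-rate annuity loan and no down payment, and no figures from the paper are used.

```python
def price_to_income_ratio(house_price, annual_income):
    """PIR: how many years of gross household income the house costs."""
    return house_price / annual_income

def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment on an annuity loan (standard amortization formula)."""
    n = years * 12           # number of monthly payments
    r = annual_rate / 12     # monthly interest rate
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)

def meets_30_percent_rule(house_price, monthly_income, annual_rate, years, share=0.30):
    """True if the loan payment stays within the given share of monthly income.

    Assumes the full price is borrowed, which is a simplification.
    """
    return monthly_payment(house_price, annual_rate, years) <= share * monthly_income
```

For example, with purely hypothetical inputs, `price_to_income_ratio(30_000, 3_000)` gives a PIR of 10, and `meets_30_percent_rule(30_000, 250, 0.0, 20)` is false because the 125 payment exceeds 30% of a 250 monthly income.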

A Study of Information Systems in Indian Railways with Specific Reference to Konkan Railway Application Package

Integrating E-Business Models for Government Solutions ◽ 10.4018/978-1-60566-240-4.ch014 ◽ 2010 ◽ pp. 224-250

Author(s): Sanjay Nayyar ◽ Vinayshil Gautam ◽ M. P. Gupta

Keyword(s): Information Technology ◽ Developing Countries ◽ Information Systems ◽ Technology Management ◽ Low Income ◽ Large Population ◽ Specific Reference ◽ Information Technology Management ◽ Technology Applications ◽ The Government

The railroad sector in developing countries, like other service sectors (e.g. electricity, post and telegraphs, health, and transport), is still administered by the government in many countries. Organizations providing these services have a large geographical spread, an assured market, and an administered price regime. They function under twin pressures. The first is to operate as an entity with commercial goals and thereby be financially self-sufficient, a compulsion imposed by the financial squeeze on the governments that support these organizations through budgetary grants. The second is to meet a large public service obligation, a constraint imposed by a large population with low income levels. Information technology management in such organizations evolves in a scenario marked by these conflicting pressures. The chapter looks at the evolution of information technology applications in the railroads of select countries. Particular focus is given to Indian Railways, in an attempt to identify the information systems issues it faces. Specific reference is then made to the Konkan Railway enterprise systems, which yielded lessons for the development and implementation of large information systems in railroads. These lessons could be of substantial value in developing a sound theoretical framework for information technology management practices in the services sector in developing countries.

Predicting Win-Loss outcomes in MLB regular season games – A comparative study using data mining methods

International Journal of Computer Science in Sport ◽ 10.1515/ijcss-2016-0007 ◽ 2016 ◽ Vol 15 (2) ◽ pp. 91-112 ◽ Cited By ~ 11

Author(s): C. Soto Valero

Keyword(s): Data Mining ◽ Cross Validation ◽ Data Contamination ◽ Past Data ◽ Mining Methods ◽ Using Data ◽ New Statistics ◽ Fold Cross Validation ◽ Better Than ◽ Model Approach

Baseball is a statistics-rich sport, and predicting the winner of a particular Major League Baseball (MLB) game is an interesting and challenging task. So far there is no definitive formula for determining which factors lead a team to victory, but many trends can emerge from the analysis of many years of historical records. Recent studies have concentrated on using and generating new statistics, called sabermetrics, to rank teams and players according to their perceived strengths and then applying these rankings to forecast specific games. In this paper, we employ sabermetrics statistics to assess the predictive capabilities of four data mining methods (classification and regression based) for predicting outcomes (win or loss) in MLB regular season games. Our modeling approach uses only past data when making a prediction, drawing on ten years of publicly available data. We create a dataset of cumulative sabermetrics statistics for each MLB team during this period, constructed so that data contamination is not possible. The inherent difficulty of this specific sports prediction task is confirmed using two geometry- or topology-based measures of data complexity. Results reveal that the classification scheme forecasts game outcomes better than the regression scheme and that, of the four data mining methods used, SVMs produce the best predictive results, with a mean prediction accuracy of nearly 60% for each team. The model is evaluated using stratified 10-fold cross-validation.
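Stratified cross-validation, mentioned at the end of the abstract, keeps the win/loss proportions roughly the same in every fold. The sketch below shows one simple way to build such folds; it is not the paper's code, and the function name and the 60/40 label split are hypothetical.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=10, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin, so per-class counts in any
        # two folds differ by at most one.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Hypothetical season outcomes: 60 wins and 40 losses.
outcomes = ["win"] * 60 + ["loss"] * 40
folds = stratified_k_fold(outcomes)
wins_per_fold = [sum(outcomes[i] == "win" for i in fold) for fold in folds]
```

Without stratification, a random split could leave some folds with very few losses, which would make per-fold accuracy noisier than the underlying model warrants.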

Perbandingan Normalisasi Data untuk Klasifikasi Wine Menggunakan Algoritma K-NN

Computer Engineering Science and System Journal ◽ 10.24114/cess.v4i1.11458 ◽ 2019 ◽ Vol 4 (1) ◽ pp. 78

Author(s): Darnisa Azzahra Nasution ◽ Hidayah Husnul Khotimah ◽ Nurul Chamidah

Keyword(s): Data Mining ◽ Cross Validation ◽ Z Score ◽ Score Normalization ◽ Fold Cross Validation

Abstract: Unbalanced value ranges across attributes can affect the quality of data mining results, so data preprocessing is needed. This preprocessing is expected to improve the accuracy of classifying the wine dataset. The preprocessing method used is data transformation by normalization. Three normalization approaches are applied: min-max normalization, z-score normalization, and decimal scaling. The data produced by each normalization method are compared to find the best classification accuracy using the K-NN algorithm. The K values used in the comparison are 1, 3, 5, 7, 9, and 11. Before classification, the normalized wine dataset is divided into test data and training data using k-fold cross-validation, with k equal to 10. The K-NN classification tests show that the best accuracy, 65.92%, is obtained on the wine dataset normalized with min-max normalization at K = 1. The average obtained is 59.68%. Keywords: Normalization, K-fold cross validation, K-NN
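The three normalization methods compared in the abstract can be sketched for a single attribute column as follows. The function names and sample values are hypothetical, and using the population standard deviation for the z-score is an assumption, since the abstract does not specify which variant was used.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale to [0, 1]. Fails on a constant column (division by zero)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def decimal_scaling(values):
    """Divide by 10**j, with j the smallest integer giving max(|v|) / 10**j < 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

# Hypothetical attribute column with a wide value range.
column = [200, 400, 600, 1000]
scaled_min_max = min_max(column)          # [0.0, 0.25, 0.5, 1.0]
scaled_z = z_score(column)
scaled_decimal = decimal_scaling(column)  # [0.02, 0.04, 0.06, 0.1]
```

Because K-NN relies on distances between records, an attribute with a large raw range would otherwise dominate the distance, which is the motivation for normalizing before classification.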
