Sample Data
Recently Published Documents





Tuğçe Ayhan ◽  
Tamer Uçar

The demand for credit is increasing constantly. Banks are looking for credit evaluation methods that provide the most accurate results in a shorter period in order to minimize their rising risks. This study focuses on methods that enable banks to increase their asset quality without market loss in the credit allocation process. These methods enable the automatic evaluation of loan applications in line with sector practices, and enable the determination of credit policies and strategies based on actual needs. Within the scope of this study, the relationship between the predetermined attributes and the credit limit outputs is analyzed using a sample data set of consumer loans. Random forest (RF), sequential minimal optimization (SMO), PART, decision table (DT), J48, multilayer perceptron (MP), JRip, naïve Bayes (NB), one rule (OneR) and zero rule (ZeroR) algorithms were used in this process. As a result of this analysis, SMO, PART and random forest algorithms are the top three approaches for determining customer credit limits.
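The two simplest algorithms in the list above, ZeroR and OneR, can be sketched in a few lines to show what the stronger classifiers are being compared against. This is a minimal illustration, not the study's implementation: the attribute values, the labels, and the function names `zero_r` and `one_r` are hypothetical.

```python
from collections import Counter, defaultdict

def zero_r(labels):
    """ZeroR: ignore all attributes and always predict the majority class."""
    return Counter(labels).most_common(1)[0][0]

def one_r(rows, labels):
    """OneR: pick the single attribute whose value-to-class rule makes the
    fewest errors on the training data.
    Returns (attribute_index, rule_dict, training_errors)."""
    best = None
    for a in range(len(rows[0])):
        # Count class labels for every value of attribute a.
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[a]][label] += 1
        # Each attribute value predicts its most frequent class.
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(rule[row[a]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best

# Hypothetical toy data: (employment type, housing status) -> credit limit band.
rows = [
    ("salaried", "owner"),
    ("salaried", "renter"),
    ("self-employed", "owner"),
    ("self-employed", "renter"),
    ("salaried", "owner"),
]
labels = ["high", "high", "low", "low", "high"]

baseline = zero_r(labels)
attribute, rule, errors = one_r(rows, labels)
```

On this toy data ZeroR always answers with the majority band, while OneR selects the employment attribute because its rule makes no training errors. Any candidate model should at least beat these two baselines.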

2022 ◽  
Vol 13 (2) ◽  
pp. 0-0

Pulmonary disease is widespread worldwide and includes persistent obstruction of the lungs, pneumonia, asthma, and tuberculosis (TB), among others. Prompt diagnosis of lung conditions is essential, and machine learning models have been developed for this purpose. Many deep learning techniques, including the CNN and the capsule network, are used for lung disease prediction. A basic CNN performs poorly on rotated, tilted, or otherwise irregularly oriented images. Therefore, by integrating the spatial transformer network (STN) with a CNN, we propose a new hybrid deep learning architecture named STNCNN. The new model is applied to the NIH chest X-ray image dataset from the Kaggle repository. STNCNN achieves an accuracy of 69% on the entire dataset, while the accuracy values of vanilla grey, vanilla RGB, and hybrid CNN are 67.8%, 69.5%, and 63.8%, respectively. When the sample data set is used, STNCNN takes much less time to train at the cost of slightly less reliable validation. The proposed STNCNN system therefore simplifies the diagnosis of lung disease for both specialists and physicians.

2022 ◽  
Vol 10 (4) ◽  
pp. 605-616
Jody Hendrian ◽  
Suparti Suparti ◽  
Alan Prahutama

Investing in gold is a flexible choice because it can be sold at any time and used as an emergency fund. Investors need to be able to forecast prices over time to achieve their investment goals. One of the statistical methods for modeling time series data is ARIMA. The ARIMA model imposes strict assumptions: the data must be stationary, and the residuals must be normally distributed, independent, and of constant variance. An alternative is therefore proposed, namely a nonparametric regression model, which has no such modeling assumptions. In this study, the daily world gold price data are modeled using a local polynomial nonparametric model because the ARIMA assumptions are not fulfilled. The data are divided into two parts: in-sample data from January 2, 2020 to November 30, 2020, used to form the model, and out-of-sample data from December 1, 2020 to December 31, 2020, used to evaluate model performance based on MAPE values. The best model chosen is the local polynomial model with a Gaussian kernel function of degree 5, a bandwidth of 373, and a local point of 1744, with an MSE value of 482.6420. The MAPE of the local polynomial model on the out-of-sample data is 0.61%, indicating that the model has excellent forecasting capability. In this study, a Graphical User Interface (GUI) was also built in R with the shiny package, making data analysis easier and producing more interactive output.
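The two ingredients named in this abstract, a Gaussian-kernel local polynomial fit and the MAPE criterion, can be sketched as follows. This is a simplified illustration under stated assumptions: it fits a local polynomial of degree 1 (local linear) rather than the degree 5 reported above, the data are made-up toy values rather than gold prices, and the function names are hypothetical.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def local_linear(xs, ys, x0, bandwidth):
    """Gaussian-kernel local linear estimate of y at x0.

    Solves the weighted least squares problem
        min over (b0, b1) of  sum_i w_i * (y_i - b0 - b1 * (x_i - x0))**2
    in closed form and returns b0, the fitted value at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

# Toy data on an exact line y = 2x + 1 (an assumption for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]

fitted = local_linear(xs, ys, 2.5, bandwidth=1.0)
error_pct = mape([100.0, 200.0], [110.0, 190.0])
```

Because the toy data lie exactly on a line, the local linear fit reproduces the line at any evaluation point. On real price data the bandwidth controls how strongly distant observations are down-weighted, which is why it is tuned alongside the polynomial degree.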

2022 ◽  
Ayatu Usman ◽  
Geogerbest Azuoko ◽  
Joshua Chizoba ◽  
Ifeanyi Chinwuko

Aeromagnetic and core drilling data covering parts of the southern Nupe Basin were acquired and interpreted with a view to evaluating the mineral potential of the area through interpretation of its structural features, determination of the Curie isotherm depth, and correlation of the aeromagnetic outcomes with the core sample data from the area. Two major regional fault trends were interpreted, trending northeast–southwest (NE–SW) and NNE–SSW, with minor northwest–southeast (NW–SE) directions. Two depth sources are delineated in the area: a zone of shallow-seated basement, which ranges from 0.42 km to 1.5 km, and a zone of deep-seated basement, which ranges from 1.91 km to 3.50 km. Qualitative interpretation of the total magnetic intensity (TMI) map and the residual intensity map reveals that the magnetic intensities range from 7500 to 8460 nanotesla (nT) and from -220 to 240 nT, respectively. The depths to the centroid and to the top of the magnetic causative bodies range from 9.00 to 17.10 km and from 0.4 to 3.10 km, respectively. Juxtaposing the topographical and core drilling data reveals that the oolitic iron ore level follows the topographical level, which implies that the topography of the area controls the configuration of the iron ore deposit level. All these deductions are made considering the geology of the area.

PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0261533
Seung-Whan Choi

This replication underlines the importance of outlier diagnostics since many researchers have long neglected influential observations in OLS regression analysis. In his article, entitled “Primary Resources, Secondary Labor,” Shin finds that advanced democracies with increased natural resource wealth, particularly from oil and natural gas production, are more likely to restrict low-skill immigration policy. By performing outlier diagnostics, this replication shows that Shin’s findings are a statistical artifact. When one outlying country, Norway, is removed from the sample data, I observe almost no significant and negative relationship between oil wealth and immigration policy. When two outlying countries are excluded, the effect of oil wealth completely disappears. Robust regression analysis, a widely used remedial method for outlier problems, confirms the results of my outlier diagnostics.
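One common outlier diagnostic of the kind this abstract describes is Cook's distance, which combines each observation's residual with its leverage. The sketch below computes it for a simple one-predictor OLS fit. It is an illustration only: the abstract does not say which diagnostics were used, the data are hypothetical and unrelated to the replication's country sample, and `cooks_distances` is a made-up name.

```python
def cooks_distances(xs, ys):
    """Cook's distance for each observation in a simple OLS fit y = a + b*x."""
    n = len(xs)
    p = 2  # number of estimated coefficients (intercept and slope)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate
    leverages = [1.0 / n + (x - x_bar) ** 2 / sxx for x in xs]
    return [
        (e * e) / (p * s2) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]

# Hypothetical data: five points near a line plus one high-leverage point
# that pulls the fitted slope down.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]

distances = cooks_distances(xs, ys)
most_influential = max(range(len(distances)), key=distances.__getitem__)
```

A frequently cited rule of thumb flags observations whose Cook's distance exceeds 1. Refitting without the flagged observation and comparing coefficients is the step that reveals whether a finding depends on it.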

Owner ◽  
2022 ◽  
Vol 6 (1) ◽  
pp. 747-758
Hantono Hantono ◽  
Riko Fridolend Sianturi

This research aims to examine the influence of Tax Knowledge and Tax Action on Tax Compliance. Data were collected by distributing questionnaires, and the data analysis method used is inferential statistics (also called inductive statistics or probability statistics), a statistical technique used to analyze sample data whose results are then applied to populations (Sugiyono in Kalnadi 2013). In accordance with the hypotheses formulated, the inferential statistical analysis in this study was carried out using SmartPLS (Partial Least Squares) software, starting from the measurement model (outer model). The results show that the calculated t value for Tax Knowledge (X1) is smaller than the table value and that the sig t value for Tax Knowledge (X1), 0.124, is greater than alpha (0.05). Based on these results, H0 is accepted and H1 is rejected for Tax Knowledge (X1). Thus, taken partially, Tax Knowledge (X1) has no positive or significant effect on Tax Compliance (Y), indicating that Tax Knowledge (X1) does not have a positive impact on improving Tax Compliance (Y). The calculated t value for Tax Sanctions (X2), 2.759, is greater than the table value, and the sig t value for Tax Sanctions (X2), 0.007, is smaller than alpha (0.05). Based on these results, H0 is rejected and H1 is accepted. Thus, taken partially, Tax Sanctions (X2) have a positive and significant effect on Tax Compliance (Y), meaning that tax sanctions (X2) have a real impact on improving tax compliance (Y).
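The decision rule applied to both predictors above amounts to comparing each significance value with alpha. A minimal sketch, using the two significance values reported in the abstract (the function name `decide` is hypothetical):

```python
def decide(p_value, alpha=0.05):
    """Apply the usual significance test decision rule."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Significance values reported in the abstract.
tax_knowledge = decide(0.124)  # X1
tax_sanctions = decide(0.007)  # X2
```

The wording "fail to reject H0" is used here instead of "accept H0" because a non-significant result does not demonstrate that the effect is absent.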

2022 ◽  
Vol 14 (2) ◽  
pp. 798
Snezhana Gocheva-Ilieva ◽  
Atanas Ivanov ◽  
Maya Stoimenova-Minova

A novel framework for stacked regression based on machine learning was developed to predict the daily average concentrations of particulate matter (PM10), one of Bulgaria’s primary health concerns. The measurements of nine meteorological parameters were introduced as independent variables. The goal was to carefully study a limited number of initial predictors and extract stochastic information from them to build an extended set of data that allowed the creation of highly efficient predictive models. Four base models using random forest, CART ensemble and bagging, and their rotation variants, were built and evaluated. The heterogeneity of these base models was achieved by introducing five types of diversities, including a new simplified selective ensemble algorithm. The predictions from the four base models were then used as predictors in multivariate adaptive regression splines (MARS) models. All models were statistically tested using out-of-bag or with 5-fold and 10-fold cross-validation. In addition, a variable importance analysis was conducted. The proposed framework was used for short-term forecasting of out-of-sample data for seven days. It was shown that the stacked models outperformed all single base models. An index of agreement IA = 0.986 and a coefficient of determination of about 95% were achieved.
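The two goodness-of-fit measures quoted at the end of this abstract can be computed as in the sketch below. It assumes the index of agreement is Willmott's original form, IA = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)²; the abstract does not state which variant was used. The values are toy numbers, not PM10 measurements.

```python
def index_of_agreement(observed, predicted):
    """Willmott's index of agreement (original squared form)."""
    o_bar = sum(observed) / len(observed)
    numerator = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    denominator = sum(
        (abs(p - o_bar) + abs(o - o_bar)) ** 2
        for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SSE / SST."""
    o_bar = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - o_bar) ** 2 for o in observed)
    return 1.0 - sse / sst

# Toy values for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]

ia = index_of_agreement(observed, predicted)
r2 = r_squared(observed, predicted)
```

In this form IA is bounded between 0 and 1 and equals 1 only for a perfect prediction, whereas the coefficient of determination computed this way can become negative for a poor model, so the two numbers are not interchangeable.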

2022 ◽  
Vol 27 ◽  
pp. 93-98
Dheny Jatmiko ◽  
Eddy Wahyudi ◽  
Harjo Seputro ◽  
Aris Heri Andriawan ◽  
Eko April Ariyanto ◽  

This research uses a quantitative approach, in which the indicators of the research variables are measured in order to obtain an overview of the relationships between these variables. Quantitative research is used to examine a population or sample. Data are collected using measuring instruments and then analyzed statistically. The Universitas 17 Agustus 1945 Surabaya runs the Independent Student Exchange program by accepting 32 inbound students and sending 167 outbound students throughout Indonesia outside Java, where the archipelago module is a compulsory subject offered by the Ministry of Education and Research and Technology. In this activity, several recipient universities provided only online learning, so students could not directly explore the knowledge gained during the PMM program. However, this did not dampen the enthusiasm of students to continue learning about the diversity that exists in Indonesia.

2022 ◽  
Vol 18 (1) ◽  
pp. 192-204
Cindy Septia Pratiwi ◽  
Agus Purnomo Sidi

This research aimed to determine the influence of product quality, price and marketing influencers on the purchasing decision for Scarlett Body Whitening in East Java. A questionnaire was used to collect data from Scarlett Body Whitening consumers in East Java. Since no valid data on the number of consumers were available, the research used the Roscoe method to draw the sample. The data were analyzed using multiple linear regression. Product quality and price have a positive and significant effect on purchasing decisions, whereas the marketing influencer had no significant effect on the purchase decision for Scarlett Body Whitening. Further research is needed to establish whether marketing influencers affect the purchase decision. Keywords: product quality, price, marketing influencer, buying decision

2022 ◽  
Vol 2022 ◽  
pp. 1-12
Xuezhong Fu

In order to improve the effect of financial data classification and extract effective information from financial data, this paper improves the data mining algorithm, uses linear combination of principal components to represent missing variables, and performs dimensionality reduction processing on multidimensional data. In order to achieve the standardization of sample data, this paper standardizes the data and combines statistical methods to build an intelligent financial data processing model. In addition, starting from the actual situation, this paper proposes the artificial intelligence classification and statistical methods of financial data in smart cities and designs data simulation experiments to conduct experimental analysis on the methods proposed in this paper. From the experimental results, the artificial intelligence classification and statistical method of financial data in smart cities proposed in this paper can play an important role in the statistical analysis of financial data.
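The standardization step mentioned in this abstract is commonly done with z-scores, which rescale each variable to zero mean and unit standard deviation before dimensionality reduction. The sketch below assumes that is the method meant; the abstract does not specify it, and the function name and values are hypothetical.

```python
from statistics import fmean, pstdev

def standardize(column):
    """Return z-scores (population standard deviation) for one variable."""
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no variation; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mu) / sigma for value in column]

# Hypothetical values of one financial indicator.
z_scores = standardize([10.0, 20.0, 30.0])
```

Standardizing first matters for principal components in particular, because otherwise variables measured on large scales dominate the leading components.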
