Indonesian Journal of Statistics and Its Applications
Latest Publications


TOTAL DOCUMENTS

129
(FIVE YEARS 110)

H-INDEX

2
(FIVE YEARS 1)

Published By Institut Pertanian Bogor

2599-0802

2021 ◽  
Vol 5 (2) ◽  
pp. 333-342
Author(s):  
Dhiar Niken Larasati ◽  
Usman Bustaman ◽  
Setia Pramana

The COVID-19 outbreak is not only a health crisis but also a social and economic crisis all over the world. In Indonesia, the outbreak has shaken almost all business sectors; however, it seems to bring a silver lining for the e-commerce sector, since the pandemic has fostered online shopping habits. During the pandemic, information on the impact of COVID-19 on the Indonesian economy needs to be updated regularly so that it can support quick policymaking. Big data therefore plays an important role in providing this information relatively fast. This paper aims to describe how big data, i.e., marketplace data, can be used to gauge the impact of the COVID-19 outbreak on micro and small retailers in Indonesia. The dataset was collected regularly from a marketplace website in Indonesia from January to June 2020. To see the change in sales during the COVID-19 period, sales before and after the implementation of the social distancing policy are compared. The results show that the online marketplace in Indonesia is dominated by micro retailers, based on the number of products sold in the marketplace. The total revenue of micro retailers increased significantly during the pandemic, whereas for medium retailers the increase in total revenue was lower than that of micro retailers. This indicates a positive sign for the growth of micro retailers in the online marketplace.


2021 ◽  
Vol 5 (2) ◽  
pp. 369-376
Author(s):  
Said Al Afghani ◽  
Widhera Yoza Mahana Putra

There are several algorithms for grouping data. Grouping data, also known as clustering, helps solve a range of problems, especially in business. In this note, we modify the distance-based clustering algorithm that underlies K-means, which uses the Euclidean distance. The Manhattan, Mahalanobis-Euclidean, and Chebyshev distances are used to modify the K-means algorithm. We compare the clustering results in terms of accuracy; the Mahalanobis-Euclidean distance gives the best accuracy on our experimental data. Some further results are also given in this note.
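The idea of swapping the distance measure inside K-means can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Mahalanobis-Euclidean variant is omitted because the abstract does not define it, and the centroid update keeps the arithmetic mean for every distance, which is a simplifying assumption.

```python
import math


def euclidean(a, b):
    """Straight-line distance, the default in standard K-means."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def kmeans(points, initial_centroids, distance, n_iter=10):
    """Lloyd-style K-means with a pluggable distance function.

    Assumption: centroids are updated as coordinate-wise means
    regardless of the distance used in the assignment step.
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: distance(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid from its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids
```

Calling `kmeans(points, initial_centroids, manhattan)` or `kmeans(points, initial_centroids, chebyshev)` on the same data makes it possible to compare the resulting cluster labels across distance measures.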


2021 ◽  
Vol 5 (2) ◽  
pp. 377-395
Author(s):  
Iqbal Hanif ◽  
Regita Fachri Septiani

Rating is one of the most frequently used metrics in the television industry to evaluate television programs or channels. This research is an attempt to develop a prediction model of television program ratings using rating data gathered from UseeTV (an internet-based television service from Telkom Indonesia). Two machine learning methods (Random Forest and Extreme Gradient Boosting) were trained on rating data from 20 television programs collected from January 2018 to August 2019 (training dataset) and evaluated using September 2019 rating data (testing dataset). The results show that Random Forest gives better results than Extreme Gradient Boosting based on three evaluation metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). On the training dataset, Random Forest produced lower RMSE and MAE scores than Extreme Gradient Boosting for all programs, while on the testing dataset, Random Forest produced lower RMSE and MAE scores for 16 programs. According to the MAPE score, Random Forest produced more good-quality predictions (4 programs in the training dataset, 16 programs in the testing dataset) than Extreme Gradient Boosting (1 program in the training dataset, 12 programs in the testing dataset).
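The three evaluation metrics named above have standard definitions, sketched below in plain Python. The function names are illustrative, and the MAPE version assumes that no actual value is zero.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is non-zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the size of the actual rating, which is why the three can rank models differently.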


2021 ◽  
Vol 5 (2) ◽  
pp. 396-404
Author(s):  
N Cahyani ◽  
Sinta Septi Pangastuti ◽  
K Fithriasari ◽  
Irhamah Irhamah ◽  
N Iriawan

A neural network is a series of algorithms that endeavours to recognize underlying relationships in a set of data through processes that mimic the way human brains operate. In classification, the fit of the model depends on several factors, such as the number of hidden nodes, the choice of relevant input variables, and the selection of optimal connection weights. One popular method for selecting optimal connection weights is the Genetic Algorithm (GA), whose basic concept is to iterate a process modeled on Darwinian evolution. This research presents the Backpropagation Neural Network (BPNN) and the combined method of BPNN with GA, where GA is used to initialize and optimize the connection weights of BPNN. Based on accuracy, the BPNN method combined with GA provides better classification, with an accuracy of 90.51%, in the case of Bidikmisi Scholarship classification in East Java.


2021 ◽  
Vol 5 (2) ◽  
pp. 355-368
Author(s):  
Nadya Dwi Muchisha ◽  
Novian Tamara ◽  
Andriansyah Andriansyah ◽  
Agus M Soleh

GDP is important to monitor in real time because of its usefulness for policy making. We built and compared ML models to forecast Indonesia's GDP growth in real time. We used 18 variables consisting of quarterly macroeconomic and financial market statistics. We evaluated the performance of six popular ML algorithms, namely Random Forest, LASSO, Ridge, Elastic Net, Neural Networks, and Support Vector Machines, in real-time forecasting of GDP growth over the period 2013:Q3 to 2019:Q4. We used the RMSE, MAD, and Pearson correlation coefficient as measures of forecast accuracy. The results showed that all of these models outperformed the AR(1) benchmark. The best-performing individual model was Random Forest. To obtain more accurate forecasts, we ran forecast combinations using equal weighting and lasso regression. The best model was obtained from the forecast combination using lasso regression with selected ML models, namely Random Forest, Ridge, Support Vector Machine, and Neural Network.
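Of the two combination schemes mentioned, equal weighting is simple enough to sketch directly: the combined forecast for each period is the plain average of the individual model forecasts. The function name and the dictionary layout below are assumptions made for illustration; the lasso-based combination is not shown.

```python
def combine_equal_weight(forecasts):
    """Average the forecasts of several models period by period.

    `forecasts` maps a model name to its list of forecasts; all
    lists must cover the same periods in the same order.
    """
    series = list(forecasts.values())
    if not series:
        raise ValueError("at least one model forecast is required")
    n_periods = len(series[0])
    if any(len(s) != n_periods for s in series):
        raise ValueError("all forecast series must have the same length")
    # zip(*series) groups the forecasts of all models for each period.
    return [sum(values) / len(series) for values in zip(*series)]
```

A lasso-based combination would instead regress the observed values on the individual forecasts with an L1 penalty, so that some models can receive zero weight.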


2021 ◽  
Vol 5 (2) ◽  
pp. 405-414
Author(s):  
Hasna Afifah Rusyda ◽  
Fajar Indrayatna ◽  
Lienda Noviyanti

This paper discusses the risk estimation of a portfolio based on value at risk (VaR) using a copula-based asymmetric Glosten-Jagannathan-Runkle Generalized Autoregressive Conditional Heteroskedasticity (GJR-GARCH) model. The dependence structure among the variables involves non-linear correlation, which leads to inaccurate VaR estimation, so we use copula functions to model the joint probability of large market movements. The data follow a generalized extreme value (GEV) distribution. We therefore use the block maxima approach, which fits an extreme value distribution as the tail distribution, to compute VaR. The results show that VaR can estimate the risk of the portfolio return reasonably well because the model captures the properties of the data: GJR-GARCH accommodates the volatility of the data, the copula captures the dependence between stocks, and block maxima accommodate the extreme tail behavior of the data.
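The asymmetry in GJR-GARCH comes from an extra term that is active only after a negative shock. A minimal sketch of the standard GJR-GARCH(1,1) conditional variance recursion is given below; the parameter names follow the common textbook notation, and the starting variance is an input chosen by the caller rather than anything taken from the paper.

```python
def gjr_garch_variance(residuals, omega, alpha, gamma, beta, initial_variance):
    """Conditional variances from a GJR-GARCH(1,1) recursion.

    sigma2[t+1] = omega
                  + (alpha + gamma * I[e_t < 0]) * e_t**2
                  + beta * sigma2[t]

    Returns a list with len(residuals) + 1 entries; the first entry
    is `initial_variance`.
    """
    variances = [initial_variance]
    for e in residuals:
        # The leverage term gamma applies only to negative residuals.
        leverage = gamma if e < 0 else 0.0
        next_variance = omega + (alpha + leverage) * e ** 2 + beta * variances[-1]
        variances.append(next_variance)
    return variances
```

With a positive `gamma`, a negative return raises the next period's variance more than a positive return of the same size, which is the asymmetric volatility effect the model is designed to capture.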


2021 ◽  
Vol 5 (2) ◽  
pp. 304-313
Author(s):  
Tigor Nirman Simanjuntak ◽  
Setia Pramana

This study analyzes the trend of sentiment in tweets about Covid-19 in Indonesia from overseas Twitter accounts, from a big data perspective. The data was obtained from Twitter for April 2020, using the query "Indonesian Corona Virus" on English-language tweets from foreign user accounts. The tweets were retrieved by crawling their text through Twitter's API (Application Programming Interface) using the Python programming language. Twitter was chosen because information spreads quickly and easily through status updates from and among user accounts. The number of tweets obtained was 8,740 in text format, with a total engagement of 217,316. The data was sorted from the tweets with the largest engagement to the smallest, then cleaned of unnecessary fonts and symbols as well as typos and abbreviations. The sentiment classification was carried out with analytical tools, using text mining to extract information, into positive, negative, and neutral polarity. To sharpen the analysis, only tweets from the largest engagement down to 100 engagements were selected from the cleaned data; these were then grouped into 30 sub-topics for analysis. Interestingly, most tweets and sub-topics were dominated by negative sentiment, and some unexpected sub-topics were discussed by many users.


2021 ◽  
Vol 5 (2) ◽  
pp. 284-303
Author(s):  
J A Putri ◽  
Suhartono Suhartono ◽  
H Prabowo ◽  
N A Salehah ◽  
D D Prastyo ◽  
...  

Most research on currency inflow and outflow in Indonesia has shown that these data contain both linear and nonlinear patterns with calendar variation effects. The goal of this research is to propose a hybrid model combining ARIMAX and a Deep Neural Network (DNN), known as hybrid ARIMAX-DNN, to improve forecast accuracy in currency prediction in East Java, Indonesia. ARIMAX is a class of classical time series models that can accurately handle linear patterns and calendar variation effects, whereas DNN is a machine learning method that is powerful in handling nonlinear patterns. Data on 32 denominations of currency inflow and outflow in East Java are used as case studies. The best model was selected based on the smallest values of RMSE and sMAPE on the testing dataset. The results showed that the hybrid ARIMAX-DNN model improved forecast accuracy and outperformed the individual ARIMAX and DNN models for 26 denominations of currency inflow and outflow. Hence, it can be concluded that hybrids of classical time series and machine learning methods tend to yield more accurate forecasts than either kind of individual model.
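The sMAPE criterion used for model selection has more than one definition in the literature. The sketch below uses one common form, in which each absolute error is divided by the average of the absolute actual and forecast values; whether the paper uses this exact variant is an assumption, and the function name is illustrative.

```python
def smape(actual, predicted):
    """Symmetric Mean Absolute Percentage Error, in percent.

    Uses the form 100/n * sum(2*|a - p| / (|a| + |p|)).
    A pair where both values are zero contributes zero error.
    """
    total = 0.0
    for a, p in zip(actual, predicted):
        denominator = abs(a) + abs(p)
        if denominator != 0:
            total += 2.0 * abs(a - p) / denominator
    return 100.0 * total / len(actual)
```

Under this definition the score ranges from 0 (perfect forecast) to 200, and lower values indicate better forecasts.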


2021 ◽  
Vol 5 (2) ◽  
pp. 243-259
Author(s):  
Syalam Ali Wira Dinata ◽  
Muhammad Azka ◽  
Primadina Hasanah ◽  
Suhartono Suhartono ◽  
Moh Danil Hendry Gamal

This paper investigates a case study on short-term forecasting for East Kalimantan, with emphasis on special days such as public holidays. A time series of electricity load demand recorded at hourly intervals contains more than one seasonal pattern, so there is great appeal in a time series modelling method that is able to capture triple seasonality. The triple SARIMA model has been adapted for this purpose and is competitive for modelling load. Using the least squares method to estimate the coefficients of a triple SARIMA model, followed by model building, checking of model assumptions, and comparison of model criteria, we propose and demonstrate the triple Seasonal Autoregressive Integrated Moving Average model with AIC 290631.9 and SBC 290674.2 as the best model for this study. The triple seasonal ARIMA is one alternative strategy for producing accurate forecasts of Kalimantan electricity load data for planning, operation, maintenance, and market-related activities.


2021 ◽  
Vol 5 (2) ◽  
pp. 273-283
Author(s):  
Salsabila Basalamah ◽  
Edy Widodo

Response Surface Method (RSM) is a collection of statistical techniques, in the form of experiments and regression, together with mathematical techniques that are useful for developing, improving, and optimizing processes. In general, the model in RSM is estimated by linear regression with Ordinary Least Squares (OLS) estimation. However, OLS estimation performs poorly in the presence of outliers, so determining the RSM model requires a strong and resistant estimation method, namely robust regression. One estimation method in robust regression is Method of Moment (MM) estimation. This study aims to compare OLS estimation and MM estimation in finding the optimal point of the response in this case study. The estimation models are compared using MSE and adjusted R^2. MM estimation gives better results for the optimal response in this case study.

