Indonesian Journal of Statistics and Its Applications

2021 ◽

Vol 5 (2) ◽

pp. 333-342

Author(s):

Dhiar Niken Larasati ◽

Usman Bustaman ◽

Setia Pramana

Keyword(s):

Big Data ◽

Online Shopping ◽

Positive Sign ◽

Total Revenue ◽

Online Marketplace ◽

Before And After ◽

Almost All ◽

The Impact ◽

Business Sectors ◽

Silver Lining

The COVID-19 outbreak is not only a health crisis but also a social and economic crisis all over the world. In Indonesia, the outbreak has shaken almost all business sectors; however, it seems to bring a silver lining for the e-commerce sector, since the pandemic has fostered online shopping habits. During the pandemic, information on the impact of COVID-19 on the Indonesian economy needs to be updated regularly to support quick policymaking. Big data therefore plays an important role in providing that information relatively fast. This paper aims to describe how big data, specifically marketplace data, can be used to assess the impact of the COVID-19 outbreak on micro and small retailers in Indonesia. The dataset was collected regularly from a marketplace website in Indonesia from January to June 2020. To see how sales changed during the COVID-19 period, sales before and after the implementation of the social distancing policy are compared. The results show that the online marketplace in Indonesia is dominated by micro retailers, based on the number of products sold in the marketplace. The total revenue of micro retailers increased significantly during the pandemic, whereas for medium retailers the increase in total revenue was lower than that of micro retailers. This indicates a positive sign for the growth of micro retailers in the online marketplace.


Clustering with Euclidean Distance, Manhattan - Distance, Mahalanobis - Euclidean Distance, and Chebyshev Distance with Their Accuracy

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p369-376 ◽

2021 ◽

Vol 5 (2) ◽

pp. 369-376

Author(s):

Said Al Afghani ◽

Widhera Yoza Mahana Putra

Keyword(s):

Euclidean Distance ◽

Clustering Algorithm ◽

Manhattan Distance ◽

Experiment Data ◽

Data Grouping

There are several algorithms for grouping data. Grouping data is also known as clustering, and clustering is useful for solving a range of problems, especially in business. In this note, we modify the clustering algorithm based on the distance principle that underlies the K-means algorithm (Euclidean distance). Manhattan, Mahalanobis-Euclidean, and Chebyshev distances are used to modify the K-means algorithm. We compare the clustering results in terms of their accuracy and find that the Mahalanobis-Euclidean distance gives the best accuracy on our experiment data; some further results are also given in this note.
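To make the distance-swapping idea concrete, the following is a minimal sketch of the assignment step of a K-means-style algorithm with a pluggable distance function. It is an illustration only, not the authors' implementation: the function names, the toy points, and the centroids are assumptions, and the paper's Mahalanobis-Euclidean variant is omitted because the abstract does not define it.

```python
import math

def euclidean(a, b):
    """Straight-line distance (the default in K-means)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def assign(points, centroids, dist):
    """Return, for each point, the index of its nearest centroid under dist."""
    return [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]

# Hypothetical toy data, not taken from the paper.
points = [(0.0, 0.0), (1.0, 0.5), (9.0, 9.0), (8.0, 10.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]
labels = {
    d.__name__: assign(points, centroids, d)
    for d in (euclidean, manhattan, chebyshev)
}
```

Only the assignment step is shown. How the centroids are updated under each distance is a separate design choice that the abstract does not specify.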


Ensemble Learning For Television Program Rating Prediction

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p377-395 ◽

2021 ◽

Vol 5 (2) ◽

pp. 377-395

Author(s):

Iqbal Hanif ◽

Regita Fachri Septiani

Keyword(s):

Random Forest ◽

Television Program ◽

Training Dataset ◽

Gradient Boosting ◽

Percentage Error ◽

Television Programs ◽

Rating Data ◽

Testing Dataset ◽

Extreme Gradient Boosting ◽

Boosting Method

Rating is one of the most frequently used metrics in the television industry to evaluate television programs or channels. This research is an attempt to develop a prediction model of television program ratings using rating data gathered from UseeTV (an internet-based television service from Telkom Indonesia). Two machine learning methods (Random Forest and Extreme Gradient Boosting) were trained on a set of rating data from 20 television programs collected from January 2018 to August 2019 (training dataset) and evaluated using September 2019 rating data (testing dataset). The results show that Random Forest performs better than Extreme Gradient Boosting based on three evaluation metrics: Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). On the training dataset, Random Forest produced lower RMSE and MAE scores than Extreme Gradient Boosting for all programs, while on the testing dataset, Random Forest produced lower RMSE and MAE scores for 16 programs. According to the MAPE score, Random Forest produced good-quality predictions for more programs (4 programs in the training dataset, 16 programs in the testing dataset) than Extreme Gradient Boosting (1 program in the training dataset, 12 programs in the testing dataset).
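The three evaluation metrics named in the abstract can be sketched as follows. This uses the standard textbook definitions, with MAPE expressed in percent; the sample values are hypothetical and are not the paper's rating data.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when any actual value is zero.
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)

# Hypothetical ratings for illustration only.
actual = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]
scores = {
    "rmse": rmse(actual, predicted),
    "mae": mae(actual, predicted),
    "mape": mape(actual, predicted),
}
```

Lower values are better for all three metrics. RMSE penalizes large errors more heavily than MAE, and MAPE is scale-free, which makes it convenient for comparing programs with very different rating levels.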


Classification of Bidikmisi Scholarship Acceptance using Neural Network Based on Hybrid Method of Genetic Algorithm

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p396-404 ◽

2021 ◽

Vol 5 (2) ◽

pp. 396-404

Author(s):

N Cahyani ◽

Sinta Septi Pangastuti ◽

K Fithriasari ◽

Irhamah Irhamah ◽

N Iriawan

Keyword(s):

Neural Network ◽

Genetic Algorithm ◽

Optimal Number ◽

Connection Weight ◽

The Neural Network ◽

Network Method ◽

Input Variables ◽

Human Brains ◽

Selection Of

A neural network is a series of algorithms that endeavours to recognize underlying relationships in a set of data through processes that mimic the way human brains operate. In the case of classification, this method can provide a well-fitting model through various factors, such as the choice of the optimal number of hidden nodes, the choice of relevant input variables, and the selection of optimal connection weights. One popular method for selecting optimal connection weights is the Genetic Algorithm (GA), whose basic concept is to iteratively mimic Darwinian evolution. This research presents the Backpropagation Neural Network (BPNN) method and a combined method of BPNN with GA, where GA is used to initialize and optimize the connection weights of BPNN. Based on accuracy, the BPNN method combined with GA provides better classification, at 90.51%, in the case of Bidikmisi Scholarship classification in East Java.
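As a rough illustration of how a genetic algorithm can search for connection weights, here is a minimal sketch that evolves the two weights of a single linear unit. It is not the BPNN-GA hybrid from the paper: the network, the selection, crossover, and mutation settings, and the toy data are all assumptions chosen to keep the example short.

```python
import random

def loss(weights, data):
    """Mean squared error of a single linear unit y = w0 + w1 * x."""
    w0, w1 = weights
    return sum((w0 + w1 * x - y) ** 2 for x, y in data) / len(data)

def evolve(data, pop_size=20, generations=50, seed=0):
    """Evolve weight vectors; return (best weights, initial loss, final loss)."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(pop_size)
    ]
    initial_best = min(loss(w, data) for w in population)
    for _ in range(generations):
        population.sort(key=lambda w: loss(w, data))
        parents = population[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [g + rng.gauss(0, 0.1) for g in child]    # Gaussian mutation
            children.append(child)
        population = parents + children  # parents survive unchanged (elitism)
    best = min(population, key=lambda w: loss(w, data))
    return best, initial_best, loss(best, data)

# Hypothetical data generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in range(5)]
best_weights, initial_loss, final_loss = evolve(data)
```

Because the fitter half of each generation is carried over unchanged, the best loss can never get worse from one generation to the next. In a hybrid scheme, weights found this way could serve as the starting point for backpropagation.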


Nowcasting Indonesia’s GDP Growth Using Machine Learning Algorithms

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p355-368 ◽

2021 ◽

Vol 5 (2) ◽

pp. 355-368

Author(s):

Nadya Dwi Muchisha ◽

Novian Tamara ◽

Andriansyah Andriansyah ◽

Agus M Soleh

Keyword(s):

Random Forest ◽

Real Time ◽

Pearson Correlation ◽

Forecast Accuracy ◽

Machine Learning Algorithms ◽

Support Vector ◽

Gdp Growth ◽

Forecast Combination ◽

Lasso Regression ◽

The Individual

Monitoring GDP in real time is very important because of its usefulness for policy making. We built and compared ML models to forecast Indonesia's GDP growth in real time. We used 18 variables that consist of a number of quarterly macroeconomic and financial market statistics. We evaluated the performance of six popular ML algorithms, namely Random Forest, LASSO, Ridge, Elastic Net, Neural Networks, and Support Vector Machines, in producing real-time forecasts of GDP growth over the 2013:Q3 to 2019:Q4 period. We used the RMSE, MAD, and Pearson correlation coefficient as measures of forecast accuracy. The results showed that all these models outperformed the AR(1) benchmark. The individual model with the best performance is Random Forest. To obtain more accurate forecasts, we ran forecast combinations using equal weighting and lasso regression. The best model was obtained from the forecast combination using lasso regression with selected ML models, namely Random Forest, Ridge, Support Vector Machine, and Neural Network.
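The simpler of the two combination schemes, equal weighting, can be sketched as below. The model names and forecast values are hypothetical placeholders, not results from the paper, and the lasso-based combination is not shown because it needs a regression solver.

```python
def equal_weight_combination(forecasts):
    """Average the individual model forecasts period by period.

    forecasts maps a model name to a list of forecasts, one per period;
    all lists are assumed to have the same length.
    """
    n_models = len(forecasts)
    return [sum(values) / n_models for values in zip(*forecasts.values())]

# Hypothetical quarterly GDP growth forecasts (percent) from two models.
forecasts = {
    "model_a": [5.0, 5.2, 4.8],
    "model_b": [5.4, 5.0, 5.2],
}
combined = equal_weight_combination(forecasts)
```

Equal weighting needs no estimation, which makes it a common baseline. A regression-based combination such as lasso instead learns the weights from past forecast performance and can drop weak models entirely.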


Estimation of Value at Risk by Using GJR-GARCH Copula Based on Block Maxima

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p405-414 ◽

2021 ◽

Vol 5 (2) ◽

pp. 405-414

Author(s):

Hasna Afifah Rusyda ◽

Fajar Indrayatna ◽

Lienda Noviyanti

Keyword(s):

At Risk ◽

Value At Risk ◽

Risk Estimation ◽

Joint Probability ◽

Extreme Value Distribution ◽

Copula Functions ◽

Tail Distribution ◽

Block Maxima ◽

Dependent Model ◽

Portfolio Return

This paper discusses the risk estimation of a portfolio based on value at risk (VaR) using a copula-based asymmetric Glosten-Jagannathan-Runkle Generalized Autoregressive Conditional Heteroskedasticity (GJR-GARCH) model. Non-linear dependence among the variables leads to inaccurate VaR estimation, so we use copula functions to model the joint probability of large market movements. The data are GEV distributed; therefore, we use the Block Maxima approach, which consists of fitting an extreme value distribution as the tail distribution, to compute VaR. The results show that VaR can estimate the risk of the portfolio return reasonably well because the model captures the properties of the data. Data volatility can be accommodated by GJR-GARCH, the copula can capture dependence between stocks, and Block Maxima can accommodate the extreme tail behavior of the data.
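The Block Maxima step can be sketched in two parts: extracting the maximum of each block of a loss series, and reading a quantile off a generalized extreme value (GEV) distribution. The sketch below uses the standard GEV distribution function and its inverse. The parameter values are hypothetical rather than fitted, and the way the paper turns this quantile into a portfolio VaR (together with GJR-GARCH and the copula) is not described in the abstract.

```python
import math

def block_maxima(series, block_size):
    """Maximum of each complete block; a trailing partial block is dropped."""
    return [
        max(series[i:i + block_size])
        for i in range(0, len(series) - block_size + 1, block_size)
    ]

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma, shape xi.

    Valid where 1 + xi * (x - mu) / sigma > 0.
    """
    if xi == 0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    return math.exp(-(1.0 + xi * (x - mu) / sigma) ** (-1.0 / xi))

def gev_quantile(p, mu, sigma, xi):
    """Inverse of gev_cdf for 0 < p < 1."""
    if xi == 0:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1.0)

# Hypothetical loss series and GEV parameters, for illustration only.
losses = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 7.0]
maxima = block_maxima(losses, block_size=3)
q99 = gev_quantile(0.99, mu=0.0, sigma=1.0, xi=0.2)
```

In practice the three GEV parameters would be estimated from the block maxima, for example by maximum likelihood, before any quantile is computed.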


Sentiment Analysis on Overseas Tweets on the Impact of COVID-19 in Indonesia

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p304-313 ◽

2021 ◽

Vol 5 (2) ◽

pp. 304-313

Author(s):

Tigor Nirman Simanjuntak ◽

Setia Pramana

Keyword(s):

Big Data ◽

Application Programming Interface ◽

Text Format ◽

Python Programming Language ◽

Application Programming ◽

Python Programming ◽

Negative Sentiment ◽

Analytical Tools ◽

The Impact ◽

Programming Interface

This study aims to determine the trend of sentiment in tweets about COVID-19 in Indonesia posted by overseas Twitter accounts, from a big data perspective. The data were obtained from Twitter for April 2020, using the query "Indonesian Corona Virus" on English-language tweets from foreign user accounts. The data were retrieved by crawling tweet text through Twitter's API (Application Programming Interface) using the Python programming language. Twitter was chosen because information spreads very quickly and easily through status updates from and among user accounts. A total of 8,740 tweets were obtained in text format, with a total engagement of 217,316. The data were sorted from the tweets with the largest engagement to the smallest, then cleaned of unnecessary fonts and symbols as well as typos and abbreviations. The sentiment classification was carried out with analytical tools, extracting information through text mining, into positive, negative, and neutral polarity. To sharpen the analysis, only tweets ranging from the largest engagement down to 100 engagements were selected from the cleaned data; these were then grouped into 30 sub-topics for analysis. Interestingly, most tweets and sub-topics were dominated by negative sentiment, and some unexpected sub-topics were discussed by many users.
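The selection and classification steps described above can be sketched as follows. The abstract does not name the analytical tools used, so the tiny word lists, the scoring rule, and the sample tweets here are purely hypothetical stand-ins; only the "sort by engagement and keep tweets with at least 100 engagements" logic comes from the text.

```python
# Hypothetical mini-lexicons; a real analysis would use a full sentiment tool.
POSITIVE = {"recover", "support", "hope"}
NEGATIVE = {"death", "crisis", "fear"}

def polarity(text):
    """Classify text as positive, negative, or neutral by counting lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def top_engagement(tweets, minimum=100):
    """Keep tweets with at least `minimum` engagement, largest first."""
    kept = [t for t in tweets if t["engagement"] >= minimum]
    return sorted(kept, key=lambda t: t["engagement"], reverse=True)

# Invented sample tweets, not taken from the study's data.
tweets = [
    {"text": "Fear grows as crisis deepens", "engagement": 250},
    {"text": "Hope and support for patients", "engagement": 900},
    {"text": "New figures released today", "engagement": 120},
    {"text": "Crisis update", "engagement": 40},
]
selected = top_engagement(tweets)
labels = [polarity(t["text"]) for t in selected]
```

A lexicon count like this ignores negation and context, so it should be read as a sketch of the pipeline shape rather than of a production classifier.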


Forecasting Currency in East Java: Classical Time Series vs. Machine Learning

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p284-303 ◽

2021 ◽

Vol 5 (2) ◽

pp. 284-303

Author(s):

J A Putri ◽

Suhartono Suhartono ◽

H Prabowo ◽

N A Salehah ◽

D D Prastyo ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Forecast Accuracy ◽

Time Series Models ◽

Learning Methods ◽

Linear Pattern ◽

Machine Learning Methods ◽

Testing Dataset ◽

The Individual ◽

Nonlinear Pattern

Most research on currency inflow and outflow in Indonesia has shown that these data contain both linear and nonlinear patterns, together with a calendar variation effect. The goal of this research is to propose a hybrid model that combines ARIMAX and a Deep Neural Network (DNN), known as hybrid ARIMAX-DNN, to improve forecast accuracy in currency prediction in East Java, Indonesia. ARIMAX is a class of classical time series models that can accurately handle linear patterns and the calendar variation effect, whereas DNN is a machine learning method that is powerful for tackling nonlinear patterns. Data on 32 denominations of inflow and outflow currency in East Java are used as case studies. The best model was selected based on the smallest RMSE and sMAPE values on the testing dataset. The results showed that the hybrid ARIMAX-DNN model improved forecast accuracy and outperformed both individual models, ARIMAX and DNN, for 26 denominations of inflow and outflow currency. Hence, it can be concluded that hybrids of classical time series and machine learning methods tend to yield more accurate forecasts than individual models of either kind.
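Two pieces of this setup lend themselves to a short sketch: the sMAPE metric and the idea of adding a nonlinear correction to a linear forecast. Both are shown under stated assumptions. sMAPE has several variants in the literature and the abstract does not say which one was used; the version below divides by the mean of the absolute actual and forecast values. The additive combination is a common way to build such hybrids, but the abstract does not confirm that the paper combines ARIMAX and DNN this way.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent (one common variant).

    Each term is |a - f| divided by the mean of |a| and |f|; the metric is
    undefined when both the actual and the forecast value are zero.
    """
    terms = [
        abs(a - f) / ((abs(a) + abs(f)) / 2.0)
        for a, f in zip(actual, forecast)
    ]
    return 100.0 * sum(terms) / len(terms)

def hybrid_forecast(linear_part, residual_part):
    """Add a forecast of the residuals to the linear-model forecast."""
    return [l + r for l, r in zip(linear_part, residual_part)]

# Hypothetical numbers: a linear forecast and a predicted residual correction.
linear_part = [100.0, 120.0]
residual_part = [5.0, -10.0]
combined = hybrid_forecast(linear_part, residual_part)
error = smape([100.0], [110.0])
```

In an additive hybrid, the linear model would be fitted first and the nonlinear model trained on its residuals, so each component handles the pattern it is better suited to.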


Comparison of Short-Term Load Forecasting Based on Kalimantan Data

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p243-259 ◽

2021 ◽

Vol 5 (2) ◽

pp. 243-259

Author(s):

Syalam Ali Wira Dinata ◽

Muhammad Azka ◽

Primadina Hasanah ◽

Suhartono Suhartono ◽

Moh Danil Hendry Gamal

Keyword(s):

Time Series ◽

Model Building ◽

Least Squares Method ◽

Moving Average ◽

Seasonal Pattern ◽

Short Term ◽

Sarima Model ◽

East Kalimantan ◽

Moving Average Model ◽

Short Term Forecasting

This paper investigates a case study on short-term forecasting for East Kalimantan, with emphasis on special days, such as public holidays. A time series of electricity load demand recorded at hourly intervals contains more than one seasonal pattern, which makes a time series modelling method that can capture triple seasonality very attractive. The triple SARIMA model has been adapted for this purpose and is competitive for modelling load. Using the least squares method to estimate the coefficients of a triple SARIMA model, followed by model building, checking of model assumptions, and comparison of model criteria, we propose and demonstrate the triple Seasonal Autoregressive Integrated Moving Average model with AIC 290631.9 and SBC 290674.2 as the best model for this study. Triple seasonal ARIMA is one alternative strategy for producing accurate forecasts of Kalimantan electricity load data for planning, operation, maintenance, and market-related activities.
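One ingredient of a multi-seasonal ARIMA model is seasonal differencing applied once per seasonal period. The sketch below shows that step only. The default lags of 24, 168, and 8760 hours (daily, weekly, yearly) are an assumption for hourly data; the abstract does not state which three periods the authors used, and the toy series is invented.

```python
def seasonal_difference(series, lag):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def triple_seasonal_difference(series, lags=(24, 168, 8760)):
    """Apply seasonal differencing once for each lag, in order.

    The default lags are assumed daily, weekly, and yearly periods
    for hourly observations.
    """
    for lag in lags:
        series = seasonal_difference(series, lag)
    return series

# Toy hourly series with a pure daily cycle (three days of data).
daily_cycle = [hour % 24 for hour in range(72)]
deseasonalized = seasonal_difference(daily_cycle, 24)
```

Each differencing pass shortens the series by its lag, so differencing at all three default lags requires more than 24 + 168 + 8760 hourly observations.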


Response Surface Model with Comparison of OLS Estimation and MM Estimation

Indonesian Journal of Statistics and Its Applications ◽

10.29244/ijsa.v5i2p273-283 ◽

2021 ◽

Vol 5 (2) ◽

pp. 273-283

Author(s):

Salsabila Basalamah ◽

Edy Widodo

Keyword(s):

Response Surface ◽

Robust Regression ◽

Estimation Method ◽

Least Square ◽

Surface Model ◽

Method Of Moment ◽

Optimal Point ◽

Rsm Model ◽

Mm Estimation

Response Surface Method (RSM) is a collection of statistical techniques, combining experiments, regression, and mathematics, that is useful for developing, improving, and optimizing processes. In general, models in RSM are estimated by linear regression with Ordinary Least Square (OLS) estimation. However, OLS estimation performs poorly in the presence of outliers, so a robust and resistant estimation method, namely robust regression, is needed to determine the RSM model. One estimation method in robust regression is the Method of Moment (MM) estimation. This study aims to compare the OLS and MM estimation methods in order to obtain the optimal response point in this case study. The estimation models are compared using MSE and adjusted R^2. MM estimation gives better results for the optimal response in this case study.
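The sensitivity of OLS to outliers, and the benefit of downweighting them, can be illustrated with a small sketch. Note that this is not the MM estimator compared in the paper: it uses a simpler relative, Huber M-estimation fitted by iteratively reweighted least squares, on a straight line rather than a response surface. The tuning constant 1.345, the scale estimate, and the data are assumptions for illustration.

```python
import statistics

def weighted_line_fit(xs, ys, ws):
    """Weighted least squares for y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def huber_line_fit(xs, ys, c=1.345, iterations=20):
    """Huber M-estimate via iteratively reweighted least squares."""
    ws = [1.0] * len(xs)
    for _ in range(iterations):
        intercept, slope = weighted_line_fit(xs, ys, ws)
        residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = statistics.median(abs(r) for r in residuals) / 0.6745
        # Full weight for small residuals, reduced weight for large ones.
        ws = [
            1.0 if abs(r) <= c * scale else c * scale / abs(r)
            for r in residuals
        ]
    return weighted_line_fit(xs, ys, ws)

# Hypothetical data on the line y = 2x + 1 with one gross outlier at x = 9.
xs = [float(x) for x in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[9] += 50.0
ols_intercept, ols_slope = weighted_line_fit(xs, ys, [1.0] * len(xs))
robust_intercept, robust_slope = huber_line_fit(xs, ys)
```

With equal weights the single outlier pulls the OLS slope well above the true value of 2, while the reweighted fit moves back toward it.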


Indonesian Journal of Statistics and Its Applications
Latest Publications


Published By Institut Pertanian Bogor

Online Marketplace Data to Figure COVID-19 Impact on Micro and Small Retailers in Indonesia

Clustering with Euclidean Distance, Manhattan - Distance, Mahalanobis - Euclidean Distance, and Chebyshev Distance with Their Accuracy

Ensemble Learning For Television Program Rating Prediction

Classification of Bidikmisi Scholarship Acceptance using Neural Network Based on Hybrid Method of Genetic Algorithm

Nowcasting Indonesia’s GDP Growth Using Machine Learning Algorithms

Estimation of Value at Risk by Using GJR-GARCH Copula Based on Block Maxima

Sentiment Analysis on Overseas Tweets on the Impact of COVID-19 in Indonesia

Forecasting Currency in East Java: Classical Time Series vs. Machine Learning

Comparison of Short-Term Load Forecasting Based on Kalimantan Data

Response Surface Model with Comparison of OLS Estimation and MM Estimation
