Data Mining and Machine Learning

Author(s):  
Mohammed J. Zaki ◽  
Wagner Meira, Jr
2019 ◽  
Vol 12 (3) ◽  
pp. 171-179 ◽  
Author(s):  
Sachin Gupta ◽  
Anurag Saxena

Background: The bullwhip effect is the amplification of variability in production or procurement relative to a smaller increase in the variability of demand or sales. It is an encumbrance to supply chain optimization because it causes inefficiencies in the supply chain. Operations and supply chain management consultants, managers and researchers have studied the causes of the dynamic nature of supply chain management and have listed shorter product life cycles, changes in technology, changes in consumer preference and globalization, to name a few. Most of the literature on the bullwhip effect is based on simulations and mathematical models. Exploring the bullwhip effect using machine learning is the novel approach of the present study. Methods: The present study explores the operational and financial variables affecting the bullwhip effect on the basis of secondary data. Data mining and machine learning techniques are used to explore the variables affecting the bullwhip effect in Indian sectors. The RapidMiner tool was used for data mining, and 10-fold cross-validation was performed. A Weka Alternating Decision Tree (w-ADT) was built after classification to help decision makers mitigate the bullwhip effect. Results: Of the 19 candidate variables affecting the bullwhip effect, 7 were selected as having the highest accuracy with minimum deviation. Conclusion: Classification using machine learning provides an effective tool for exploring the bullwhip effect in supply chain management.
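The definition in the Background can be made concrete with a short sketch. One common way to quantify the bullwhip effect is the ratio of order variance to demand variance, where a value above 1 suggests amplification. This is an illustrative assumption, not necessarily the measure used in the study; the function name `bullwhip_ratio` and the monthly figures are hypothetical.

```python
from statistics import variance


def bullwhip_ratio(orders, demand):
    """Return sample variance of orders divided by sample variance of demand.

    A ratio above 1 suggests that variability is amplified upstream.
    This measure is an assumption for illustration; the abstract does
    not state how the study quantified the effect.
    """
    demand_var = variance(demand)
    if demand_var == 0:
        raise ValueError("demand has zero variance; the ratio is undefined")
    return variance(orders) / demand_var


# Hypothetical monthly figures, invented for illustration only.
demand = [100, 102, 98, 101, 99, 100]   # units sold
orders = [100, 110, 85, 108, 92, 105]   # units ordered from the supplier

ratio = bullwhip_ratio(orders, demand)
print(f"bullwhip ratio: {ratio:.1f}")
```

With these invented numbers, orders fluctuate far more than demand, so the ratio is well above 1.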


2021 ◽  
Vol 1088 (1) ◽  
pp. 012035
Author(s):  
Mulyawan ◽  
Agus Bahtiar ◽  
Githera Dwilestari ◽  
Fadhil Muhammad Basysyar ◽  
Nana Suarna

2021 ◽  
pp. 097215092098485
Author(s):  
Sonika Gupta ◽  
Sushil Kumar Mehta

Data mining techniques have proven effective not only in detecting financial statement fraud but also in discovering other financial crimes, such as credit card fraud, loan and security fraud, corporate fraud, and bank and insurance fraud. In recent years, data mining classification techniques have been accepted as among the most credible methodologies for detecting symptoms of financial statement fraud by scanning the published financial statements of companies. The retrieved literature that uses data mining classification techniques can be broadly categorized by the type of technique applied: statistical techniques and machine learning techniques. The biggest challenge in executing the classification process lies in collecting the data sample of fraudulent companies and mapping it against a sample of non-fraudulent companies. In this article, a systematic literature review (SLR) of studies on financial statement fraud detection has been conducted. The review considers research articles published between 1995 and 2020. Further, a meta-analysis has been performed to establish how the mapping of fraudulent companies against non-fraudulent companies affects the classification methods, by comparing the overall classification accuracy reported in the literature. The retrieved literature indicates that a fraudulent sample can either be paired equally with a non-fraudulent sample (1:1 data mapping) or be mapped unequally using a 1:many ratio to increase the sample size proportionally. Based on the meta-analysis, it can be concluded that machine learning approaches, in comparison to statistical approaches, can achieve better classification accuracy, particularly when little sample data is available. High classification accuracy can be obtained even with a 1:1 mapped data set using machine learning classification approaches.
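The 1:1 and 1:many mappings described above can be sketched as follows. This is a minimal illustration, assuming controls are drawn at random; studies in this literature often match on attributes such as industry, size, or year, which this sketch does not do. The function name `build_matched_sample` and the company identifiers are hypothetical.

```python
import random


def build_matched_sample(fraud_ids, non_fraud_ids, ratio=1, seed=0):
    """Pair each fraudulent company with `ratio` non-fraudulent companies.

    ratio=1 gives a 1:1 mapping; ratio>1 gives a 1:many mapping.
    Controls are sampled at random without replacement (an assumption;
    real studies typically match on company characteristics).
    Returns a list of (company_id, label) with label 1 for fraud, 0 otherwise.
    """
    needed = len(fraud_ids) * ratio
    if needed > len(non_fraud_ids):
        raise ValueError("not enough non-fraudulent companies for this ratio")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    controls = rng.sample(list(non_fraud_ids), needed)
    return [(f, 1) for f in fraud_ids] + [(c, 0) for c in controls]


# Hypothetical identifiers, invented for illustration only.
fraud = ["F1", "F2", "F3"]
non_fraud = [f"N{i}" for i in range(1, 13)]

one_to_one = build_matched_sample(fraud, non_fraud, ratio=1)
one_to_three = build_matched_sample(fraud, non_fraud, ratio=3)
print(len(one_to_one), len(one_to_three))
```

A higher ratio enlarges the training set but also makes the classes imbalanced, which is worth keeping in mind when comparing overall accuracy across the two mappings.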


Author(s):  
Gilda Taranto-Vera ◽  
Purificación Galindo-Villardón ◽  
Javier Merchán-Sánchez-Jara ◽  
Julio Salazar-Pozo ◽  
Alex Moreno-Salazar ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Satoko Hiura ◽  
Shige Koseki ◽  
Kento Koyama

In predictive microbiology, statistical models are employed to predict bacterial population behavior in food using environmental factors such as temperature, pH, and water activity. As the amount and complexity of data increase, handling all data with high-dimensional variables becomes a difficult task. We propose a data mining approach to predict bacterial behavior using a database of microbial responses to food environments. Population growth and inactivation data for the pathogen Listeria monocytogenes under 1,007 environmental conditions, including five food categories (beef, culture medium, pork, seafood, and vegetables) and temperatures ranging from 0 to 25 °C, were obtained from the ComBase database (www.combase.cc). We used an eXtreme gradient boosting tree, a machine learning algorithm, to predict bacterial population behavior from eight explanatory variables: ‘time’, ‘temperature’, ‘pH’, ‘water activity’, ‘initial cell counts’, ‘whether the viable count is initial cell number’, and two types of categories regarding food. The root mean square error between the observed and predicted values was approximately 1.0 log CFU regardless of food category, which suggests the possibility of predicting viable bacterial counts in various foods. The data mining approach examined here will enable the prediction of bacterial population behavior in food by identifying hidden patterns within a large amount of data.
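The error metric reported above can be illustrated with a short sketch of the root mean square error (RMSE) between observed and predicted counts on a log scale. The values below are hypothetical and are not taken from ComBase or from the study; the function name `rmse` is likewise chosen for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical viable counts in log CFU, invented for illustration only.
observed = [3.0, 4.5, 6.0, 7.2]
predicted = [4.0, 3.5, 7.0, 6.2]

error = rmse(observed, predicted)
print(f"RMSE: {error:.2f} log CFU")
```

An RMSE of about 1.0 on this scale means predictions are typically off by roughly one order of magnitude in the viable count.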

