Potential of Benford's Law and Machine Learning Based Verification in Agricultural Logistics

Author(s):  
Stanislav Levičar

Food supply chains are becoming increasingly complex, giving rise to new threats and risks for the stakeholders involved. At the same time, information technology has accelerated the development of new and more productive ways for organizations (members of supply chains) to collaborate and has helped them optimize their processes. Tighter collaboration among these companies is only possible if a sufficient level of trust is established among them, an obstacle that is not easily overcome. Since individual companies in a supply chain are unable to verify, and therefore to rely on, the data provided by third parties, the potential advantages are not fully realized. In this article we identify a possibility to remove one important element of this obstacle by using Benford's law as the basis for a general-purpose verification tool, further enhanced by statistics-based machine learning algorithms that can be implemented in IT-supported business operations. The potential usefulness of these methods lies in the fact that they can identify patterns and correlations without explicit user input.
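The verification idea above rests on a simple property: in many naturally occurring datasets, the first significant digit d appears with frequency log10(1 + 1/d). A minimal sketch of such a check, written here as an illustration rather than the article's actual tool (the function names and the deviation measure are our own assumptions):

```python
import math
from collections import Counter

# Expected first-digit frequencies under Benford's law:
# P(d) = log10(1 + 1/d) for d in 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    """Return the first significant digit of a nonzero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def benford_deviation(values):
    """Mean absolute deviation between the observed first-digit
    distribution of `values` and Benford's expected frequencies.
    Larger values suggest the data may be manipulated or atypical."""
    digits = Counter(first_digit(v) for v in values if v != 0)
    n = sum(digits.values())
    return sum(abs(digits.get(d, 0) / n - BENFORD[d])
               for d in range(1, 10)) / 9
```

Applied to, say, reported invoice amounts or shipment weights from a supply-chain partner, a deviation far above that of comparable historical data would flag the records for closer inspection; a formal test (e.g. chi-squared) would replace the simple deviation in practice.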

Information ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 98 ◽  
Author(s):  
Tariq Ahmad ◽  
Allan Ramsay ◽  
Hanady Ahmed

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, and the current state-of-the-art solutions use deep neural networks (DNNs), so it seems likely that such standard machine learning algorithms will provide an effective approach. We describe an alternative approach, which uses probabilities to construct a weighted lexicon of sentiment terms, then modifies the lexicon and calculates optimal thresholds for each class. We show that this approach outperforms DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data you are trying to learn from can be more important than trying ever more powerful general-purpose machine learning algorithms.
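The core of a weighted-lexicon approach can be sketched in a few lines. This is a simplified illustration of the general idea (estimate per-word label probabilities, score documents, apply per-class thresholds), not the authors' exact method; all function names and the toy thresholds are hypothetical:

```python
from collections import defaultdict

def build_lexicon(docs, labels):
    """Estimate P(label | word) from labelled training documents,
    yielding a weighted sentiment lexicon."""
    word_label = defaultdict(lambda: defaultdict(int))
    word_total = defaultdict(int)
    for doc, label in zip(docs, labels):
        for word in set(doc.lower().split()):
            word_label[word][label] += 1
            word_total[word] += 1
    return {w: {lab: c / word_total[w] for lab, c in labs.items()}
            for w, labs in word_label.items()}

def score(doc, lexicon, label):
    """Sum the lexicon weights for one label over the document's words."""
    return sum(lexicon.get(w, {}).get(label, 0.0)
               for w in doc.lower().split())

def classify(doc, lexicon, thresholds):
    """Multi-label decision: assign every label whose score clears its
    per-class threshold (thresholds would be tuned on held-out data)."""
    return {lab for lab, t in thresholds.items()
            if score(doc, lexicon, lab) >= t}
```

Because each class gets its own threshold, a document can receive several labels at once, which is what makes the task multi-label rather than multi-class.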


Author(s):  
Parag Jain

Most popular machine learning algorithms, such as k-nearest neighbours, k-means, and SVMs, use a metric to measure the distance (or similarity) between data instances. The performance of these algorithms clearly depends heavily on the metric being used. In the absence of prior knowledge about the data, we can only use general-purpose metrics such as Euclidean distance, cosine similarity, or Manhattan distance, but these metrics often fail to capture the true behaviour of the data, which directly affects the performance of the learning algorithm. The solution is to tune the metric to the data and the problem; however, manually deriving a metric for high-dimensional data, which is often difficult even to visualize, is not only tedious but extremely difficult. This motivates work on metric learning, which seeks a metric that respects the data geometry. The goal of a metric learning algorithm is to learn a metric that assigns small distances to similar points and relatively large distances to dissimilar points.
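A tiny example of why the metric matters. Below, a diagonal Mahalanobis-style metric (one learned weight per feature, a special case of full metric learning) down-weights a noisy feature; the points and weights are invented for illustration:

```python
import math

def euclidean(a, b):
    """General-purpose Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_distance(a, b, w):
    """Diagonal Mahalanobis-style metric: a non-negative weight per
    feature. A metric learning algorithm would fit `w` from labelled
    similar/dissimilar pairs; here it is chosen by hand."""
    return math.sqrt(sum(wi * (x - y) ** 2
                         for wi, x, y in zip(w, a, b)))
```

With a query point q = (0, 0), a same-class neighbour (0.1, 8) that differs only on a large-scale noise feature, and a different-class point (3, 0), Euclidean distance ranks the wrong point closer, while a learned weighting w = (1.0, 0.001) that suppresses the noise feature recovers the correct neighbour.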


2019 ◽  
Vol 11 (17) ◽  
pp. 2247-2253 ◽  
Author(s):  
Alfonso T García-Sosa

Aim: The explosion of data-based technology has accelerated pattern mining. However, it is clear that the quality and bias of data impact all machine learning and modeling. Results & methodology: A technique is presented for using the distribution of first significant digits of medicinal chemistry features (log P, log S, and pKa, both experimental and predicted) to assess whether they follow Benford's law, as seen in many natural phenomena. Conclusion: Data quality depends on dataset size, diversity, and magnitude. Profiling based on drugs alone may be too small or narrow; using larger sets of experimentally determined or predicted values recovers the distribution seen in other natural phenomena. This technique may be used to improve profiling, machine learning, large-dataset assessment, and other data-based methods for better (automated) data generation and compound design.


Author(s):  
Hong Cui

Despite the sub-language nature of taxonomic descriptions of animals and plants, researchers have warned about the existence of large variations among different description collections in terms of information content and its representation. These variations pose a serious threat to the development of automatic tools for structuring large volumes of text-based descriptions. This paper presents a general approach to marking up different collections of taxonomic descriptions with XML, using two large-scale floras as examples. The markup system, MARTT, is based on machine learning methods and enhanced by machine-learned domain rules and conventions. Experiments show that our simple and efficient machine learning algorithms significantly outperform general-purpose algorithms, and that rules learned from one flora can be used when marking up a second flora and help to improve markup performance, especially for elements with sparse training examples.


2019 ◽  
Author(s):  
André Dalmora ◽  
Tiago Tavares

Music lyrics can convey a great part of the meaning in popular songs. Such meaning is important for humans to understand songs as related to typical narratives, such as romantic interests or life stories, and this understanding is part of the affective aspects that can be used to choose songs to play in particular situations. This paper analyzes the effectiveness of using text mining tools to classify lyrics according to their narrative contexts. For this, we built a dataset of Brazilian popular music lyrics whose context and valence were voted on by online raters, and applied several machine learning algorithms in a pipeline in which lyrics are projected into a vector space and then classified using general-purpose algorithms. We experimented with document representations based on sparse topic models [11, 12, 13, 14], which aim to find groups of words that typically appear together in the dataset; we also extracted part-of-speech tags for each lyric and used their histogram as features in the classification process. We compared the classification results to those of a typical human, and we compared the problems of identifying narrative contexts and of identifying lyric valence. Our results indicate that narrative contexts can be identified more consistently than valence. We also show that human-based classification typically does not reach high accuracy, which suggests an upper bound for automatic classification.
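The "project into a vector space, then classify" pipeline can be illustrated with the simplest possible representation, raw term counts, and a nearest-centroid decision. This is only a schematic stand-in for the paper's actual features (sparse topic models, part-of-speech histograms); vocabulary and class names are invented:

```python
import math
from collections import Counter

def vectorize(text, vocab):
    """Project a lyric into a vector space of term counts over `vocab`."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Per-dimension mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def nearest_class(vec, class_centroids):
    """Assign the class whose centroid is most cosine-similar to `vec`."""
    return max(class_centroids,
               key=lambda c: cosine(vec, class_centroids[c]))
```

In the paper's setting, the vectorizer would be replaced by topic-model activations or tag histograms, and the nearest-centroid rule by a general-purpose classifier, but the pipeline shape is the same.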


2012 ◽  
pp. 1652-1686
Author(s):  
Réal Carbonneau ◽  
Rustam Vahidov ◽  
Kevin Laframboise

Managing supply chains in today’s complex, dynamic, and uncertain environment is one of the key challenges affecting the success of businesses. One of the crucial determinants of effective supply chain management is the ability to recognize customer demand patterns and react accordingly to changes in the face of intense competition. Thus, the ability of the participants in a supply chain to adequately predict demand is vital to the survival of businesses. Demand prediction is aggravated by the fact that the communication patterns that emerge between participants in a supply chain tend to distort the original consumer demand and create high levels of noise. Distortion and noise negatively impact the forecast quality of the participants. This work investigates the applicability of machine learning (ML) techniques and compares their performance with more traditional methods in order to improve demand forecast accuracy in supply chains. To this end we used two data sets from particular companies (a chocolate manufacturer and a toner cartridge manufacturer), as well as data from the Statistics Canada manufacturing survey. A representative set of traditional and ML-based forecasting techniques was applied to the demand data, and the accuracy of the methods was compared. As a group, ML techniques outperformed traditional techniques in terms of overall average, but not in terms of overall ranking. We also found that a support vector machine (SVM) trained on multiple demand series produced the most accurate forecasts.
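Training one model on multiple demand series, as in the SVM result above, starts by turning each series into lagged (inputs, target) pairs and pooling them. A minimal sketch of that preprocessing step, with the function names and the moving-average baseline being our own illustration rather than the authors' code:

```python
def make_windows(series, n_lags):
    """Turn one demand series into (lagged inputs, next value) pairs:
    each window of n_lags past observations predicts the next one."""
    return [(series[i:i + n_lags], series[i + n_lags])
            for i in range(len(series) - n_lags)]

def pool_series(all_series, n_lags):
    """Pool windows from several demand series into one training set,
    as when a single model is trained across multiple series."""
    pooled = []
    for s in all_series:
        pooled.extend(make_windows(s, n_lags))
    return pooled

def naive_forecast(window):
    """Moving-average baseline over the lag window, the kind of
    traditional method an ML model would be compared against."""
    return sum(window) / len(window)
```

The pooled pairs would then be fed to any regressor (an SVM with a nonlinear kernel, in the paper's best configuration), with forecast accuracy compared against baselines like the moving average above.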


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7294
Author(s):  
Adrián Campazas-Vega ◽  
Ignacio Samuel Crespo-Martínez ◽  
Ángel Manuel Guerrero-Higueras ◽  
Camino Fernández-Llamas

Advanced persistent threats (APTs) are a growing concern in cybersecurity. Many companies and governments have reported incidents related to these threats. Throughout the life cycle of an APT, one of the most commonly used techniques for gaining access is network attacks. Tools based on machine learning are effective in detecting these attacks. However, researchers usually have problems finding suitable datasets for fitting their models, and the problem is even harder when flow data are required. In this paper, we describe a framework for gathering flow datasets using a NetFlow sensor, and we present DOROTHEA (Docker-based framework for gathering NetFlow data), a Docker-based solution implementing the above framework. This tool aims to easily generate taggable network traffic to build suitable datasets for fitting classification models. To demonstrate that datasets gathered with DOROTHEA can be used for fitting classification models for malicious-traffic detection, several models were built using the model evaluator (MoEv), a general-purpose tool for training machine learning algorithms. After carrying out the experiments, four models obtained detection rates higher than 93%, demonstrating the validity of the datasets gathered with the tool.


2021 ◽  
Vol 13 (12) ◽  
pp. 6812
Author(s):  
Nesrin Ada ◽  
Yigit Kazancoglu ◽  
Muruvvet Deniz Sezer ◽  
Cigdem Ede-Senturk ◽  
Idil Ozer ◽  
...  

The concept of the circular economy (CE) has recently gained importance worldwide, since it offers a wider perspective on promoting sustainable production and consumption with limited resources. However, few studies have investigated the barriers to CE in circular food supply chains. Accordingly, this paper presents a systematic literature review of 136 papers from 2010 to 2020 from the WOS and Scopus databases regarding these barriers, in order to understand CE implementation in food supply chains. The barriers are classified under seven categories: “cultural”, “business and business finance”, “regulatory and governmental”, “technological”, “managerial”, “supply-chain management”, and “knowledge and skills”. The findings show the need to identify barriers preventing the transition to CE. The findings also indicate that these challenges can be overcome through Industry 4.0, which includes a variety of technologies, such as the Internet of Things (IoT), cloud technologies, machine learning, and blockchain. Specifically, machine learning can offer support by making workflows more efficient through the forecasting and analytical capabilities of food supply chains. Blockchain and big data analytics can provide the necessary support to establish legal systems and improve environmental regulations, since transparency is a crucial issue for taxation and incentive systems. Thus, CE can be promoted via adequate laws, policies, and innovative technologies.

