An Informed Forensics Approach to Detecting Vote Irregularities

2015 ◽  
Vol 23 (4) ◽  
pp. 488-505 ◽  
Author(s):  
Jacob M. Montgomery ◽  
Santiago Olivella ◽  
Joshua D. Potter ◽  
Brian F. Crisp

Electoral forensics involves examining election results for anomalies to efficiently identify patterns indicative of electoral irregularities. However, there is disagreement about which, if any, forensics tool is most effective at identifying fraud, and there is no method for integrating multiple tools. Moreover, forensic efforts have failed to systematically take advantage of country-specific details that might aid in diagnosing fraud. We deploy a Bayesian additive regression trees (BART) model, a machine-learning technique, on a large cross-national data set to explore the dense network of potential relationships between various forensic indicators of anomalies and electoral fraud risk factors, on the one hand, and the likelihood of fraud, on the other. This approach allows us to arbitrate between the relative importance of different forensic and contextual features for identifying electoral fraud and results in a diagnostic tool that can be relatively easily implemented in cross-national research.

2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Youngkeun Choi ◽  
Jae Won Choi

Purpose: Job involvement can be linked with important work outcomes. One way for organizations to increase job involvement is to use machine learning technology to predict employees’ job involvement, so that their human resource (HR) management leaders can take proactive measures or plan succession for preservation. This paper aims to develop a reliable job involvement prediction model using a machine learning technique. Design/methodology/approach: This study used a data set available from International Business Machines (IBM) Watson Analytics in the IBM community and applied a generalized linear model (GLM), including linear regression and binomial classification. The study had two primary aims. First, it seeks to better understand the role of variables in job involvement prediction modeling. Second, it seeks to evaluate the predictive performance of GLM, including linear regression and binomial classification. Findings: First, employees’ job involvement can be predicted from a large number of individual factors. Second, each model showed outstanding predictive performance. Practical implications: The pre-access and modeling methodology used in this paper can be viewed as a roadmap for readers to follow the steps taken in this study and to apply the procedures to identify the causes of many other HR management problems. Originality/value: This paper is the first to attempt to find the best-performing model for predicting job involvement from a limited set of features, including employees’ demographics, using a machine learning technique.


2018 ◽  
Vol 52 (1) ◽  
pp. 163-181 ◽  
Author(s):  
Sandra Breux ◽  
Jérôme Couture ◽  
Royce Koop

We explore influences on the number of candidates, and female candidates in particular, who contest mayoral elections in Canada. We draw on an original data set of election results from mayoral elections in Canada's 100 largest cities between 2006 and 2017. An average of 4.96 candidates contested mayoral elections in this period, and 16 per cent of all candidates were women. Density and mayoral prestige were related to higher numbers of candidates; in contrast, incumbent candidates and the availability of other elected positions were related to lower numbers. Notably, the presence of a female incumbent was related to higher numbers of women running for the position of mayor; in contrast, higher mayoral salaries were associated with an increase in the number of male but not female candidates. This analysis enhances our understanding of the factors underlying contested local elections, as well as the factors that appear to facilitate women contesting local elections.


Field Methods ◽  
2020 ◽  
Vol 32 (3) ◽  
pp. 291-308
Author(s):  
Enrijeta Shino ◽  
Christopher McCarty

This study examines the effect of telephone survey dialing patterns on lab productivity and survey responses. Using an original data set of paradata from 2010 to 2017 and a machine learning technique for variable selection, we find that early and late afternoon shifts are as productive as late evening shifts for both landline and cellphone Random Digit Dialing (RDD) samples. Also, early weekdays are more productive than the weekend for the cellphone RDD samples. Most importantly, time of the interview affects survey responses; therefore, survey practitioners and scholars should be cognizant of this effect when scheduling calls.


2019 ◽  
Vol 12 (3) ◽  
pp. 372-388
Author(s):  
Seda Yanık ◽  
Abdelrahman Elmorsy

Purpose: The purpose of this paper is to generate customer clusters using the self-organizing map (SOM) approach, a machine learning technique, with a big data set of credit card consumption. The authors aim to use the customers' consumption patterns over a period of three months, deduced from the credit card transactions, specifically the consumption categories (e.g. food, entertainment). Design/methodology/approach: The authors use a big data set of almost 40,000 credit card transactions to cluster customers. To deal with the size of the data set and to eliminate the required parametric assumptions, the authors use a machine learning technique, SOMs. The variables used are grouped into three: demographical variables, categorical consumption variables and summary consumption variables. The variables are first converted to factors using principal component analysis. Then, the number of clusters is specified by k-means clustering trials. Next, clustering with SOM is conducted twice, once including only the demographical variables and once including all variables. Finally, a comparison is made and the significance of the variables is examined by analysis of variance. Findings: The appropriate number of clusters is found to be 8 using k-means clustering. The differences in categorical consumption levels between the clusters are then investigated. However, they are found to be insignificant, whereas the summary consumption variables, as well as the demographical variables, are found to differ significantly between the clusters. Originality/value: The originality of the study is to incorporate customers' credit card consumption variables to cluster bank customers. The authors use a big data set and handle it with a machine learning technique to deduce the consumption patterns and generate the clusters. Credit card transactions generate a vast amount of data from which valuable information can be deduced. In the literature, such data are mainly used to detect fraud. To the best of the authors’ knowledge, this study is the first to use consumption patterns obtained from credit card transactions for clustering customers.
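The SOM clustering step mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid size, the linear decay schedule, the function names `train_som` and `best_matching_unit`, and the toy `customers` data are all assumptions made for illustration, and the principal component analysis and k-means steps are omitted.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the unit whose weights are closest to x."""
    return min(
        weights,
        key=lambda unit: sum((w - v) ** 2 for w, v in zip(weights[unit], x)),
    )


def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM and return {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Start every unit at a randomly chosen data point.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    for epoch in range(epochs):
        # Learning rate and neighbourhood width shrink linearly over time.
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay
        sigma = sigma0 * decay
        for x in rng.sample(data, len(data)):  # visit points in random order
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_dist_sq = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                # Gaussian neighbourhood: units near the winner move more.
                influence = math.exp(-grid_dist_sq / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (x[i] - w[i])
    return weights


# Hypothetical, already-standardised feature vectors (not the paper's data).
customers = [
    (0.1, 0.2), (0.0, 0.3), (0.2, 0.1),
    (2.9, 3.1), (3.0, 2.8), (3.2, 3.0),
]
som = train_som(customers)
# Each customer is assigned to the grid unit that best matches it.
assignments = [best_matching_unit(som, x) for x in customers]
```

Each entry of `assignments` is the grid position of a customer's best-matching unit, so customers sharing a position form one cluster. In practice the inputs would be the factor scores produced by the principal component analysis rather than raw toy values.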


On-street parking is an important component of the urban traffic and transportation system. Allocation of on-street parking space is a major cause of traffic congestion, and reducing congestion while facilitating on-street parking is a long-standing issue. In an urban environment, car drivers are expected to choose a parking space based on road conditions, speed limit, surrounding activities and the availability of parking space. Another major factor to consider when searching for a parking space is the payment method used. This paper investigates car drivers' behavior in selecting parking payment schemes; the behavior is visualized and predicted via linear regression, a machine learning technique, on the open data set On-street Car Parking Meters with Location from the City of Melbourne, Australia.


Atmosphere ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 111 ◽  
Author(s):  
Chul-Min Ko ◽  
Yeong Yun Jeong ◽  
Young-Mi Lee ◽  
Byung-Sik Kim

This study aimed to enhance the accuracy of extreme rainfall forecast, using a machine learning technique for forecasting hydrological impact. In this study, machine learning with XGBoost technique was applied for correcting the quantitative precipitation forecast (QPF) provided by the Korea Meteorological Administration (KMA) to develop a hydrological quantitative precipitation forecast (HQPF) for flood inundation modeling. The performance of machine learning techniques for HQPF production was evaluated with a focus on two cases: one for heavy rainfall events in Seoul and the other for heavy rainfall accompanied by Typhoon Kong-rey (1825). This study calculated the well-known statistical metrics to compare the error derived from QPF-based rainfall and HQPF-based rainfall against the observational data from the four sites. For the heavy rainfall case in Seoul, the mean absolute errors (MAE) of the four sites, i.e., Nowon, Jungnang, Dobong, and Gangnam, were 18.6 mm/3 h, 19.4 mm/3 h, 48.7 mm/3 h, and 19.1 mm/3 h for QPF and 13.6 mm/3 h, 14.2 mm/3 h, 33.3 mm/3 h, and 12.0 mm/3 h for HQPF, respectively. These results clearly indicate that the machine learning technique is able to improve the forecasting performance for localized rainfall. In addition, the HQPF-based rainfall shows better performance in capturing the peak rainfall amount and spatial pattern. Therefore, it is considered that the HQPF can be helpful to improve the accuracy of intense rainfall forecast, which is subsequently beneficial for forecasting floods and their hydrological impacts.
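As a worked check on the figures reported above, the following sketch computes the relative reduction in MAE at each site when moving from QPF to HQPF. The MAE values are taken from the abstract; the function names and the relative-reduction summary are illustrative assumptions, not part of the study.

```python
# MAE values (mm/3 h) as reported in the abstract for the Seoul heavy-rainfall case.
MAE_QPF = {"Nowon": 18.6, "Jungnang": 19.4, "Dobong": 48.7, "Gangnam": 19.1}
MAE_HQPF = {"Nowon": 13.6, "Jungnang": 14.2, "Dobong": 33.3, "Gangnam": 12.0}


def mean_absolute_error(forecast, observed):
    """MAE: the mean of |forecast_i - observed_i| over paired values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


def relative_reduction(before, after):
    """Fraction by which an error metric fell, relative to its earlier value."""
    return (before - after) / before


# Hypothetical summary: share of the QPF error removed by HQPF at each site.
reductions = {
    site: relative_reduction(MAE_QPF[site], MAE_HQPF[site]) for site in MAE_QPF
}

if __name__ == "__main__":
    for site, reduction in reductions.items():
        print(f"{site}: MAE reduced by {reduction:.1%}")
```

On the reported numbers, the reduction is roughly 27% at Nowon and Jungnang, about 32% at Dobong and about 37% at Gangnam, which is consistent with the abstract's claim of improved performance at all four sites.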


Author(s):  
Fahad Taha AL-Dhief ◽  
Nurul Mu'azzah Abdul Latiff ◽  
Nik Noordini Nik Abd. Malik ◽  
Naseer Sabri ◽  
Marina Mat Baki ◽  
...  
