Fraud Audit Based on Visual Analysis: A Process Mining Approach

Among the knowledge areas in which process mining has had an impact, the audit domain is particularly striking. Traditionally, audits seek evidence in a data sample that allows inferences to be made about a population. Mistakes are commonly made when generalizing the results, and anomalies may therefore remain undetected in the unprocessed data. Some efforts address these limitations using process-mining-based approaches for fraud detection. To the best of our knowledge, however, no fraud audit method exists that combines process mining techniques and visual analytics to identify relevant patterns. This paper presents a fraud audit approach based on the combination of process mining techniques and visual analytics. The main advantages are: (i) a method is included that guides the use of the visual capabilities of process mining to detect fraud data patterns during an audit; (ii) the approach can be generalized to any business domain; (iii) well-known process mining techniques are used (dotted chart, trace alignment, fuzzy miner…). The techniques were selected by a group of experts and were extended to enable filtering for contextual analysis, to handle levels of process abstraction, and to facilitate implementation in the area of fraud audits. Based on the proposed approach, we developed a software solution that is currently being used in the financial sector as well as in the telecommunications and hospitality sectors. Finally, for demonstration purposes, we present a real hotel management use case in which we detected suspected fraudulent behavior, thus validating the effectiveness of the approach.
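As a rough illustration of the dotted chart mentioned above, the following sketch turns an event log into the points such a chart would plot: one row per case, with each event positioned by its time since the start of that case. The event log, the activity names, and the function name `dotted_chart_points` are invented for illustration; this is not the authors' software or data.

```python
from datetime import datetime

# Toy event log (invented for illustration): (case id, activity, timestamp).
EVENT_LOG = [
    ("booking-1", "create reservation", "2021-03-01 09:00"),
    ("booking-1", "check in", "2021-03-05 15:00"),
    ("booking-1", "issue invoice", "2021-03-07 11:00"),
    ("booking-2", "create reservation", "2021-03-02 10:00"),
    ("booking-2", "issue invoice", "2021-03-02 10:30"),
    ("booking-2", "cancel invoice", "2021-03-02 10:45"),
]


def dotted_chart_points(log):
    """Return (case_row, hours_since_case_start, activity) tuples."""
    parsed = [(c, a, datetime.strptime(t, "%Y-%m-%d %H:%M")) for c, a, t in log]

    # Earliest timestamp per case defines that case's time origin.
    starts = {}
    for case, _, ts in parsed:
        starts[case] = min(starts.get(case, ts), ts)

    # Cases are stacked vertically in order of their first event.
    rows = {case: i for i, case in enumerate(sorted(starts, key=starts.get))}

    points = []
    for case, activity, ts in sorted(parsed, key=lambda e: e[2]):
        hours = (ts - starts[case]).total_seconds() / 3600
        points.append((rows[case], hours, activity))
    return points


POINTS = dotted_chart_points(EVENT_LOG)
```

Plotting `POINTS` with the second element on the horizontal axis and the first on the vertical axis, colored by activity, gives a relative-time dotted chart. Whether a given pattern in such a chart is suspicious is a judgment for the auditor, not something this sketch decides.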

Visual Analysis of Biomarkers Reveals Differences in Lipid Profiles and Liver Enzymes before and after Gastric Sleeve and Bypass

Obesity Facts ◽

10.1159/000510401 ◽

2021 ◽

pp. 1-11

Author(s):

Marijn Marthe Georgine van Berckel ◽

Saskia L.M. van Loon ◽

Arjen-Kars Boer ◽

Volkher Scharnhorst ◽

Simon W. Nienhuijs

Keyword(s):

Bariatric Surgery ◽

Visual Analytics ◽

Biochemical Markers ◽

Visual Analysis ◽

Large Population ◽

High Volume ◽

Cholesterol Concentration ◽

Health State ◽

Data Set ◽

Before And After

Introduction: Bariatric surgery results in both intentional and unintentional metabolic changes. In a high-volume bariatric center, extensive laboratory panels are used to monitor these changes pre- and postoperatively. Consecutive measurements of relevant biochemical markers allow exploration of the health state of bariatric patients and comparison of different patient groups. Objective: The objective of this study is to compare biomarker distributions over time between 2 common bariatric procedures, i.e., sleeve gastrectomy (SG) and gastric bypass (RYGB), using visual analytics. Methods: Both pre- and postsurgical (6, 12, and 24 months) data of all patients who underwent primary bariatric surgery were collected retrospectively. The distribution and evolution of different biochemical markers were compared before and after surgery using asymmetric beanplots in order to evaluate the effect of primary SG and RYGB. A beanplot is an alternative to the boxplot that allows an easy and thorough visual comparison of univariate data. Results: In total, 1,237 patients (659 SG and 578 RYGB) were included. The sleeve and bypass groups were comparable in terms of age and the prevalence of comorbidities. The mean presurgical BMI and the percentage of males were higher in the sleeve group. The effect of surgery on lowering of glycated hemoglobin was similar for both surgery types. After RYGB surgery, the decrease in the cholesterol concentration was larger than after SG. The enzymatic activity of aspartate aminotransferase, alanine aminotransferase, and alkaline phosphatase in sleeve patients was higher presurgically but lower postsurgically compared to bypass values. Conclusions: Beanplots allow intuitive visualization of population distributions. Analysis of this large population-based data set using beanplots suggests comparable efficacies of both types of surgery in reducing diabetes. RYGB surgery reduced dyslipidemia more effectively than SG. The trend toward a larger decrease in liver enzyme activities following SG is a subject for further investigation.
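To make the idea of an asymmetric beanplot concrete, here is a minimal sketch that computes the two half-density curves such a plot draws back to back: one group on the left side of a shared value axis, the other on the right. It uses a plain Gaussian kernel density estimate with a hand-picked bandwidth. The sample values, the bandwidth, and the function names are assumptions made for illustration and are not taken from the study.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
        for x in grid
    ]


def asymmetric_bean(left, right, bandwidth, steps=50):
    """Return (grid, left_density, right_density) on a shared value axis."""
    lo = min(left + right) - 3 * bandwidth
    hi = max(left + right) + 3 * bandwidth
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    # The left half is negated so a plotting routine can draw it mirrored.
    left_density = [-d for d in gaussian_kde(left, grid, bandwidth)]
    right_density = gaussian_kde(right, grid, bandwidth)
    return grid, left_density, right_density


# Invented values for illustration only; not data from the study.
GROUP_A = [4.8, 5.1, 5.3, 5.6, 6.0]
GROUP_B = [4.2, 4.5, 4.6, 4.9, 5.2]
GRID, LEFT, RIGHT = asymmetric_bean(GROUP_A, GROUP_B, bandwidth=0.3)
```

Drawing `LEFT` and `RIGHT` against `GRID` with any plotting library, and adding one short line per observation, gives the basic shape of one asymmetric bean. Repeating this per time point would allow the kind of side-by-side comparison the abstract describes.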

Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets

BMC Genomics ◽

10.1186/s12864-021-07791-z ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Ratanond Koonchanok ◽

Swapna Vidhur Daulatabad ◽

Quoseena Mir ◽

Khairi Reda ◽

Sarath Chandra Janga

Keyword(s):

Single Molecule ◽

Visual Analytics ◽

Visual Analysis ◽

Direct Sequencing ◽

Visual Exploration ◽

Nanopore Sequencing ◽

Sequencing Data ◽

Rna Sequences ◽

Sequencing Technologies ◽

Signal Features

Background: Direct-sequencing technologies, such as Oxford Nanopore’s, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at a single-molecule resolution, promising new insights into the functional roles of RNA. However, realizing this potential requires new tools to analyze and explore this type of data. Results: Here, we present Sequoia, a visual analytics tool that allows users to interactively explore nanopore sequences. Sequoia combines a Python-based backend with a multi-view visualization interface, enabling users to import raw nanopore sequencing data in Fast5 format, cluster sequences based on electric-current similarities, and drill down into signals to identify properties of interest. We demonstrate the application of Sequoia by generating and analyzing ~500k reads from direct RNA sequencing data of the human HeLa cell line. We focus on comparing signal features from m6A and m5C RNA modifications as the first step towards building automated classifiers. We show how, through iterative visual exploration and tuning of dimensionality reduction parameters, we can separate modified RNA sequences from their unmodified counterparts. We also document new, qualitative signal signatures that distinguish these modifications from otherwise normal RNA bases, which we were able to discover from the visualization. Conclusions: Sequoia’s interactive features complement existing computational approaches in nanopore-based RNA workflows. The insights gleaned through visual analysis should help users in developing rationales, hypotheses, and insights into the dynamic nature of RNA. Sequoia is available at https://github.com/dnonatar/Sequoia.

TV-MV Analytics: A visual analytics framework to explore time-varying multivariate data

Information Visualization ◽

10.1177/1473871619858937 ◽

2019 ◽

Vol 19 (1) ◽

pp. 3-23

Author(s):

Aurea Soriano-Vargas ◽

Bernd Hamann ◽

Maria Cristina F de Oliveira

Keyword(s):

Visual Analytics ◽

Visual Analysis ◽

Multivariate Data ◽

Visual Exploration ◽

Data Sets ◽

Time Varying ◽

Domain Experts ◽

Data Mining Algorithms ◽

Temporal Relationships ◽

Visualization Techniques

We present an integrated interactive framework for the visual analysis of time-varying multivariate data sets. As part of our research, we performed in-depth studies concerning the applicability of visualization techniques to obtain valuable insights. We consolidated the considered analysis and visualization methods in one framework, called TV-MV Analytics. TV-MV Analytics effectively combines visualization and data mining algorithms providing the following capabilities: (1) visual exploration of multivariate data at different temporal scales, and (2) a hierarchical small multiples visualization combined with interactive clustering and multidimensional projection to detect temporal relationships in the data. We demonstrate the value of our framework for specific scenarios, by studying three use cases that were validated and discussed with domain experts.

Collusion and Fraud Detection on Electronic Energy Meters - A Use Case of Forensics Investigation Procedures

2014 IEEE Security and Privacy Workshops ◽

10.1109/spw.2014.19 ◽

2014 ◽

Cited By ~ 6

Author(s):

Rubens Alexandre De Faria ◽

Keiko V. Ono Fonseca ◽

Bertoldo Schneider ◽

Sing Kiong Nguang

Keyword(s):

Fraud Detection ◽

Electronic Energy ◽

Use Case

Implementing Visual Analytics Pipelines with Simulation Data

10.5772/intechopen.96152 ◽

2021 ◽

Author(s):

Taimur Khan ◽

Syed Samad Shakeel ◽

Afzal Gul ◽

Hamza Masud ◽

Achim Ebert

Keyword(s):

Visual Analytics ◽

Visual Analysis ◽

Simulated Data ◽

Evaluation Study ◽

Ease Of Use ◽

Preliminary Evaluation ◽

Simulation Data ◽

Visual Data Analytics ◽

High Level ◽

Simulation Parameters

Visual analytics has been widely studied in the past decade, both in academia and industry, to improve data exploration, minimize the overall cost, and improve data analysis. In this chapter, we explore the idea of visual analytics in the context of simulation data. This provides the capability not only to explore the data visually but also to apply machine learning models in order to answer high-level questions with respect to scheduling, choosing optimal simulation parameters, finding correlations, etc. More specifically, we examine state-of-the-art tools that can perform these tasks. Further, to test and validate our methodology, we followed the human-centered design process to build a prototype tool called ViDAS (Visual Data Analytics of Simulated Data). Our preliminary evaluation study illustrates the intuitiveness and ease of use of our approach with regard to visual analysis of simulated data.

Experiments on Fraud Detection use case with QML and TDA Mapper

10.1109/qce52317.2021.00083 ◽

2021 ◽

Author(s):

Satanik Mitra ◽

Kameshwar Rao JV

Keyword(s):

Fraud Detection ◽

Use Case

Measuring the Impact of the Semantic-Based Process Mining Approach

Applications and Developments in Semantic Process Mining - Advances in Data Mining and Database Management ◽

10.4018/978-1-7998-2668-2.ch008 ◽

2020 ◽

pp. 217-237

Keyword(s):

Knowledge Base ◽

Real Time ◽

Learning Process ◽

Conceptual Analysis ◽

Process Mining ◽

Levels Of Analysis ◽

Use Case ◽

Abstraction Levels ◽

The Impact

This chapter looks at the extent to which the semantic-based process mining approach of this book supports the conceptual analysis of event logs and the resultant models. Qualitatively, the chapter leverages the use case study of the research learning process domain to determine how the proposed method supports the discovery, monitoring, and enhancement of real-time processes through the abstraction levels of analysis. The chapter also quantitatively assesses the accuracy of the classification process in predicting the behaviours of unobserved instances within the underlying knowledge base. Overall, the work looks at the implications of the semantic-based approach, the validation of the classification results, and their influence compared to other existing benchmark techniques/algorithms used for process mining.

Identifying, Analyzing, and Visualizing Diagnostic Paths for Patients with Nonspecific Abdominal Pain

Applied Clinical Informatics ◽

10.1055/s-0038-1676338 ◽

2018 ◽

Vol 09 (04) ◽

pp. 905-913 ◽

Cited By ~ 2

Author(s):

Goutham Rao ◽

Katherine Kirley ◽

Paul Epner ◽

Yiye Zhang ◽

Victoria Bauer ◽

...

Keyword(s):

Abdominal Pain ◽

Visual Analytics ◽

Process Mining ◽

Accurate Diagnosis ◽

Health Records ◽

Hospital System ◽

Diagnostic Practices ◽

Different Types ◽

Mining Methods ◽

Insight Into

Background: Diagnosis is complex, uncertain, and error-prone. Symptoms such as nonspecific abdominal pain are especially challenging. A diagnostic path consists of diagnostic steps taken from initial presentation until a diagnosis is obtained or the evaluation ends for other reasons. Analysis of diagnostic paths can reveal patterns associated with more timely and accurate diagnosis. Visual analytics can be used to enhance both analysis and comprehension of diagnostic paths. Objective: This article applies process-mining methods to extract and visualize diagnostic paths from electronic health records (EHRs). Methods: Patient features, actions taken (i.e., tests, referrals, etc.), and diagnoses obtained for 501 adult patients (half female, half ≥50 years of age) presenting with abdominal pain were extracted from an EHR database to construct diagnostic paths from a hospital system in suburban Chicago, Illinois, United States. A stable diagnosis was defined as the same diagnosis recorded twice in a 12-month period; a working diagnosis was recorded only once. Three different types of path visualizations were obtained. Results: A stable diagnosis was obtained in 63 (13%) patients after 12 months. In 271 (54%) patients, a working diagnosis was obtained. Mean path duration was 145.3 days (standard deviation, 195.1 days). These 63 patients received 75 stable diagnoses. Conclusion: Structured EHR data can be used to construct diagnostic paths to gain insight into diagnostic practices for complaints such as abdominal pain.
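The stable versus working distinction defined in the Methods can be sketched as a small classification rule over the dates on which one diagnosis was recorded for one patient. This is only one possible reading of the definition: approximating "12 months" as 365 days, the "unclassified" label for repeated records that are further apart, and the function name are all assumptions, and the example dates are invented.

```python
from datetime import date


def classify_diagnosis(record_dates, window_days=365):
    """Label one diagnosis from the dates on which it was recorded."""
    dates = sorted(record_dates)
    if not dates:
        raise ValueError("at least one record date is required")
    if len(dates) == 1:
        # Recorded only once: a working diagnosis.
        return "working"
    # Stable: some pair of consecutive records falls within the window.
    for earlier, later in zip(dates, dates[1:]):
        if (later - earlier).days <= window_days:
            return "stable"
    # Assumption: repeated records spaced further apart are left unlabeled.
    return "unclassified"


# Invented example dates, for illustration only.
RESULTS = {
    "dx_a": classify_diagnosis([date(2017, 1, 10), date(2017, 6, 2)]),
    "dx_b": classify_diagnosis([date(2017, 3, 4)]),
    "dx_c": classify_diagnosis([date(2016, 1, 1), date(2017, 6, 1)]),
}
```

Checking only consecutive dates is sufficient here, because if any two records lie within the window then some adjacent pair in sorted order does too.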

Exploring Visual Analytics to Measure Reliability for IoT Oriented Pollution Detection Software Perspectives

International Journal of Distributed Systems and Technologies ◽

10.4018/ijdst.2019040101 ◽

2019 ◽

Vol 10 (2) ◽

pp. 1-19

Author(s):

Nishi Kant Kumar ◽

Soumya Banerjee

Keyword(s):

Visual Analytics ◽

Actual Data ◽

Use Case ◽

Specific Requirement ◽

Software Bugs ◽

Measure Reliability ◽

Pollution Detection ◽

Detection Software

Measuring the reliability of IoT-based applications requires an embedded analysis. The parameters are the number of imprecise or faulty measurements as well as the identification of core modules. This article investigates how far visual introspection can assist in troubleshooting IoT-based software bugs. This specific requirement motivates a new idea: the shape of plots of actual data can indicate the cause of an error, and the error can then be patched if software repair strategies are applied in light of the visual analytics. Analyzing faults in existing applications as a variation of topological and practicing parameters is quite different, since it takes a substantial number of iterations and observations. The present use case establishes that faults can be analyzed and inferred from the shape of the visual plots derived from embedded modules.

TOPICVIEW: VISUAL ANALYSIS OF TOPIC MODELS AND THEIR IMPACT ON DOCUMENT CLUSTERING

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213013600087 ◽

2013 ◽

Vol 22 (05) ◽

pp. 1360008 ◽

Cited By ~ 1

Author(s):

PATRICIA J. CROSSNO ◽

ANDREW T. WILSON ◽

TIMOTHY M. SHEAD ◽

WARREN L. DAVIS ◽

DANIEL M. DUNLAVY

Keyword(s):

Visual Analytics ◽

Semantic Analysis ◽

Visual Analysis ◽

Document Clustering ◽

Topic Models ◽

Analysis Tool ◽

Model Assessment ◽

Text Corpora ◽

The Impact ◽

Document Relationships

We present a new approach for analyzing topic models using visual analytics. We have developed TopicView, an application for visually comparing and exploring multiple models of text corpora, as a prototype for this type of analysis tool. TopicView uses multiple linked views to visually analyze conceptual and topical content, document relationships identified by models, and the impact of models on the results of document clustering. As case studies, we examine models created using two standard approaches: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Conceptual content is compared through the combination of (i) a bipartite graph matching LSA concepts with LDA topics based on the cosine similarities of model factors and (ii) a table containing the terms for each LSA concept and LDA topic listed in decreasing order of importance. Document relationships are examined through the combination of (i) side-by-side document similarity graphs, (ii) a table listing the weights for each document's contribution to each concept/topic, and (iii) a full text reader for documents selected in either of the graphs or the table. The impact of LSA and LDA models on document clustering applications is explored through similar means, using proximities between documents and cluster exemplars for graph layout edge weighting and table entries. We demonstrate the utility of TopicView's visual approach to model assessment by comparing LSA and LDA models of several example corpora.
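The bipartite matching of LSA concepts with LDA topics by cosine similarity can be illustrated with a short sketch. It assumes each concept and topic is a vector of term weights over the same vocabulary and links every pair whose similarity reaches a threshold. The threshold of 0.5, the toy vectors, and the function names are assumptions for illustration and are not taken from TopicView.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bipartite_edges(lsa_concepts, lda_topics, threshold=0.5):
    """Return (concept_index, topic_index, similarity) for similar pairs."""
    edges = []
    for i, concept in enumerate(lsa_concepts):
        for j, topic in enumerate(lda_topics):
            sim = cosine(concept, topic)
            if sim >= threshold:
                edges.append((i, j, round(sim, 3)))
    return edges


# Toy term-weight vectors over a shared 4-term vocabulary (invented).
LSA = [[0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 0.7, 0.7]]
LDA = [[0.0, 0.0, 0.5, 0.5], [0.8, 0.2, 0.0, 0.0]]
EDGES = bipartite_edges(LSA, LDA)
```

Each edge in `EDGES` would become one weighted link in the bipartite graph, with LSA concepts on one side and LDA topics on the other. In this toy input, concept 0 pairs with topic 1 and concept 1 pairs with topic 0.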
