Visualisation of Large-Scale Call-Centre Data

Large Data ◽

Mean Value ◽

Data Sets ◽

Call Centre ◽

Processing Unit ◽

Data Set ◽

Domain Experts ◽

The contact centre industry employs 4% of the entire United Kingdom and United States’ working population and generates gigabytes of operational data that require analysis, to provide insight and to improve efficiency. This thesis is the result of a collaboration with QPC Limited, who provide data collection and analysis products for call centres. They provided a large data-set featuring almost 5 million calls to be analysed. This thesis utilises novel visualisation techniques to create tools for the exploration of the large, complex call centre data-set and to facilitate unique observations into the data. A survey of information visualisation books is presented, providing a thorough background of the field. Following this, a feature-rich application that visualises large call centre data sets using scatterplots that support millions of points is presented. The application utilises both CPU and GPU acceleration for processing and filtering and is exhibited with millions of call events. This is expanded upon with the use of glyphs to depict agent behaviour in a call centre. A technique is developed to cluster overlapping glyphs into a single parent glyph dependent on zoom level and a customizable distance metric. This hierarchical glyph represents the mean value of all child agent glyphs, removing overlap and reducing visual clutter. A novel technique for visualising individually tailored glyphs using a Graphics Processing Unit is also presented, and demonstrated rendering over 100,000 glyphs at interactive frame rates. An open-source code example is provided for reproducibility. Finally, a novel interaction and layout method is introduced for improving the scalability of chord diagrams to visualise call transfers. An exploration of sketch-based methods for showing multiple links and direction is made, and a sketch-based brushing technique for filtering is proposed.
Feedback from domain experts in the call centre industry is reported for all applications developed.
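
The glyph clustering idea in this abstract (merge overlapping glyphs into one parent glyph, depending on zoom level and a distance metric, with the parent showing the mean of its children) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual algorithm, so everything here is an assumption: the greedy single-pass merge, the `(x, y, value)` glyph layout, and the names `cluster_glyphs`, `euclidean`, `zoom` and `min_separation` are all hypothetical.

```python
from math import hypot
from statistics import fmean


def euclidean(a, b):
    """Default distance metric; any callable taking two (x, y) points works."""
    return hypot(a[0] - b[0], a[1] - b[1])


def cluster_glyphs(glyphs, zoom, min_separation, metric=euclidean):
    """Greedily merge glyphs that would overlap on screen (illustrative only).

    glyphs: list of (x, y, value) in world coordinates (hypothetical layout).
    zoom: screen units per world unit; a larger zoom pulls glyphs apart.
    min_separation: screen distance below which glyphs count as overlapping.
    Returns parent glyphs as (x, y, mean_value, child_count).
    """
    clusters = []  # each cluster is a list of child glyphs
    for g in glyphs:
        screen = (g[0] * zoom, g[1] * zoom)
        for members in clusters:
            # Compare against the cluster centroid in screen space.
            cx = fmean(m[0] for m in members) * zoom
            cy = fmean(m[1] for m in members) * zoom
            if metric(screen, (cx, cy)) < min_separation:
                members.append(g)
                break
        else:
            clusters.append([g])  # no nearby cluster: start a new one
    return [
        (
            fmean(m[0] for m in c),
            fmean(m[1] for m in c),
            fmean(m[2] for m in c),  # parent glyph shows the mean child value
            len(c),
        )
        for c in clusters
    ]


agents = [(0, 0, 10), (1, 0, 20), (10, 0, 30)]
zoomed_out = cluster_glyphs(agents, zoom=1, min_separation=2)
zoomed_in = cluster_glyphs(agents, zoom=5, min_separation=2)
```

In this sketch the first two agents merge into one parent glyph when zoomed out and separate again when zoomed in, because the same world-space gap covers more screen units at a higher zoom.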

Galaxy spin direction distribution in HST and SDSS show similar large-scale asymmetry

Publications of the Astronomical Society of Australia ◽

10.1017/pasa.2020.46 ◽

2020 ◽

Vol 37 ◽

Author(s):

Lior Shamir

Keyword(s):

Large Scale ◽

Spiral Galaxies ◽

Hubble Space Telescope ◽

Gravitational Interaction ◽

Large Data ◽

Sloan Digital Sky Survey ◽

Data Sets ◽

Dipole Axis ◽

Data Set ◽

The Asymmetry

Several recent observations using large data sets of galaxies showed non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to have gravitational interaction. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by the Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey (SDSS). The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. The results show that both data sets show a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to cosine dependence shows a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^{\circ},\delta=47^{\circ})$ and is well within the $1\sigma$ error range compared to the location of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^{\circ},\delta=61^{\circ})$.

Realtime cerebellum: A large-scale spiking network model of the cerebellum that runs in realtime using a graphics processing unit

Neural Networks ◽

10.1016/j.neunet.2013.01.019 ◽

2013 ◽

Vol 47 ◽

pp. 103-111 ◽

Cited By ~ 47

Author(s):

Tadashi Yamazaki ◽

Jun Igarashi

Keyword(s):

Network Model ◽

Large Scale ◽

Processing Unit ◽

Spiking Network ◽

A lightweight approach to performance portability with targetDP

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016682071 ◽

2016 ◽

Vol 32 (2) ◽

pp. 288-301

Author(s):

Alan Gray ◽

Kevin Stratford

Keyword(s):

Particle Physics ◽

Message Passing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Message Passing Interface ◽

Processing Unit ◽

Performance Portability ◽

Leading high performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform-agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus a separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large scale supercomputers.

Self-Adjusting Variable Neighborhood Search Algorithm for Near-Optimal k-Means Clustering

Computation ◽

10.3390/computation8040090 ◽

2020 ◽

Vol 8 (4) ◽

pp. 90

Author(s):

Lev Kazakovtsev ◽

Ivan Rozhnov ◽

Aleksey Popov ◽

Elena Tovbis

Keyword(s):

Large Scale ◽

Variable Neighborhood Search ◽

Search Algorithm ◽

Fixed Time ◽

Exact Algorithms ◽

Neighborhood Search ◽

Data Sets ◽

Processing Unit ◽

Online Computation

The k-means problem is one of the most popular models in cluster analysis that minimizes the sum of the squared distances from clustered objects to the sought cluster centers (centroids). The simplicity of its algorithmic implementation encourages researchers to apply it in a variety of engineering and scientific branches. Nevertheless, the problem is proven to be NP-hard which makes exact algorithms inapplicable for large scale problems, and the simplest and most popular algorithms result in very poor values of the squared distances sum. If a problem must be solved within a limited time with the maximum accuracy, which would be difficult to improve using known methods without increasing computational costs, the variable neighborhood search (VNS) algorithms, which search in randomized neighborhoods formed by the application of greedy agglomerative procedures, are competitive. In this article, we investigate the influence of the most important parameter of such neighborhoods on the computational efficiency and propose a new VNS-based algorithm (solver), implemented on the graphics processing unit (GPU), which adjusts this parameter. Benchmarking on data sets composed of up to millions of objects demonstrates the advantage of the new algorithm in comparison with known local search algorithms, within a fixed time, allowing for online computation.
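
The objective named in this abstract, the sum of squared distances from each object to its nearest centroid, can be written out as a small sketch. This is not the paper's VNS solver or its GPU implementation: it only shows the objective together with one pass of Lloyd's heuristic, chosen here as a familiar example of a simple local search. The function names `sse` and `lloyd_step` are hypothetical.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def sse(points, centroids):
    """k-means objective: sum of squared distances to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def lloyd_step(points, centroids):
    """One assignment-and-update pass of Lloyd's heuristic (illustrative)."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda j: sq_dist(p, centroids[j]))
        groups[nearest].append(p)
    updated = []
    for members, old in zip(groups, centroids):
        if members:
            # Move the centroid to the mean of its assigned points.
            updated.append(tuple(
                sum(m[d] for m in members) / len(members)
                for d in range(len(old))
            ))
        else:
            updated.append(old)  # keep an empty cluster's centroid unchanged
    return updated


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
start = [(0.0, 0.0), (10.0, 0.0)]
before = sse(points, start)
after = sse(points, lloyd_step(points, start))
```

A pass like this never increases the objective, but it can stall in a poor local minimum. That limitation is what motivates searching in wider neighborhoods, as the abstract describes.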

Splotch

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016652713 ◽

2016 ◽

Vol 31 (6) ◽

pp. 550-563

Author(s):

Timothy Dykes ◽

Claudio Gheller ◽

Marzia Rivi ◽

Mel Krokos

Keyword(s):

High Performance ◽

Large Scale ◽

Processing Unit ◽

Xeon Phi ◽

The Many ◽

Many Core ◽

Performance Results ◽

Graphics Processing ◽

Performance Computing

With the increasing size and complexity of data produced by large-scale numerical simulations, it is of primary importance for scientists to be able to exploit all available hardware in heterogeneous high-performance computing environments for increased throughput and efficiency. We focus on the porting and optimization of Splotch, a scalable visualization algorithm, to utilize the Xeon Phi, Intel’s coprocessor based upon the new many integrated core architecture. We discuss steps taken to offload data to the coprocessor and algorithmic modifications to aid faster processing on the many-core architecture and make use of the uniquely wide vector capabilities of the device, with accompanying performance results using multiple Xeon Phi. Finally, we compare performance against results achieved with the Graphics Processing Unit (GPU) based implementation of Splotch.

2010 International Symposium on Intelligent Signal Processing and Communication Systems ◽

Implementation of large-scale fir adaptive filters on NVIDIA GeForce graphics processing unit

10.1109/ispacs.2010.5704666 ◽

2010 ◽

Author(s):

Akihiro Hirano ◽

Kenji Nakayama

Keyword(s):

Large Scale ◽

Adaptive Filters ◽

Processing Unit ◽

Encyclopedia of Information Science and Technology, Fifth Edition - Advances in Information Quality and Management ◽

Data Streaming Processing Window Joined With Graphics Processing Units (GPUs)

10.4018/978-1-7998-3479-3.ch043 ◽

2021 ◽

pp. 602-623

Author(s):

Shen Lu ◽

Richard S. Segall

Keyword(s):

Big Data ◽

Data Streams ◽

Graphics Processing Units ◽

Data Stream ◽

Large Scale ◽

Processing Unit ◽

Data Streaming ◽

Large Scale Data ◽

Big data is large-scale data and can be either discrete or continuous. This article discusses the continuous case of big data, often called “data streaming.” More and more businesses will depend on being able to process and make decisions on streams of data. This article addresses the algorithmic side of data stream processing, often called “stream analytics” or “stream mining.” Data streaming Window Join can be improved by using a graphics processing unit (GPU) for higher performance computing. Data streams are generated by two independent threads: one thread generates Data Stream A, and the other generates Data Stream B. A Window Join thread then merges the two data streams, a process called “Data Stream Window Join.” The Window Join process can be implemented in parallel, which can improve the computing speed. Experiments are provided for Data Stream Window Joins using both static and dynamic data.
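
The arrangement described in this abstract, with two producer threads feeding a join step, can be sketched on the CPU as follows. This is only an illustration under stated assumptions and does not reproduce the article's GPU implementation. The tuple layout `(timestamp, key)`, the equality join condition, the time-based window and the names `produce`, `window_join` and `run` are all hypothetical.

```python
import queue
import threading
from collections import deque


def produce(name, events, out):
    """Producer thread body: tag each (timestamp, key) with its stream name."""
    for ts, key in events:
        out.put((ts, name, key))


def window_join(events, window):
    """Pair A and B tuples with equal keys at most `window` time units apart.

    events: (timestamp, stream_name, key) tuples sorted by timestamp.
    """
    recent = {"A": deque(), "B": deque()}
    matches = []
    for ts, name, key in events:
        other = "B" if name == "A" else "A"
        # Evict tuples that have fallen out of the window on both sides.
        for side in recent.values():
            while side and ts - side[0][0] > window:
                side.popleft()
        for other_ts, other_key in recent[other]:
            if other_key == key:
                a = (ts, key) if name == "A" else (other_ts, other_key)
                b = (other_ts, other_key) if name == "A" else (ts, key)
                matches.append((a, b))
        recent[name].append((ts, key))
    return matches


def run(stream_a, stream_b, window):
    """Generate both streams on separate threads, then join them."""
    out = queue.Queue()
    threads = [
        threading.Thread(target=produce, args=("A", stream_a, out)),
        threading.Thread(target=produce, args=("B", stream_b, out)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    events = []
    while not out.empty():  # safe: both producers have finished
        events.append(out.get())
    return window_join(sorted(events), window)


pairs = run([(1, "x"), (5, "y"), (9, "x")],
            [(2, "x"), (10, "x"), (20, "y")],
            window=3)
```

For simplicity the sketch waits for both producers and sorts by timestamp before joining, which makes the result deterministic. A true streaming join would consume tuples as they arrive and would need a policy for out-of-order arrivals.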

Installation to Production of a Large-Scale General Purpose Graphics Processing Unit (GPGPU) Cluster at the U.S. Army Research Laboratory: Thufir

10.21236/ada610234 ◽

2014 ◽

Author(s):

Brian J. Henz ◽

John Lazorisak ◽

Jaroslaw Knap ◽

Jason Livingston ◽

Dale R. Shires

Keyword(s):

Research Laboratory ◽

Large Scale ◽

General Purpose ◽

Processing Unit ◽

Army Research Laboratory ◽

Graphics Processing ◽

The U.S

ClusterSheep: A Graphics Processing Unit-Accelerated Software Tool for Large-Scale Clustering of Tandem Mass Spectra from Shotgun Proteomics

Journal of Proteome Research ◽

10.1021/acs.jproteome.1c00485 ◽

2021 ◽

Author(s):

Paul Ka Po To ◽

Long Wu ◽

Chak Ming Chan ◽

Ayman Hoque ◽

Henry Lam

Keyword(s):

Mass Spectra ◽

Large Scale ◽

Software Tool ◽

Shotgun Proteomics ◽

Processing Unit ◽

Tandem Mass ◽

Tandem Mass Spectra ◽

An Interval Type 2 Fuzzy Logic Framework for Faster Evolutionary Design

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8576 ◽

2019 ◽

Vol 16 (12) ◽

pp. 5140-5148

Author(s):

Sarabjeet Singh ◽

Satvir Singh ◽

Vijay Kumar Banga

Keyword(s):

Fuzzy Logic ◽

Noisy Data ◽

Parallel Execution ◽

Rule Base ◽

Processing Unit ◽

Data Set ◽

Interval Type ◽

In this paper, a fast and efficient framework is proposed to obtain an optimum output from a noisy data set of a system by using an interval type-2 fuzzy logic system. Further, the concept of GPGPU (General Purpose Computing on Graphics Processing Unit) is used for fast execution of the fuzzy rule base on a Graphics Processing Unit (GPU). The Whale Optimization Algorithm (WOA) is used to ascertain the optimum output from the noisy data set; it is integrated with the Interval Type-2 (IT2) fuzzy logic system and executed on the GPU for faster execution. The proposed framework is also designed for parallel execution using the GPU, and the results are compared with serial program execution. It is clearly observed that the rule base evolved under parallel execution provides better accuracy in less time. The proposed framework (IT2 FLS) has been validated with the classical benchmark problem of the Mackey-Glass time series. Non-stationary time-series data with additive Gaussian noise has been processed with the proposed framework and with a T1 FLS. It is observed that the IT2 FLS provides a better rule base for the noisy data set.