DETECTION OF BEHAVIOR PATTERNS OF INTEREST USING BIG DATA WHICH HAVE SPATIAL AND TEMPORAL ATTRIBUTES

Author(s):  
R. W. La Valley ◽  
A. Usher ◽  
A. Cook

New innovative analytical techniques are emerging to extract patterns from Big Data that have temporal and geospatial attributes. These techniques are required to find patterns of interest in challenging circumstances: geospatial datasets with millions or billions of records, and imprecision around the exact latitude and longitude of the data. Furthermore, the usual temporal vector approach of years, months, days, hours, minutes, and seconds is often computationally expensive and in many cases does not give the user the control over precision necessary to find patterns of interest.

Geohashing is a single-variable ASCII string representation of two-dimensional geometric coordinates. Time hashing is a similar ASCII representation that combines the date and time of the data into a one-dimensional set of data attributes. Both methods utilize Z-order curves, which map multidimensional data into a single dimension while preserving the locality of the data records. This paper explores the use of a combination of geohashing and time hashing known as “geo-temporal” hashing or “space-time” boxes. This technique provides a foundation for reducing the data into bins that can yield new methods for pattern discovery and detection in Big Data.
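The Z-order (bit-interleaving) idea behind these hashes can be sketched as follows. This is a minimal illustration, not the authors' encoding: the function name `spacetime_box`, the 20-bit precision, and the reduction of time to a position within the day are all assumptions made for the example.

```python
def spacetime_box(lat, lon, t_seconds, bits=20):
    """Quantize latitude, longitude, and time, then interleave their bits
    (a Z-order curve) so that nearby points in space and time share a
    common key prefix. Coarser bins are obtained by right-shifting the key."""
    # Normalize each dimension to [0, 1). Reducing time to its position
    # within the day is a simplification for illustration only.
    dims = [
        (lat + 90.0) / 180.0,
        (lon + 180.0) / 360.0,
        (t_seconds % 86400) / 86400.0,
    ]
    cells = [int(d * (1 << bits)) for d in dims]
    # Interleave: one bit from each dimension per round, most significant first.
    key = 0
    for i in reversed(range(bits)):
        for c in cells:
            key = (key << 1) | ((c >> i) & 1)
    return key
```

Because locality is preserved, two events close in space and time land in the same coarse bin: right-shifting the key by a multiple of three (one bit per dimension) widens the space-time box.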

2018 ◽  
Vol 38 (6) ◽  
Author(s):  
Giuseppe Grasso

Despite the enormous number of therapeutic advances in medicine, many diseases are still incurable, mainly due to a lack of knowledge of the pathological biochemical pathways that trigger them. For this reason, it is imperative for the scientific community to investigate and unveil the biomolecular mechanisms responsible for the development of diseases such as Alzheimer’s disease and diabetes, which are widespread all over the world. In this scenario, it is of paramount importance to develop new analytical techniques and experimental procedures that make the above-mentioned investigations feasible. These new methods should allow easily performable, label-free analyses that give reliable answers to specific biochemical questions. A recent paper published in Bioscience Reports by Ivancic et al. (https://doi.org/10.1042/BSR20181416) proposes a new analytical technique capable of revealing mechanistic insights into the regulation of insulin-degrading enzyme (IDE), a protein involved in the above-mentioned diseases. IDE is a multifaceted enzyme with different and not well-defined roles in the cell, but it is primarily a proteolytic enzyme capable of degrading several different amyloidogenic substrates involved in different diseases. Moreover, many molecules modulate IDE activity, so understanding how IDE activity is regulated represents a very challenging analytical task. The new analytical approach proposed by Ivancic et al. makes it possible to study IDE activity in an unbiased and label-free manner, representing a valid alternative assay for investigating the degradative activity of any protease.


Author(s):  
Dafydd Evans

Mutual information quantifies the determinism that exists in a relationship between random variables, and thus plays an important role in exploratory data analysis. We investigate a class of non-parametric estimators for mutual information, based on the nearest neighbour structure of observations in both the joint and marginal spaces. Unless both marginal spaces are one-dimensional, we demonstrate that a well-known estimator of this type can be computationally expensive under certain conditions, and propose a computationally efficient alternative that has a time complexity of order O(N log N) as the number of observations N → ∞.
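The family of estimators discussed here can be illustrated with the well-known Kraskov–Stögbauer–Grassberger (KSG) nearest-neighbour estimator. The sketch below is a straightforward KDTree-based implementation for illustration, not the authors' O(N log N) algorithm; the function name and the choice k=3 are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, k=3):
    """KSG estimate of mutual information (in nats) between samples x and y.

    Uses the max-norm distance to the k-th nearest neighbour in the joint
    space, then counts marginal neighbours strictly inside that radius."""
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    n = len(x)
    joint = np.hstack([x, y])
    # Distance to the k-th neighbour in the joint space (k+1 includes self).
    eps = cKDTree(joint).query(joint, k=k + 1, p=np.inf)[0][:, -1]
    tree_x, tree_y = cKDTree(x), cKDTree(y)
    # Subtract a tiny amount to approximate a strict "< eps" count;
    # subtract 1 to exclude the point itself.
    nx = np.array([len(tree_x.query_ball_point(x[i], eps[i] - 1e-12, p=np.inf)) - 1
                   for i in range(n)])
    ny = np.array([len(tree_y.query_ball_point(y[i], eps[i] - 1e-12, p=np.inf)) - 1
                   for i in range(n)])
    mi = digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))
    return max(mi, 0.0)
```

On dependent data the estimate is well above zero, while on independent data it stays near zero; the marginal neighbour counts are what make the naive version costly in higher-dimensional marginal spaces.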


Web Services ◽  
2019 ◽  
pp. 618-638
Author(s):  
Goran Klepac ◽  
Kristi L. Berg

This chapter proposes a new analytical approach that consolidates the traditional analytical approach for solving problems such as churn detection, fraud detection, predictive model building, and segmentation modeling with data sources and analytical techniques from the big data area. The solutions presented offer a structured approach for the integration of different concepts into one, which helps analysts as well as managers use the potential of different areas in a systematic way. By using this concept, companies have the opportunity to introduce big data potential into everyday data mining projects. As the chapter shows, neglecting big data potential often results in incomplete analytical findings, which imply incomplete information for business decisions and can lead to bad business decisions. The chapter also provides suggestions on how to recognize useful data sources from the big data area and how to analyze them along with traditional data sources to obtain higher-quality information for business decisions.


Data and analytics are the heart of a digital business platform. Today, big data (BD) becomes useful when it enriches decision making that is enhanced by the application of analytical techniques and some element of human interaction. With the merging of data and information with knowledge and intelligence, this chapter investigates an opportunity for cross-fertilization between BD and the field of digital business and its related disciplines. A BD and analytics platform is, primarily, a set of business capabilities. This chapter aims to investigate the potential relationship between a BD and analytics platform and a digital business platform. In doing so, it develops a BD value chain framework and a BD business model pattern (BDBMP) with related levels of BD maturity improvement. This framework could be used to answer the basic questions about the relationship between BD and digital business.


2020 ◽  
Vol 1639 ◽  
pp. 012018
Author(s):  
Jieli Sun ◽  
Shuangsi Li ◽  
Naishi Yan ◽  
Yanxia Zhao ◽  
Jianke Li

2010 ◽  
Vol 22 (8) ◽  
pp. 2208-2227 ◽  
Author(s):  
Intae Lee

While the sample-spacings-based density estimation method is simple and efficient, its applicability has been restricted to one-dimensional data. In this letter, the method is generalized such that it can be extended to multiple dimensions in certain circumstances. As a consequence, a multidimensional entropy estimator of spherically invariant continuous random variables is derived. Partial bias of the estimator is analyzed, and the estimator is further used to derive a nonparametric objective function for frequency-domain independent component analysis. The robustness and the effectiveness of the objective function are demonstrated with simulation results.
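For the one-dimensional case the letter builds on, the classic m-spacings approach (Vasicek's estimator) can be sketched as follows. This is an illustrative implementation of the standard one-dimensional method, not the letter's multidimensional generalization; the function name and the m ≈ √N default are assumptions.

```python
import numpy as np

def spacings_entropy(x, m=None):
    """Vasicek m-spacing entropy estimate (in nats) for a 1-D sample.

    Sorts the sample and averages log((N / 2m) * (X_(i+m) - X_(i-m))),
    clamping out-of-range order statistics to the sample extremes."""
    x = np.sort(np.asarray(x, float))
    n = len(x)
    if m is None:
        m = max(1, int(round(np.sqrt(n))))
    lo = np.clip(np.arange(n) - m, 0, n - 1)
    hi = np.clip(np.arange(n) + m, 0, n - 1)
    spacings = np.maximum(x[hi] - x[lo], 1e-12)  # guard against ties
    return np.mean(np.log(n * spacings / (2 * m)))
```

Scaling the sample by a factor c shifts the estimate by exactly log c, matching the behaviour of differential entropy; this is one reason the spacings method is simple and efficient in one dimension.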


2020 ◽  
Vol 34 (5) ◽  
pp. 599-612 ◽  
Author(s):  
Ryan L. Boyd ◽  
Paola Pasca ◽  
Kevin Lanning

Personality psychology has long been grounded in data typologies, particularly in the delineation of behavioural, life outcome, informant–report, and self–report sources of data from one another. Such data typologies are becoming obsolete in the face of new methods, technologies, and data philosophies. In this article, we discuss personality psychology's historical thinking about data, modern data theory's place in personality psychology, and several qualities of big data that urge a rethinking of personality itself. We call for a move away from self–report questionnaires and a reprioritization of the study of behaviour within personality science. With big data and behavioural assessment, we have the potential to witness the confluence of situated, seamlessly interacting psychological processes, forming an inclusive, dynamic, multiangle view of personality. However, big behavioural data come hand in hand with important ethical considerations, and our emerging ability to create a ‘personality panopticon’ requires careful and thoughtful navigation. For our research to improve and thrive in partnership with new technologies, we must not only wield our new tools thoughtfully, but humanely. Through discourse and collaboration with other disciplines and the general public, we can foster mutual growth and ensure that humanity's burgeoning technological capabilities serve, rather than control, the public interest. © 2020 European Association of Personality Psychology


2013 ◽  
Vol 1 (1) ◽  
pp. 7 ◽  
Author(s):  
Casimiro S. Munita ◽  
Lúcia P. Barroso ◽  
Paulo M.S. Oliveira

Several analytical techniques are often used in archaeometric studies, and when used in combination, they can assess 30 or more elements. Multivariate statistical methods are frequently used to interpret archaeometric data, but their application can be problematic or difficult due to the large number of variables. In general, the analyst first measures several variables, many of which may turn out to be uninformative; this is naturally very time consuming and expensive. In subsequent studies the analyst may wish to measure fewer variables while minimizing the loss of essential information. Such multidimensional data sets must be closely examined to draw useful information. This paper aims to describe and illustrate a stopping rule for the identification of redundant variables and the selection of variable subsets that preserve the multivariate data structure, using Procrustes analysis to select those variables that are in some sense adequate for discrimination purposes. We provide an illustrative example of the procedure using a data set of 40 archaeological ceramic samples in which the concentrations of As, Ce, Cr, Eu, Fe, Hf, La, Na, Nd, Sc, Sm, Th, and U were determined via instrumental neutron activation analysis (INAA). The results showed that, for this data set, only eight variables (As, Cr, Fe, Hf, La, Nd, Sm, and Th) are required to interpret the data without substantial loss of information.
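The general idea of Procrustes-based variable selection can be sketched as follows: compare the principal-component configuration of the full variable set with that of a candidate subset, and repeatedly drop the variable whose removal perturbs the configuration least. This is a hedged sketch of the approach, not the paper's exact stopping rule; the function names, backward-elimination strategy, and default parameters are assumptions.

```python
import numpy as np
from scipy.spatial import procrustes

def pca_scores(X, k=2):
    """First k principal-component scores of a column-standardized matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k]

def backward_select(X, names, keep=4, k=2):
    """Backward elimination: at each step drop the variable whose removal
    yields the smallest Procrustes disparity from the full-data
    k-dimensional PCA configuration."""
    idx = list(range(X.shape[1]))
    full = pca_scores(X, k)
    while len(idx) > keep:
        candidates = []
        for j in idx:
            subset = [c for c in idx if c != j]
            _, _, disparity = procrustes(full, pca_scores(X[:, subset], k))
            candidates.append((disparity, j))
        _, drop = min(candidates)  # removing this variable changes least
        idx.remove(drop)
    return [names[j] for j in idx]
```

A natural stopping rule in this spirit is to halt when the disparity of the best candidate subset exceeds a chosen threshold, rather than fixing the subset size in advance.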

