Information graph-based creation of parallel queries for databases

Author(s):  
Yulia Shichkina ◽  
Dmitry Gushchanskiy ◽  
Alexander Degtyarev


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Sumedh Yadav ◽  
Mathis Bode

Abstract: A scalable graphical method is presented for selecting and partitioning datasets for the training phase of a classification task. The heuristic requires a clustering algorithm to keep its computation cost in reasonable proportion to the task itself. This step is followed by construction of an information graph of the underlying classification patterns using approximate nearest neighbor methods. The presented method consists of two approaches, one for reducing a given training set and another for partitioning the selected/reduced set. The heuristic targets large datasets, since the primary goal is a significant reduction in training run-time without compromising prediction accuracy. Test results show that both approaches significantly speed up the training task compared with the state-of-the-art shrinking heuristics available in LIBSVM. Furthermore, the approaches closely follow or even outperform those heuristics in prediction accuracy. A network design is also presented for a partitioning-based distributed training formulation. Additional speed-up in training run-time is observed compared with the serial implementation of the approaches.
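The graph-based reduction idea can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: exact nearest neighbor search stands in for the approximate methods the abstract mentions, the clustering step is omitted, and the selection rule (keep a point if any of its neighbors has a different class label) is a hypothetical criterion chosen only for illustration. The names `knn_graph` and `boundary_subset` are likewise invented here.

```python
from math import dist


def knn_graph(points, k):
    """Exact k-nearest-neighbor graph: index -> indices of its k nearest points.

    Assumption: a brute-force search replaces the approximate
    nearest neighbor methods named in the abstract.
    """
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph


def boundary_subset(points, labels, k):
    """Return indices of points with at least one differently labeled neighbor.

    Assumption: this selection rule is illustrative, not the paper's heuristic.
    """
    graph = knn_graph(points, k)
    return [
        i
        for i, neighbors in graph.items()
        if any(labels[j] != labels[i] for j in neighbors)
    ]


points = [(0.0,), (1.0,), (2.9,), (3.1,), (5.0,), (6.0,)]
labels = [0, 0, 0, 1, 1, 1]
reduced = boundary_subset(points, labels, k=1)
```

In this toy example only the two points that face each other across the class boundary are retained, which reflects the general intuition that points far from the boundary contribute little to a margin-based classifier.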


1992 ◽  
Vol 57 (2) ◽  
pp. 677-681 ◽  
Author(s):  
Martin Kummer

In 1986, Beigel [Be87] (see also [Od89, III.5.9]) proved the nonspeedup theorem: if A, B ⊆ ω, and as a function of 2^n variables can be computed by an algorithm which makes at most n queries to B, then A is recursive (informally, 2^n parallel queries to a nonrecursive oracle A cannot be answered by making n sequential (or “adaptive”) queries to an arbitrary oracle B). Here, 2^n cannot be replaced by 2^n − 1. In subsequent papers of Beigel, Gasarch, Gill, Hay, and Owings the theory of “bounded query classes” has been further developed (see, for example, [BGGOta], [BGH89], and [Ow89]). The topic has also been studied in the context of structural complexity theory (see, for example, [AG88], [Be90], and [JY90]). If A ⊆ ω and n ≥ 1, let . Beigel [Be87] stated the powerful “cardinality conjecture” (CC): if A, B ⊆ ω, and can be computed by an algorithm which makes at most n queries to B, then A is recursive. Owings [Ow89] verified CC for n = 1, and, for n > 1, he proved that A is recursive in the halting problem. We prove that CC is true for all n.


2020 ◽  

This study aimed to examine the brain signals of children with Autism Spectrum Disorder (ASD) and to use a method based on the concept of complementary opposites to obtain the prominent features, or a pattern, of the EEG signal that represents the biological characteristics of such children. In this study, 20 children with a mean±SD age of 8±5 years were divided into two groups: normal control (NC) and ASD. The diagnosis and approval of individuals in both groups were carried out by two experts in the field of pediatric psychiatry and neurology. The recording protocol was designed for maximum accuracy, so the brain signals were recorded with minimal noise while the individuals in both groups were awake. Moreover, the recording was conducted in three stages from two EEG channels (C3-C4, referred to as the central part of the brain), which were symmetrical in function. The Mandala method, based on the concept of complementary opposites, was adopted to investigate the features extracted from the Mandala pattern topology and to obtain new features and pseudo-patterns for the screening and early diagnosis of ASD. The optimal feature here was based on different stages of processing and statistical analysis of Pattern Detection Capability (PDC). The PDC is a biomarker derived from the Mandala pattern for differentiating the NC group from the ASD group.


Author(s):  
Wei Yan

Parallel k nearest neighbor queries over massive spatial data are an important problem. The k nearest neighbor query (kNN query), designed to find the k nearest neighbors in a dataset S for every point in another dataset R, is a useful tool widely adopted by many applications, including knowledge discovery, data mining, and spatial databases. In cloud computing environments, the MapReduce programming model is a well-accepted framework for data-intensive applications over clusters of computers. This chapter proposes a parallel method for kNN queries based on clusters in the MapReduce programming model. First, the chapter proposes a method for partitioning spatial data using a Voronoi diagram. Then, it clusters the data points after partitioning using the k-means method. Furthermore, it proposes an efficient algorithm for processing kNN queries based on k-means clusters using the MapReduce programming model. Finally, extensive experiments evaluate the efficiency of the proposed approach.
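The partition-then-merge structure of such a query can be sketched in plain Python, as a single-process stand-in for a MapReduce job. This is a simplified illustration under stated assumptions, not the chapter's algorithm: the pivots are supplied by hand, the k-means clustering step and any pruning of distant cells are omitted, and the function names (`voronoi_partition`, `local_knn`, `knn_join`) are hypothetical. Because every point of S lands in exactly one cell and all cells are searched, the merged result is the exact kNN answer.

```python
import heapq
from collections import defaultdict
from math import dist


def voronoi_partition(points, pivots):
    """Map step: assign each point to the cell of its nearest pivot."""
    cells = defaultdict(list)
    for p in points:
        cell_id = min(range(len(pivots)), key=lambda i: dist(p, pivots[i]))
        cells[cell_id].append(p)
    return cells


def local_knn(r, cell_points, k):
    """Per-cell step: the k points of one cell closest to r, as (distance, point)."""
    return heapq.nsmallest(k, ((dist(r, s), s) for s in cell_points))


def knn_join(R, S, pivots, k):
    """Reduce step: merge per-cell candidates into a global top-k for each r.

    Assumption: every cell is searched for every r; a real implementation
    would prune cells that cannot contain a nearer neighbor.
    """
    cells = voronoi_partition(S, pivots)
    result = {}
    for r in R:
        candidates = []
        for cell_points in cells.values():
            candidates.extend(local_knn(r, cell_points, k))
        result[r] = [s for _, s in heapq.nsmallest(k, candidates)]
    return result


R = [(0.0, 0.0)]
S = [(1.0, 0.0), (5.0, 5.0), (0.0, 2.0), (6.0, 5.0)]
pivots = [(0.0, 0.0), (5.0, 5.0)]
neighbors = knn_join(R, S, pivots, k=2)
```

Each cell's local search is independent of the others, which is what makes the per-cell step a natural unit of parallel work; only the final merge needs to see candidates from more than one cell.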


1996 ◽  
Vol 25 (2) ◽  
pp. 365-376 ◽  
Author(s):  
Minos N. Garofalakis ◽  
Yannis E. Ioannidis

1987 ◽  
Vol 41 (1) ◽  
pp. 51-64 ◽  
Author(s):  
J.A.R. Blais

In general, land information includes all information that is related to the land and its resources. Among the necessary considerations in the design and development of a land information system, the topological aspects are fundamental as they refer to the interconnectivity of the information. Graph and information theoretic considerations, based on the natural topology of the information, are also required for system analysis, optimization and other purposes. Some practical aspects of these considerations are briefly discussed with suggestions for further studies and investigations.

