Efficient Processing of Queries using Filtered Bitmap Index with Multi-Join Multiple Set Predicates

Efficient query processing is an essential task in the many environments that handle large amounts of data. Performance degrades as the volume of data grows, and it degrades further as the number of joins in a query increases. These problems emphasize the need for a good query-processing approach. In this paper, we therefore take a different approach to optimizing multi-join queries with multiple set predicates in a data warehousing environment. We propose an efficient algorithm, the Filtered Bitmap Index with multi-join multiple set predicates processing approach, and examine its time complexity on a huge data set with multiple tables. In this approach, a multi-join query is processed by ordering its tables by level number, from lower to higher. A simple rewritten query is created from the given complex query using the lowest-level table and executed. Only if the result is non-empty does join processing continue: the rewritten query takes in the next lowest-level table from the complex query and is executed again. The efficiency of our technique is demonstrated through experiments using the WorldCup98 and TPC-H benchmark datasets.
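To make the control flow concrete, here is a minimal sketch in Python over SQLite, assuming all tables share a single join key and abstracting the bitmap filtering into plain key-set intersection; the table ordering, predicate map, and helper signature are illustrative assumptions, not the paper's actual algorithm:

```python
# A minimal sketch of level-ordered multi-join processing with early
# termination; an equi-join on one shared key is simplified here to a
# key-set intersection (semijoin), which is NOT the paper's bitmap method.
import sqlite3

def process_multi_join(conn, tables_by_level, join_key, set_predicates):
    """Join tables from the lowest level upward, stopping as soon as an
    intermediate result is empty (the early-termination rule above)."""
    cur = conn.cursor()
    # Start with a rewritten query over the lowest-level table only.
    lowest = tables_by_level[0]
    pred = set_predicates.get(lowest, "1=1")
    rows = {r[0] for r in
            cur.execute(f"SELECT {join_key} FROM {lowest} WHERE {pred}")}
    if not rows:
        return set()          # empty result: skip all remaining joins
    # Take in the next lowest-level table, one level at a time.
    for table in tables_by_level[1:]:
        pred = set_predicates.get(table, "1=1")
        rows &= {r[0] for r in
                 cur.execute(f"SELECT {join_key} FROM {table} WHERE {pred}")}
        if not rows:
            return set()      # early termination on an empty intermediate
    return rows
```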

2013 ◽  
Vol 411-414 ◽  
pp. 1076-1080
Author(s):  
Jing Hua Wang ◽  
Xin Xiang Zhao ◽  
Peng Jin ◽  
Guo Yan Zhang

An Optimized Pruning-based Outlier Detecting algorithm is proposed, based on the density-based outlier detection algorithm LOF. Since neither the accuracy nor the time complexity of the LOF algorithm is ideal, two steps are taken to reduce the amount of computation and improve the accuracy. First, a cluster-pruning technique preprocesses the data set while filtering out non-outliers: multiple parameter settings are given to the DBSCAN algorithm to produce different cluster models, and the differences between these models are used to avoid erroneously pruning outliers located at the edges of clusters. Second, the neighborhood query process (for neighbors and k-neighbors) is optimized. After pruning, local outlier factors are calculated only for the data objects outside the clusters. Experimental results show that the proposed algorithm improves outlier detection accuracy, reduces time complexity, and achieves effective local outlier detection.
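The two-step scheme can be illustrated with a short, hedged sketch using scikit-learn on synthetic data; the DBSCAN parameter grid and LOF settings below are illustrative choices, not the paper's tuned values:

```python
# Sketch: prune points that every DBSCAN model places inside a cluster,
# then score only the remaining candidates with LOF.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)),     # dense cluster
               rng.normal(8, 1, (200, 2)),     # second cluster
               rng.uniform(-6, 14, (10, 2))])  # sparse outliers

# Step 1: cluster pruning. Keep as "safe" only points that ALL models put
# inside a cluster, so edge points captured by only some models survive.
inside_all = np.ones(len(X), dtype=bool)
for eps in (0.5, 0.8, 1.2):                    # multiple input parameters
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    inside_all &= labels != -1                 # -1 marks DBSCAN noise

candidates = np.where(~inside_all)[0]          # only these need LOF scores

# Step 2: local outlier factors for the unpruned objects only.
lof = LocalOutlierFactor(n_neighbors=20)
lof.fit(X)                                     # neighborhoods use all data
scores = -lof.negative_outlier_factor_[candidates]
top = candidates[np.argsort(scores)[::-1][:10]]
print("candidate outliers:", top)
```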


Author(s):  
Asaduzzaman Nur Shuvo ◽  
Apurba Adhikary ◽  
Md. Bipul Hossain ◽  
Sultana Jahan Soheli

Data sets in large applications are often too gigantic to fit completely inside the computer's internal memory. The resulting input/output (I/O) communication between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. When sorting such huge data sets, external sorting is essential. This paper is concerned with a new in-place external sorting algorithm. Our proposed algorithm combines the Quick-Sort and divide-and-conquer approaches, resulting in a faster sorting algorithm that avoids any additional disk space. In addition, we show that the average time complexity can be reduced compared to existing external sorting approaches.
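A minimal sketch of the idea, assuming fixed-size numeric records and numpy.memmap standing in for the disk file; the memory threshold and pivot choice are illustrative, not the paper's exact algorithm:

```python
# Sketch: in-place external quicksort. Partitions that fit in RAM are
# sorted internally; larger ones are partitioned on "disk" in place
# (Hoare-style two-pointer scan), then divided and conquered.
import numpy as np

MEM_LIMIT = 1_000_000   # records that fit in internal memory at once

def external_quicksort(arr, lo, hi):
    """Sort arr[lo:hi] (a memmap) using no additional disk space."""
    n = hi - lo
    if n <= 1:
        return
    if n <= MEM_LIMIT:
        chunk = np.array(arr[lo:hi])   # load partition into RAM
        chunk.sort(kind="quicksort")   # internal quicksort
        arr[lo:hi] = chunk             # write back in place
        return
    pivot = arr[(lo + hi) // 2]        # scalar copy of the pivot value
    i, j = lo, hi - 1
    while i <= j:
        while arr[i] < pivot:
            i += 1
        while arr[j] > pivot:
            j -= 1
        if i <= j:
            arr[i], arr[j] = arr[j], arr[i]
            i += 1
            j -= 1
    external_quicksort(arr, lo, j + 1)
    external_quicksort(arr, i, hi)

# Usage: data = np.memmap("big.bin", dtype=np.int64, mode="r+")
#        external_quicksort(data, 0, len(data))
```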


2018 ◽  
Vol 10 (8) ◽  
pp. 80
Author(s):  
Lei Zhang ◽  
Xiaoli Zhi

Convolutional neural networks (CNNs for short) have made great progress in face detection. They mostly take computation-intensive networks as the backbone in order to obtain high precision, and they cannot achieve a good detection speed without the support of high-performance GPUs (Graphics Processing Units). This limits CNN-based face detection algorithms in real applications, especially speed-dependent ones. To alleviate this problem, we propose a lightweight face detector that takes a fast residual network as its backbone. Our method runs fast even on cheap, ordinary GPUs. To preserve detection precision, multi-scale features and multi-context information are fully exploited in efficient ways. Specifically, feature fusion is first used to obtain semantically strong multi-scale features. Then multi-context information, both local and global, is added to these multi-scale features without extra computational burden: the local context through a depthwise separable convolution based approach, and the global context through simple global average pooling. Experimental results show that our method runs at about 110 fps on VGA (Video Graphics Array) resolution images while maintaining competitive precision on the WIDER FACE and FDDB (Face Detection Data Set and Benchmark) datasets compared with its state-of-the-art counterparts.
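The two context paths can be sketched in a few lines of PyTorch; the channel count and kernel sizes below are illustrative assumptions, not the paper's exact configuration:

```python
# Sketch: cheap local context via a depthwise separable convolution plus
# global context via global average pooling, added to a fused feature map.
import torch
import torch.nn as nn

class ContextEnhance(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Local context: depthwise 3x3 followed by pointwise 1x1 keeps the
        # extra computation far below a full 3x3 convolution.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, 1),
        )
        # Global context: one per-channel descriptor for the whole image.
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        local = self.local(x)
        global_ctx = self.pool(x)      # shape (N, C, 1, 1)
        return local + global_ctx      # broadcast global context over H, W

feat = torch.randn(1, 64, 40, 40)      # one fused multi-scale feature map
out = ContextEnhance(64)(feat)
print(out.shape)                       # torch.Size([1, 64, 40, 40])
```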


2021 ◽  
pp. 155005942110608
Author(s):  
Jakša Vukojević ◽  
Damir Mulc ◽  
Ivana Kinder ◽  
Eda Jovičić ◽  
Krešimir Friganović ◽  
...  

In everyday clinical practice, there is an ongoing debate about the nature of major depressive disorder (MDD) in patients with borderline personality disorder (BPD). The existing research does not give us a clear distinction between these 2 entities, although depression is among the most frequent comorbid diagnoses in borderline personality patients. The notion that depression can be a distinct disorder but also a symptom in other psychopathologies led our team to try to delineate these 2 entities using 146 EEG recordings and machine learning. The algorithms, developed solely for this purpose, could not differentiate the 2 entities, meaning that, given the data and methods used, patients suffering from MDD did not have significantly different EEGs from patients diagnosed with both MDD and BPD. By increasing the data set and the spatiotemporal specificity, one could obtain a more sensitive diagnostic approach when using EEG recordings. To our knowledge, this is the first study that used EEG recordings and advanced machine learning techniques for this question, and it further confirms the close interrelationship between these 2 entities.


Author(s):  
Emrah Inan ◽  
Vahab Mostafapour ◽  
Fatif Tekbacak

The Web makes it possible to retrieve concise information about specific entities, including people, organizations, and movies, together with their features. However, a large share of Web resources is unstructured, which makes it hard to find critical information about specific entities. Text analysis approaches such as Named Entity Recognition and Entity Linking aim to identify entities and link them to relevant entries in a given knowledge base. To evaluate these approaches, there is a vast number of general-purpose benchmark datasets. However, it is difficult to evaluate domain-specific approaches due to the lack of evaluation datasets for specific domains. This study presents WeDGeM, a multilingual evaluation-set generator for specific domains that exploits Wikipedia category pages and the DBpedia hierarchy. Wikipedia disambiguation pages are also used to adjust the ambiguity level of the generated texts. Based on this generated test data, well-known Entity Linking systems supporting Turkish texts are evaluated in the movie domain.
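The generation idea can be sketched against the public DBpedia SPARQL endpoint; the category chosen below (Turkish films) and the result limit are illustrative assumptions, and the query requires network access:

```python
# Sketch: pull domain entities from a Wikipedia category via DBpedia, the
# same kind of raw material WeDGeM builds its evaluation sets from.
# Assumes the SPARQLWrapper package and the public DBpedia endpoint.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    SELECT ?film ?label WHERE {
      ?film dct:subject dbc:Turkish_films .   # illustrative movie category
      ?film rdfs:label ?label .
      FILTER (lang(?label) = "tr")            # Turkish surface forms
    } LIMIT 50
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["film"]["value"], "->", row["label"]["value"])
```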


2021 ◽  
Vol 129 (10) ◽  
pp. 1336
Author(s):  
Sonali Dubey ◽  
Rohit Kumar ◽  
Abhishek K. Rai ◽  
Awadhesh K. Rai

Laser-induced breakdown spectroscopy (LIBS) is emerging as an analytical tool for investigating geological materials. The unique abilities of this technique have proven its potential in the area of geology. Detection of light elements, portability for in-field analysis, spot detection, and the absence of sample preparation are features that make the technique appropriate for studying geological materials. Applications of the LIBS technique have developed tremendously in recent years. In this report, results from previous and recent studies investigating geological materials with the LIBS technique are reviewed. We first cover work reporting advances in LIBS instrumentation and its applications, especially in gemology and extraterrestrial/planetary exploration. The investigation of gemstones by LIBS has not been widely reviewed in the past compared with LIBS applications in planetary exploration or other geological settings. It is anticipated that large data sets will be appropriate for the classification of gemstone samples, and that multivariate/chemometric methods will be useful for analyzing them. Recent advances in LIBS instrumentation for the study of meteorites and for depth penetration into Martian rocks and regolith have demonstrated the feasibility of LIBS instruments mounted on robotic vehicles in the Martian environment. Keywords: LIBS, gemstones, geological samples, extraterrestrial
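As a hedged illustration of the chemometric route anticipated above, the following sketch compresses synthetic stand-in spectra with PCA and classifies them; it is not a reproduction of any reviewed study:

```python
# Sketch: PCA + linear discriminant classification of LIBS-like spectra.
# The two synthetic "gemstone" classes differ in their emission-line pattern.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_per_class, n_channels = 40, 2048
lines_a = rng.choice(n_channels, 15)     # emission lines of class A
lines_b = rng.choice(n_channels, 15)     # emission lines of class B

def spectra(lines):
    X = rng.normal(0, 0.05, (n_per_class, n_channels))      # baseline noise
    X[:, lines] += rng.uniform(0.5, 1.0, (n_per_class, len(lines)))
    return X

X = np.vstack([spectra(lines_a), spectra(lines_b)])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Multivariate pipeline: compress 2048 channels to 10 components, classify.
model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```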


Author(s):  
M. Weinmann ◽  
M. Weinmann

In this paper, we address the semantic interpretation of urban environments on the basis of multi-modal data in the form of RGB color imagery, hyperspectral data and LiDAR data acquired from aerial sensor platforms. We extract radiometric features based on the given RGB color imagery and the given hyperspectral data, and we also consider different transformations to potentially better data representations. For the RGB color imagery, these are achieved via color invariants, normalization procedures or specific assumptions about the scene. For the hyperspectral data, we involve techniques for dimensionality reduction and feature selection as well as a transformation to multispectral Sentinel-2-like data of the same spatial resolution. Furthermore, we extract geometric features describing the local 3D structure from the given LiDAR data. The defined feature sets are provided separately and in different combinations as input to a Random Forest classifier. To assess the potential of the different feature sets and their combination, we present results achieved for the MUUFL Gulfport Hyperspectral and LiDAR Airborne Data Set.
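The evaluation setup can be sketched as follows, with random arrays standing in for the extracted radiometric, hyperspectral and geometric features; the feature dimensions and class count are illustrative assumptions:

```python
# Sketch: concatenate per-pixel feature sets and feed them to a Random
# Forest, as in the feature-combination experiments described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 5000
rgb_feats = rng.random((n, 6))      # stand-in: color invariants from RGB
hsi_feats = rng.random((n, 20))     # stand-in: reduced hyperspectral bands
lidar_feats = rng.random((n, 8))    # stand-in: local 3D structure features
y = rng.integers(0, 11, n)          # 11 illustrative land-cover classes

# Feature sets can be supplied separately or in any combination:
X = np.hstack([rgb_feats, hsi_feats, lidar_feats])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print("overall accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```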


Author(s):  
Marek Jetmar ◽  
Jan Kubát

The article deals with the application of data envelopment analysis (DEA) to examining the efficiency of selected public services provided by municipalities and cities. The method calculates indicators for individual municipalities and for groups of municipalities. Efficiency is computed with a DEA model with variable returns to scale and super-efficiency. The distance from the efficiency frontier (the data envelope) is measured not by the Euclidean distance of classical DEA models but by the Chebyshev distance. The analysis examines efficiency within groups of municipalities, defined by number of inhabitants and by location relative to development centers, and also places these groups in the context of the entire data set. The resulting model makes it possible to calculate the efficiency of each municipality and to track its ranking within a given category, municipality type, administrative district, or region. The article then presents practical results of the efficiency calculation, namely the average values achieved, using schools and the municipal police as examples. The variability of the results is interpreted with respect to the services examined. Finally, the limits of DEA are discussed with regard to the quality of the available data and the overall appropriateness of the method for monitoring the efficiency of municipalities.
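For orientation, here is a hedged sketch of a standard input-oriented DEA model with variable returns to scale, solved with scipy's linprog; the article's Chebyshev-distance measure and super-efficiency refinement are not reproduced here:

```python
# Sketch: input-oriented VRS DEA (BCC model). For unit k, minimise theta
# subject to: sum_j lambda_j * x_j <= theta * x_k (inputs),
#             sum_j lambda_j * y_j >= y_k        (outputs),
#             sum_j lambda_j = 1                 (VRS convexity).
import numpy as np
from scipy.optimize import linprog

def dea_vrs_efficiency(X, Y, k):
    """Efficiency of unit k. X: (n, m) inputs, Y: (n, s) outputs."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(n + 1)
    c[0] = 1.0                                   # variables: [theta, lambdas]
    A_in = np.hstack([-X[k].reshape(m, 1), X.T])   # inputs vs theta*x_k
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])    # outputs at least y_k
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(m), -Y[k]])
    A_eq = np.hstack([[0.0], np.ones(n)]).reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun    # theta in (0, 1]; 1 means the unit lies on the envelope

# Toy example: 4 municipalities, 1 input (spending), 1 output (service level).
X = np.array([[2.0], [4.0], [3.0], [5.0]])
Y = np.array([[1.0], [3.0], [2.0], [2.5]])
print([round(dea_vrs_efficiency(X, Y, k), 3) for k in range(4)])
```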

