Towards Information Discovery On Large Scale Data: state-of-the-art

Author(s):  
Rohit ◽  
Bhumika Gupta ◽  
Ravish Kumar ◽  
Aman Kumar
Author(s):  
Surabhi Kumari

Abstract: Multi-party computation (MPC) is a general cryptographic technique for performing computations while keeping the underlying data private. MPC allows a group of parties to jointly evaluate a function without revealing the plaintext inputs or outputs. Privacy-preserving voting, arithmetic calculation, and large-scale data processing are just a few of the applications of MPC. From a system perspective, each MPC party can run on a single computing node. The parties' computing nodes may be homogeneous or heterogeneous, but the distributed workloads of MPC protocols are always homogeneous (symmetric). In this paper, we investigate the system performance of a representative MPC framework and a collection of MPC applications. We describe the complete online computation workflow of a state-of-the-art MPC protocol on homogeneous and heterogeneous compute nodes and examine the fundamental causes of its stall time and performance limitations. Keywords: Cloud Computing, IoT, MPC, Amazon Service, Virtualization.
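To make the idea of computing on private inputs concrete, here is a minimal sketch of additive secret sharing, one common building block of MPC. It is not the protocol or framework studied in the paper, which the abstract does not name. The modulus `PRIME` and the function names `share`, `reconstruct`, and `secure_sum` are assumptions chosen for illustration, and all parties are simulated in a single process.

```python
import secrets

# Assumed modulus for the illustration; any sufficiently large prime works.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares; any proper subset alone reveals nothing about the value."""
    return sum(shares) % PRIME


def secure_sum(private_inputs):
    """Simulate n parties computing the sum of their inputs without revealing them."""
    n = len(private_inputs)
    # Row i holds the shares produced by party i; share j is sent to party j.
    all_shares = [share(x, n) for x in private_inputs]
    # Each party j locally adds the shares it received (symmetric workload).
    partial_sums = [sum(row[j] for row in all_shares) % PRIME for j in range(n)]
    # Only the partial sums are published and combined.
    return reconstruct(partial_sums)


if __name__ == "__main__":
    print(secure_sum([12, 30, 7]))  # 49
```

Every party performs the same local additions, which matches the abstract's remark that MPC workloads are symmetric even when the nodes themselves differ.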


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Tuozhong Yao ◽  
Wenfeng Wang ◽  
Yuhong Gu

Multiview active learning (MAL) is a technique that can achieve a larger decrease in the size of the version space than traditional active learning, and it has great potential for large-scale data analysis. In this paper, we present a new deep multiview active learning (DMAL) framework, which is the first to combine multiview active learning and deep learning to reduce annotation effort. Our approach advances existing active learning methods in two respects. First, we incorporate two different deep convolutional neural networks into active learning, using complementary multiview information to improve feature learning. Second, through the properly designed framework, the feature representation and the classifier can be updated simultaneously with progressively annotated informative samples. Experiments on two challenging image datasets demonstrate that the proposed DMAL algorithm achieves better results than several state-of-the-art active learning algorithms.


2021 ◽  
Author(s):  
Qi Zhai ◽  
Zhigang Kan ◽  
Linhui Feng ◽  
Linbo Qiao ◽  
Feng Liu

Recently, Chinese event detection has attracted increasing attention. As a special kind of hieroglyph, Chinese glyphs carry semantic information but remain unexplored in this task. In this paper, we propose a novel Glyph-Aware Fusion Network, named GlyFN, which introduces glyph information into the representation of a pre-trained language model. To obtain a better representation, we design a Vector Linear Fusion mechanism to fuse the two. Specifically, it first uses max-pooling to capture salient information and then applies linear vector operations to retain unique information. Moreover, for large-scale unstructured text, we distribute the data across different clusters in parallel. Finally, we conduct extensive experiments on ACE2005 and on large-scale data. Experimental results show that GlyFN improves the F1-score over state-of-the-art methods by 7.48 (10.18%) for trigger identification and 6.17 (8.7%) for trigger classification. Furthermore, the event detection task for large-scale unstructured text can be accomplished efficiently through distribution.


2020 ◽  
Author(s):  
Than Le

In this paper, we focus on a simple data-driven approach to deep learning, based on an implementation of the Mask R-CNN module and a deeper analysis of dataset manipulation. We first apply affine transformations and projective representations for data augmentation, in order to enlarge large-scale data manually, following the state of the art in computer vision. We then evaluate our method concretely by linking our datasets with data visualization and by testing against many methods, to understand intelligent data analysis in object detection and segmentation, using more than 5000 images of many similar objects. The results illustrate the efficiency of the approach in small applications such as food recognition and grasping and manipulation in robotics.
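As an illustration of affine-transformation augmentation, the sketch below applies a rotation, uniform scale, and translation to 2D annotation points, such as the vertices of a segmentation mask polygon. It is only a generic example under stated assumptions, not the author's pipeline; the helper names `affine_matrix` and `apply_affine` and the parameter choices are hypothetical.

```python
import math


def affine_matrix(angle_deg=0.0, scale=1.0, tx=0.0, ty=0.0):
    """Build a 2x3 affine matrix: rotate by angle_deg, scale, then translate."""
    theta = math.radians(angle_deg)
    c = math.cos(theta) * scale
    s = math.sin(theta) * scale
    return [[c, -s, tx], [s, c, ty]]


def apply_affine(matrix, points):
    """Map each (x, y) point through the affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def augment_polygon(polygon, angles=(0, 90, 180, 270)):
    """Generate one transformed copy of the polygon per rotation angle."""
    return [apply_affine(affine_matrix(angle_deg=a), polygon) for a in angles]


if __name__ == "__main__":
    triangle = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
    for variant in augment_polygon(triangle):
        print([(round(x, 3), round(y, 3)) for x, y in variant])
```

In an image pipeline the same matrix would be applied to the pixels and to the annotations together, so that masks and boxes stay aligned with the transformed image.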


2021 ◽  
pp. 1-18
Author(s):  
Salahaldeen Rababa ◽  
Amer Al-Badarneh

Large-scale datasets collected from heterogeneous sources often require a join operation to extract valuable information. MapReduce is an efficient programming model for processing large-scale data. However, it has some limitations in processing heterogeneous datasets, because a large number of redundant intermediate records are transferred through the network. Several filtering techniques have been developed to improve join performance, but they require multiple MapReduce jobs to process the input datasets. To address this issue, this paper presents adaptive filter-based join algorithms. Specifically, three join algorithms are introduced that perform filter creation and redundant-record elimination within a single MapReduce job. A cost analysis of the introduced join algorithms shows that the I/O cost is reduced compared to state-of-the-art filter-based join algorithms. The performance of the join algorithms was evaluated in terms of the total execution time and the total amount of I/O data transferred. The experimental results show that the adaptive Bloom join, semi-adaptive intersection Bloom join, and adaptive intersection Bloom join decrease the total execution time by 30%, 25%, and 35%, respectively, and reduce the total amount of I/O data transferred by 18%, 25%, and 50%, respectively.
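The general idea behind a Bloom-filter-based join can be sketched in a few lines: build a filter over the join keys of one dataset, then drop records of the other dataset whose keys cannot match before they are shuffled. The sketch below runs in a single process and does not reproduce the paper's adaptive algorithms or their MapReduce implementation; the `BloomFilter` class, its sizes, and the `bloom_join` function are assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """Simple Bloom filter; sizes are illustrative, not tuned."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_join(left, right):
    """Join (key, value) records, filtering `right` with a filter built on `left`."""
    bloom = BloomFilter()
    index = {}
    for key, value in left:
        bloom.add(key)
        index.setdefault(key, []).append(value)
    # Records that fail the filter would never be sent over the network.
    survivors = [(k, v) for k, v in right if bloom.might_contain(k)]
    joined = [(k, lv, rv) for k, rv in survivors for lv in index.get(k, [])]
    return joined, len(survivors)


if __name__ == "__main__":
    left = [(1, "a"), (2, "b")]
    right = [(2, "x"), (3, "y"), (2, "z")]
    print(bloom_join(left, right))
```

Because a Bloom filter can return false positives but never false negatives, filtering removes only records that cannot join, and the final lookup against the index discards any false positives that slipped through.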


2009 ◽  
Vol 28 (11) ◽  
pp. 2737-2740
Author(s):  
Xiao ZHANG ◽  
Shan WANG ◽  
Na LIAN
