Research of the data integration method for drilling data warehouse based on Hadoop

Author(s):  
Shijing Xiang ◽  
Ming Fang
2010 ◽  
Vol 30 (9) ◽  
pp. 2370-2373 ◽  
Author(s):  
Jiong LANG ◽  
Yan-bing LIU ◽  
Shi-yong XIONG

Author(s):  
G Sriman Narayana ◽  
Kuruva Arjun Kumar

In privacy-enhancing technology, it has been inevitably challenging to strike a balance between privacy, efficiency, and usability (utility). We propose a highly practical and efficient approach for privacy-preserving integration and sharing of datasets among a group of participants. At the heart of our solution is a new interactive protocol, Secure Channel. Through Secure Channel, each participant is able to randomize their dataset via an independent and untrusted third party, such that the resulting dataset can be merged with the randomized datasets contributed by the other participants in the group in a privacy-preserving manner. Our process does not require any public keys or key sharing between participants in order to integrate different datasets. This, in turn, leads to a solution that is scalable and easy to understand and use. Moreover, the accuracy of a randomized dataset returned by the third party can be securely verified by the other participants in the group. We further demonstrate Secure Channel’s general utility by using it to construct a structure-preserving data integration protocol. This is mainly useful for good-quality integration of network traffic data.
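The abstract does not spell out the Secure Channel protocol itself, so the following is only a minimal sketch of one generic way a third party could randomize values without seeing them: blinded exponentiation in a prime-order group. It is not the authors' protocol. The group parameters `P` and `Q`, the third party's secret exponent `k`, and all function names are assumptions made for illustration, and the toy group is far too small for real use.

```python
import hashlib
import secrets

# Hypothetical toy safe-prime group (P = 2*Q + 1); illustration only.
P = 2039
Q = 1019


def hash_to_group(value: str) -> int:
    """Map a string to a non-trivial element of the order-Q subgroup."""
    digest = int.from_bytes(hashlib.sha256(value.encode()).digest(), "big")
    base = digest % (P - 3) + 2          # base in [2, P-2], so base^2 != 1
    return pow(base, 2, P)               # squares form the order-Q subgroup


def blind(value: str):
    """Participant side: hide the hashed value behind a random exponent r."""
    r = secrets.randbelow(Q - 1) + 1     # r in [1, Q-1]
    return pow(hash_to_group(value), r, P), r


def third_party_evaluate(blinded: int, k: int) -> int:
    """Third party side: apply its secret exponent k to the blinded element."""
    return pow(blinded, k, P)


def unblind(evaluated: int, r: int) -> int:
    """Participant side: strip r, leaving hash_to_group(value)^k mod P."""
    return pow(evaluated, pow(r, -1, Q), P)


def pseudonymize(value: str, k: int) -> int:
    blinded, r = blind(value)
    return unblind(third_party_evaluate(blinded, k), r)


k = 123  # hypothetical secret held only by the third party
token_from_a = pseudonymize("10.0.0.1", k)  # computed by participant A
token_from_b = pseudonymize("10.0.0.1", k)  # computed by participant B
```

Under these assumptions, two participants who hold the same value obtain the same token even though each used a different random `r`, so their randomized records could be joined on the token without exchanging keys with each other. The sketch does not attempt the verification or structure-preserving properties the abstract mentions.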


2015 ◽  
Vol 20 (6) ◽  
pp. 483-489
Author(s):  
Xinming Wang ◽  
Haoxiang Tan ◽  
Kaijun Chen ◽  
Hua Tang ◽  
Gansen Zhao ◽  
...  

2012 ◽  
Vol 13 (1) ◽  
pp. 320 ◽  
Author(s):  
Shicheng Wu ◽  
Yawen Xu ◽  
Zeny Feng ◽  
Xiaojian Yang ◽  
Xiaogang Wang ◽  
...  

2013 ◽  
Vol 321-324 ◽  
pp. 2532-2538
Author(s):  
Xiao Guo Wang ◽  
Jian Shen ◽  
Chuan Sun

Considering the difficulty of information collection and integration caused by the rapid growth of information, an efficient tool is needed for these tasks. This paper proposes a data integration system that collects source data, preprocesses the heterogeneous data, and then converts and extracts it into the data warehouse. Through experiment and analysis, this paper designs an information processing flow and implements the data integration system, based on a B/S (browser/server) framework with database technology, to handle college-related information.
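The collect, preprocess, and load flow described above can be sketched roughly as follows. This is an illustrative example only, not the system from the paper: the two sources, their field names, the `student` table, and every function name are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import json
import sqlite3


def extract_csv(text):
    """Collect records from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Collect records from a JSON source."""
    return json.loads(text)


def normalize(record):
    """Preprocess: map differing source field names onto one schema."""
    name = record.get("name") or record.get("student_name")
    dept = record.get("dept") or record.get("department")
    return (name.strip(), dept.strip().lower())


def load(conn, rows):
    """Load cleaned rows into the target table, skipping duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS student "
        "(name TEXT, dept TEXT, UNIQUE(name, dept))"
    )
    conn.executemany("INSERT OR IGNORE INTO student VALUES (?, ?)", rows)
    conn.commit()


# Two hypothetical heterogeneous sources describing the same kind of entity.
csv_source = "name,dept\nAlice, Physics\nBob,Math\n"
json_source = (
    '[{"student_name": "Carol", "department": "MATH"},'
    ' {"student_name": "Alice", "department": "physics"}]'
)

conn = sqlite3.connect(":memory:")
records = extract_csv(csv_source) + extract_json(json_source)
load(conn, [normalize(r) for r in records])
rows = conn.execute("SELECT name, dept FROM student ORDER BY name").fetchall()
```

Here the duplicate "Alice" record that appears in both sources collapses into one row once the field names and casing are normalized.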


2018 ◽  
Author(s):  
Jong-Eun Park ◽  
Krzysztof Polański ◽  
Kerstin Meyer ◽  
Sarah A. Teichmann

Increasing numbers of large-scale single-cell RNA-Seq projects are leading to a data explosion, which can only be fully exploited through data integration. Therefore, efficient computational tools for combining diverse datasets are crucial for biology in the single-cell genomics era. A number of methods have been developed to assist data integration by removing technical batch effects, but most are computationally intensive. To overcome the challenge of enormous datasets, we have developed BBKNN, an extremely fast graph-based data integration method. We illustrate the power of BBKNN for dimensionality-reduced visualisation and clustering in multiple biological scenarios, including a massive integrative study over several murine atlases. BBKNN successfully connects cell populations across experimentally heterogeneous mouse scRNA-Seq datasets, which reveals global markers of cell type and organ-specificity and provides the foundation for inferring the underlying transcription factor network. BBKNN is available at https://github.com/Teichlab/bbknn.
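The abstract only says that BBKNN is graph-based. The core idea behind a batch-balanced k-nearest-neighbour graph, as the name suggests, is to look for each cell's nearest neighbours separately within every batch rather than across the pooled data. The sketch below illustrates that idea in plain Python under simplifying assumptions: it is not the BBKNN package's API or implementation, the function name and toy coordinates are hypothetical, and it excludes each cell from its own neighbour list.

```python
import math


def batch_balanced_neighbors(points, batches, k=1):
    """For every cell, return its k nearest neighbours within each batch."""
    by_batch = {}
    for idx, label in enumerate(batches):
        by_batch.setdefault(label, []).append(idx)

    neighbors = []
    for i, point in enumerate(points):
        picked = []
        for label in sorted(by_batch):
            # Rank the cells of this batch by Euclidean distance to cell i.
            candidates = [j for j in by_batch[label] if j != i]
            candidates.sort(key=lambda j: math.dist(point, points[j]))
            picked.extend(candidates[:k])
        neighbors.append(picked)
    return neighbors


# Hypothetical low-dimensional embedding of six cells from two batches.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 5.0), (5.2, 5.1)]
batches = ["a", "a", "a", "b", "b", "b"]
neighbors = batch_balanced_neighbors(points, batches, k=1)
```

Because every cell receives neighbours from each batch, the resulting graph links similar cells across batches, which is what allows downstream clustering and visualisation to mix batches instead of separating them.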

