Optimizing data query performance of Bi-cluster for large-scale scientific data in supercomputers

Author(s):  
Xia Liao ◽  
Yixian Shen ◽  
Shengguo Li ◽  
Yutong Lu ◽  
Yufei Du ◽  
...  
2015 ◽  
Author(s):  
Tamara G. Kolda ◽  
Grey Ballard ◽  
Woody Nathan Austin

2013 ◽  
Vol 441 ◽  
pp. 691-694
Author(s):  
Yi Qun Zeng ◽  
Jing Bin Wang

With the rapid development of information technology, data is growing explosively, and handling large-scale data is becoming increasingly important. Based on the characteristics of RDF data, we propose compressing RDF data. We construct an index structure called the PAR-Tree Index, and then execute queries using the MapReduce parallel computing framework together with the PAR-Tree Index. Experimental results show that the algorithm can improve the efficiency of queries over large data.


2019 ◽  
Vol 9 (21) ◽  
pp. 4541
Author(s):  
Syed Asif Raza Shah ◽  
Seo-Young Noh

Large scientific experimental facilities are currently generating a tremendous amount of data. In recent years, significant growth in scientific data analysis has been observed across scientific research centers. Scientific experimental facilities are producing an unprecedented amount of data and face new challenges in transferring large data sets across multiple continents. In particular, data transfer now plays an important role in new scientific discoveries. The performance of a distributed scientific environment is highly dependent on high-performance, adaptive, and robust network service infrastructures. To support large-scale data transfer for extreme-scale distributed science, there is a need for high-performance, scalable, end-to-end, programmable networks that enable scientific applications to use the networks efficiently. We worked on the AmoebaNet solution to address the problems of a dynamic programmable network for bulk data transfer in extreme-scale distributed science environments. A major goal of the AmoebaNet project is to apply software-defined networking (SDN) technology to provide an “application-aware” network that facilitates bulk data transfer. We have prototyped AmoebaNet’s SDN-enabled network service, which allows applications to dynamically program the network at run time for bulk data transfers. In this paper, we evaluate the AmoebaNet solution with real-world test cases and show how it can use networks efficiently and dynamically for bulk data transfer in large-scale scientific environments.


2019 ◽  
Vol 22 (6) ◽  
pp. 1107-1123
Author(s):  
Yi Cao ◽  
Zeyao Mo ◽  
Zhiwei Ai ◽  
Huawei Wang ◽  
Li Xiao ◽  
...  

2013 ◽  
Vol 760-762 ◽  
pp. 1978-1981
Author(s):  
Luo Kai Hu ◽  
Yan Lin Cheng ◽  
Chao Liang

Ontology query performance has become one of the bottlenecks of large-scale bulk applications. First, OWL ontology files are stored in the database as a triple table using Oracle 11g semantic technology. Second, we design and implement an ontology partitioning method based on horizontal segmentation. Third, several typical ontology query operations are implemented using multi-threading. Experimental results show that the parallel query methods described herein significantly improve query performance.
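
The general idea in this abstract, splitting a triple table by rows and scanning the pieces with several threads, can be illustrated with a minimal in-memory sketch. It is not the authors' implementation and does not use Oracle 11g. The function names (`partition_triples`, `scan_partition`, `parallel_query`), the hash-by-subject partitioning rule, and the sample triples are all assumptions made for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (subject, predicate, object) rows of a triple table.
TRIPLES = [
    ("ex:Cat", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Tom", "rdf:type", "ex:Cat"),
    ("ex:Rex", "rdf:type", "ex:Dog"),
    ("ex:Animal", "rdfs:label", "Animal"),
]


def partition_triples(triples, n_parts):
    """Horizontal segmentation (assumed rule): assign each row to a
    partition by a stable hash of its subject."""
    parts = [[] for _ in range(n_parts)]
    for triple in triples:
        index = zlib.crc32(triple[0].encode("utf-8")) % n_parts
        parts[index].append(triple)
    return parts


def scan_partition(partition, pattern):
    """Return the rows of one partition that match the pattern.
    A None in the pattern acts as a wildcard for that position."""
    return [
        triple
        for triple in partition
        if all(p is None or p == v for p, v in zip(pattern, triple))
    ]


def parallel_query(partitions, pattern):
    """Scan every partition in its own worker thread and merge the results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        chunks = pool.map(scan_partition, partitions, [pattern] * len(partitions))
        return [triple for chunk in chunks for triple in chunk]


PARTS = partition_triples(TRIPLES, 3)
SUBCLASS_ROWS = parallel_query(PARTS, (None, "rdfs:subClassOf", None))
```

In CPython, threads do not speed up a pure-Python scan like this one because of the global interpreter lock; the sketch only shows the structure. With a database back end, each worker would instead issue its query against one partition and spend most of its time waiting on I/O, which is where threading is more likely to help.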

