Keyword Search in Decentralized Storage Systems

Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2041
Author(s):  
Liyan Zhu ◽  
Chuqiao Xiao ◽  
Xueqing Gong

The emerging decentralized storage systems (DSSs), such as InterPlanetary File System (IPFS), Storj, and Sia, provide people with a new storage model. Instead of being centrally managed, the data are sliced up and distributed across the nodes of the network. Furthermore, each data object is uniquely identified by a cryptographic hash (ObjectId) and can only be retrieved by ObjectId. Because they lack the search functions that existing centralized storage systems provide, the application scenarios of DSSs are restricted. In this paper, we first apply decentralized B+Tree and HashMap to the DSSs to provide keyword search. Both indexes are kept in blocks. Since these blocks may be scattered across multiple nodes, we ensure that all operations involve as few blocks as possible to reduce network cost and response time. In addition, version control and version merging algorithms are designed to effectively organize the indexes and facilitate data integration. The experimental results show that our indexes have excellent availability and scalability.
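To make the idea of a keyword index "kept in blocks" concrete, the following is a minimal sketch, not the authors' implementation. It assumes an in-memory stand-in for a DSS (`BlockStore`, hypothetical) in which a block can be fetched only by the SHA-256 hash of its content, and a fixed number of hash buckets (`NUM_BUCKETS`, also hypothetical). A lookup touches two blocks: the root block and one bucket block.

```python
import hashlib
import json

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for illustration


class BlockStore:
    """In-memory stand-in for a DSS: a block is retrievable only by its hash."""

    def __init__(self):
        self._blocks = {}
        self.reads = 0  # counts block fetches, a proxy for network cost

    def put(self, payload):
        data = json.dumps(payload, sort_keys=True).encode()
        object_id = hashlib.sha256(data).hexdigest()
        self._blocks[object_id] = data
        return object_id

    def get(self, object_id):
        self.reads += 1
        return json.loads(self._blocks[object_id])


def bucket_of(keyword):
    """Map a keyword to a bucket number with a stable hash."""
    digest = hashlib.sha256(keyword.encode()).hexdigest()
    return int(digest, 16) % NUM_BUCKETS


def build_index(store, postings):
    """Store each bucket as its own block, then a root block listing them."""
    buckets = [{} for _ in range(NUM_BUCKETS)]
    for keyword, object_ids in postings.items():
        buckets[bucket_of(keyword)][keyword] = sorted(object_ids)
    bucket_ids = [store.put(bucket) for bucket in buckets]
    return store.put({"buckets": bucket_ids})


def search(store, root_id, keyword):
    """Fetch the root block and the single bucket block for the keyword."""
    root = store.get(root_id)
    bucket = store.get(root["buckets"][bucket_of(keyword)])
    return bucket.get(keyword, [])


store = BlockStore()
doc_a = store.put({"text": "decentralized storage"})
doc_b = store.put({"text": "keyword search"})
root = build_index(store, {"storage": [doc_a], "keyword": [doc_b], "search": [doc_b]})
print(search(store, root, "keyword") == [doc_b])
```

Because blocks are immutable and addressed by content, any change to a bucket yields a new bucket ObjectId and therefore a new root ObjectId. Each root then acts as a version of the index, which is one plausible reason an index of this kind needs version control and merging.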

2020 ◽  
Vol 12 (12) ◽  
pp. 31-43
Author(s):  
Tatiana A. VASKOVSKAYA ◽  
Boris A. KLUS ◽  

The development of energy storage systems allows us to consider their use for load profile leveling during operational planning on electricity markets. The paper proposes and analyses an application of an energy storage model to the electricity market in Russia, with a focus on the day-ahead market. We consider bidding, energy storage constraints for an optimal power flow problem, and locational marginal pricing. We show that the largest effect for the market and for the energy storage system would be gained by integrating the energy storage model into the market’s optimization models. The proposed theory has been tested on the optimal power flow model of the day-ahead market of Russia's 10000-node Unified Energy System. It is shown that energy storage systems are in demand across a wide range of efficiencies and cycle costs.


2016 ◽  
Vol 10 (02) ◽  
pp. 167-191 ◽  
Author(s):  
Lavdim Halilaj ◽  
Irlán Grangel-González ◽  
Gökhan Coskun ◽  
Steffen Lohmann ◽  
Sören Auer

Collaborative vocabulary development in the context of data integration is the process of finding consensus between experts with different backgrounds, system understanding and domain knowledge. The complexity of this process increases with the number of people involved, the variety of the systems to be integrated and the dynamics of their domain. In this paper, we argue that the use of a powerful version control system is one of the keys to addressing this problem. Driven by this idea and the success of the version control system Git in the context of software development, we investigate the applicability of Git for collaborative vocabulary development. Even though vocabulary development and software development have many more similarities than differences, there are still important challenges. These need to be considered in the development of a successful versioning and collaboration system for vocabulary development. Therefore, this paper starts by presenting the challenges we face during the collaborative creation of vocabularies and discusses how it differs from software development. Drawing on these findings, we present Git4Voc, which comprises guidelines on how Git can be adopted for vocabulary development. Finally, we demonstrate how Git hooks can be implemented to go beyond the plain functionality of Git by realizing vocabulary-specific features like syntactic validation and semantic diffs.


2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Yongfeng Dong ◽  
Jingyu Liu ◽  
Jie Yan ◽  
Hongpu Liu ◽  
Youxi Wu

HS-RAID (Hybrid Semi-RAID), a power-aware RAID, saves energy by grouping disks in the array. All write operations in HS-RAID are small writes, which severely degrade the storage system’s performance. In this paper, we propose a redundancy algorithm, the data incremental parity algorithm (DIP), which employs HS-RAID to minimize the write penalty and improve the performance and reliability of the storage systems. The experimental results show that HS-RAID2 (HS-RAID with DIP) is remarkably faster and more reliable than HS-RAID.
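The abstract does not describe how DIP works, so the sketch below shows only the classic small-write parity update that creates the write penalty in parity-based RAID: the new parity is the old parity XOR the old data XOR the new data. All function names are illustrative and are not taken from the paper.

```python
def xor_bytes(a, b):
    """XOR two equally sized byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Compute parity from scratch by XOR-ing every data block in the stripe."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_bytes(parity, block)
    return parity


def small_write_parity(old_parity, old_data, new_data):
    """Update parity from the changed block only (read-modify-write)."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_parity(stripe)

new_block = b"\xff\x00"
updated = small_write_parity(parity, stripe[1], new_block)
stripe[1] = new_block

# The incremental update agrees with recomputing parity over the whole stripe.
print(updated == full_parity(stripe))
```

The incremental form needs two reads (old data, old parity) and two writes (new data, new parity) regardless of stripe width, which is the cost a small-write optimization has to reduce.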


2015 ◽  
Vol 81 (8) ◽  
pp. 1532-1541 ◽  
Author(s):  
Jin Li ◽  
Xiaofeng Chen ◽  
Fatos Xhafa ◽  
Leonard Barolli

2021 ◽  
Vol 17 (3) ◽  
pp. 1-24
Author(s):  
Duwon Hong ◽  
Keonsoo Ha ◽  
Minseok Ko ◽  
Myoungjun Chun ◽  
Yoona Kim ◽  
...  

A recent ultra-large SSD (e.g., a 32-TB SSD) provides many benefits in building cost-efficient enterprise storage systems. Owing to its large capacity, however, when such SSDs fail in a RAID storage system, a long rebuild overhead is inevitable for RAID reconstruction that requires a huge amount of data copies among SSDs. Motivated by modern SSD failure characteristics, we propose a new recovery scheme, called reparo, for a RAID storage system with ultra-large SSDs. Unlike existing RAID recovery schemes, reparo repairs a failed SSD at the NAND die granularity without replacing it with a new SSD, thus avoiding most of the inter-SSD data copies during a RAID recovery step. When a NAND die of an SSD fails, reparo exploits a multi-core processor of the SSD controller in identifying failed LBAs from the failed NAND die and recovering data from the failed LBAs. Furthermore, reparo ensures no negative post-recovery impact on the performance and lifetime of the repaired SSD. Experimental results using 32-TB enterprise SSDs show that reparo can recover from a NAND die failure about 57 times faster than the existing rebuild method while little degradation on the SSD performance and lifetime is observed after recovery.
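The benefit of die-granularity repair can be illustrated with a toy model. This is a sketch under strong assumptions, not reparo itself: LBAs are assumed to be striped round-robin across `NUM_DIES` dies (real flash translation layers map LBAs dynamically), and the array is assumed to use single-parity XOR protection. Only the LBAs on the failed die are rebuilt from the peer SSDs.

```python
from functools import reduce

NUM_DIES = 4  # hypothetical number of NAND dies per SSD


def xor_blocks(blocks):
    """XOR equally sized blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def lbas_on_die(num_lbas, die):
    """Assumption: LBA i lives on die i % NUM_DIES."""
    return [lba for lba in range(num_lbas) if lba % NUM_DIES == die]


def recover_die(array, failed_ssd, failed_die):
    """Rebuild only the LBAs of one failed die from the other SSDs."""
    num_lbas = len(array[failed_ssd])
    recovered = {}
    for lba in lbas_on_die(num_lbas, failed_die):
        peers = [ssd[lba] for i, ssd in enumerate(array) if i != failed_ssd]
        recovered[lba] = xor_blocks(peers)
    return recovered


# Two data SSDs plus one parity SSD, eight LBAs each.
data0 = [bytes([i, i + 1]) for i in range(8)]
data1 = [bytes([100 + i, 2 * i]) for i in range(8)]
parity = [xor_blocks([a, b]) for a, b in zip(data0, data1)]
array = [data0, data1, parity]

recovered = recover_die(array, failed_ssd=1, failed_die=2)
print(sorted(recovered))
print(all(recovered[lba] == data1[lba] for lba in recovered))
```

In this model a die failure triggers reads for one quarter of the LBAs instead of all of them, which mirrors in miniature why repairing at die granularity avoids most inter-SSD copies.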


2019 ◽  
pp. 254-277 ◽  
Author(s):  
Ying Zhang ◽  
Chaopeng Li ◽  
Na Chen ◽  
Shaowen Liu ◽  
Liming Du ◽  
...  

Since a large amount of geospatial data is produced by various sources, geospatial data integration is difficult because of the shortage of semantics. Although standardised data formats and data access protocols, such as Web Feature Service (WFS), give end-users access to heterogeneous data stored in different formats from various sources, integration is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype implementing geospatial data integration is proposed that addresses four problems: geospatial data retrieving, modeling, linking and integrating. We adopt four kinds of geospatial data sources to evaluate the performance of the proposed approach. The experimental results show that the proposed linking method achieves high performance in generating matched candidate record pairs in terms of Reduction Ratio (RR), Pairs Completeness (PC), Pairs Quality (PQ) and F-score. The integration results indicate that each data source gains considerable Complementary Completeness (CC) and Increased Completeness (IC).
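The linking metrics named above can be computed as follows. This sketch uses the definitions common in the record-linkage literature, which the abstract does not spell out: RR is the share of all possible pairs that candidate generation avoids, PC is the share of true matches kept among the candidates, and PQ is the share of candidates that are true matches. Taking the F-score as the harmonic mean of PC and PQ is an assumption; the paper may define it differently.

```python
def linking_metrics(num_records_a, num_records_b, candidate_pairs, true_matches):
    """Evaluate candidate pair generation between two data sources."""
    candidates = set(candidate_pairs)
    truth = set(true_matches)
    total_pairs = num_records_a * num_records_b
    found = candidates & truth

    rr = 1 - len(candidates) / total_pairs if total_pairs else 0.0
    pc = len(found) / len(truth) if truth else 0.0
    pq = len(found) / len(candidates) if candidates else 0.0
    # Assumption: F-score is the harmonic mean of PC and PQ.
    f_score = 2 * pc * pq / (pc + pq) if (pc + pq) else 0.0
    return {"RR": rr, "PC": pc, "PQ": pq, "F": f_score}


# Two sources of 10 records each; pairs are (id in source A, id in source B).
true_matches = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
candidate_pairs = [(0, 0), (1, 1), (2, 2), (3, 3), (0, 5), (1, 6), (2, 7), (9, 9)]

metrics = linking_metrics(10, 10, candidate_pairs, true_matches)
for name, value in metrics.items():
    print(name, round(value, 3))
```

RR rewards generating few candidates while PC rewards keeping the true matches, so the two pull in opposite directions and are usually reported together.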



