data redistribution
Recently Published Documents


Total documents: 83 (five years: 9)

H-index: 11 (five years: 1)

2021 ◽  
Author(s):  
Hanchen Guo ◽  
Zhehan Lin ◽  
Yunfei Gu ◽  
Chentao Wu ◽  
Li Jiang ◽  
...  


Author(s):  
Svetlozar Kirilov Zahariev

One of the most significant contributors to error in preliminary design tools for photovoltaic (PV) power systems is the use of simple parametric Clear Sky models. This paper therefore provides a methodology and a more sophisticated open-source tool for three commonly used Clear Sky models. It covers all relevant steps in the process, from filtering the raw meteorological data and identifying Clear Sky regions through data redistribution to genetic optimization of selected model parameters. The use case is built on a multiyear dataset obtained from the TU Varna meteorological station between 2012 and 2016. The Clear Sky Identification algorithm found a significantly higher density of Clear Sky segments during the summer. To avoid overfitting the models to the summer months and fitting the winter months poorly, as was the case with the legacy model, the underrepresented Clear Sky regions (based on θ) were replicated until a uniform distribution was attained. Subsequently, genetic optimization was applied to selected parameters of the Clear Sky algorithms; the updated models showed a significant improvement in the low winter months (θ) and even an overall performance boost in RMSE, MAE, and R². Such validation and optimization are recommended prior to any design or real-time PV-system analysis for a specific location.
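The replication step described in this abstract can be illustrated with a minimal sketch. It is not the paper's tool: the function name `replicate_to_uniform`, the fixed-width binning on θ, and the choice to draw replicas at random with replacement are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def replicate_to_uniform(samples, key, bin_width, seed=0):
    """Oversample sparse bins until every non-empty bin matches the largest one.

    `key` extracts the binning variable (theta in the abstract) from a sample.
    The name, binning scheme, and sampling rule are hypothetical.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for s in samples:
        bins[int(key(s) // bin_width)].append(s)
    if not bins:
        return []
    target = max(len(members) for members in bins.values())
    balanced = []
    for idx in sorted(bins):
        members = bins[idx]
        balanced.extend(members)
        # Draw replicas with replacement to fill the bin up to the target size.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced

# Three samples fall in the first theta bin and one in the second,
# so the second bin is replicated until both hold three samples.
samples = [(10.0, "a"), (12.0, "b"), (14.0, "c"), (55.0, "d")]
balanced = replicate_to_uniform(samples, key=lambda s: s[0], bin_width=30.0)
```

Only bins that already contain data are balanced; empty bins stay empty, since there is nothing to replicate.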



IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Ana Moreton-Fernandez ◽  
Yuri Torres ◽  
Arturo Gonzalez-Escribano ◽  
Diego R. Llanos


Author(s):  
Qinglei Cao ◽  
George Bosilca ◽  
Nuria Losada ◽  
Wei Wu ◽  
Dong Zhong ◽  
...  


Author(s):  
Rhauani Fazul ◽  
Patrícia Barcelos

Data replication is the main fault tolerance mechanism of HDFS, the Hadoop Distributed File System. Although replication is essential to ensure high availability and reliability, the replicas might not always be placed evenly among the nodes. The HDFS Balancer is an integrated solution of Apache Hadoop that performs replica balancing by rearranging the data blocks stored in the file system. The Balancer, however, demands a high computational effort from the nodes during its operation. This work presents a customization of the HDFS Balancer that considers the status of the nodes as a strategy to minimize the overhead that the balancing operation causes in the cluster. To this end, metrics obtained at runtime are used to prioritize the nodes during data redistribution, so that it occurs primarily between nodes with low communication traffic. The Balancer also operates toward a minimum balance level, reducing the number of data transfers required to even out the data stored in the cluster. The evaluation results show that the proposed customization reduces the time and bandwidth needed to reach system balance.



Author(s):  
Rob H. Bisseling

This chapter demonstrates the use of different data distributions in different phases of a parallel fast Fourier transform (FFT), which is a regular computation with a predictable but challenging data access pattern. Both the block and cyclic distributions are used and also intermediates between them. Each required data redistribution is a permutation that involves communication. By making careful choices, the number of such redistributions can be kept to a minimum. FFT algorithms can be concisely expressed using matrix/vector notation and Kronecker matrix products. This notation is also used here. The chapter then shows how permutations with a regular pattern can be implemented more efficiently by packing the data. The parallelization techniques discussed for the specific case of the FFT are also applicable to other related computations, for instance in signal processing and weather forecasting.
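As a generic illustration of why switching between the block and cyclic distributions requires communication, the sketch below computes which elements must move between processors. It is not taken from the chapter; the function names and the choice of a block size of ceil(n/p) are assumptions.

```python
def block_owner(i, n, p):
    """Processor owning global index i under a block distribution.

    Assumes a block size of ceil(n / p).
    """
    block_size = -(-n // p)  # ceiling division
    return i // block_size

def cyclic_owner(i, p):
    """Processor owning global index i under a cyclic distribution."""
    return i % p

def block_to_cyclic_plan(n, p):
    """Map each (source, destination) pair to the indices it must transfer."""
    plan = {}
    for i in range(n):
        src, dst = block_owner(i, n, p), cyclic_owner(i, p)
        if src != dst:
            plan.setdefault((src, dst), []).append(i)
    return plan

# For 8 elements on 2 processors, half of the elements change owner.
plan = block_to_cyclic_plan(8, 2)
# plan == {(0, 1): [1, 3], (1, 0): [4, 6]}
```

Grouping the indices by processor pair mirrors the idea of packing: all elements bound for the same destination can travel in one message rather than one message per element.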



Author(s):  
Qinglei Cao ◽  
George Bosilca ◽  
Wei Wu ◽  
Dong Zhong ◽  
Aurelien Bouteiller ◽  
...  


2020 ◽  
Vol 2020 ◽  
pp. 1-17
Author(s):  
Ziyang Li ◽  
Jiong Yu ◽  
Chen Bian ◽  
Yonglin Pu ◽  
Yuefei Wang ◽  
...  

As real-time and immediate feedback becomes increasingly important in tasks related to mobile information, big data stream processing systems are increasingly applied to process massive amounts of mobile data. However, when processing a drastically fluctuating mobile data stream, the lack of an elastic resource-scheduling strategy limits the elasticity and scalability of data stream processing systems. To address this problem, this paper builds a flow-network model, a resource allocation model, and a data redistribution model as the foundation for proposing Flink with an elastic resource-scheduling strategy (Flink-ER), which consists of a capacity detection algorithm, an elastic resource reallocation algorithm, and a data redistribution algorithm. The strategy improves the performance of the platform by dynamically rescaling the cluster and increasing the parallelism of operators based on the processing load. The experimental results show that the throughput of the cluster improved while latency constraints were still met, which verifies the efficiency of the strategy.





Algorithms ◽  
2019 ◽  
Vol 12 (7) ◽  
pp. 142
Author(s):  
Li ◽  
Yu

For many parallel and distributed systems, automatic data redistribution improves data locality and increases system performance for various computing problems and applications. In general, an array can be distributed to multiple processing systems by using regular or irregular distributions. Regular distributions use BLOCK, CYCLIC, or BLOCK-CYCLIC to specify the decomposition and distribution of a data array. Irregular distributions, on the other hand, specify a distribution of differently sized array segments according to user-defined commands or procedures. In this work, we propose three bipartite graph problems, the “maximum edge coloring problem”, the “maximum degree edge coloring problem”, and the “cost-sharing maximum edge coloring problem”, to formulate these kinds of distribution problems. Next, we propose an approximation algorithm with a ratio bound of two for the maximum edge coloring problem when the input graph is biplanar. Moreover, we prove that the “cost-sharing maximum edge coloring problem” is NP-complete even when the input graph is biplanar.
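To make the link between regular distributions and bipartite graphs concrete, the sketch below builds the communication graph of a redistribution between two BLOCK-CYCLIC layouts. It is a generic illustration rather than the paper's formulation: the function names are hypothetical, and the edge weight is simply the number of array elements a source processor sends to a destination processor.

```python
from collections import Counter

def block_cyclic_owner(i, k, p):
    """Owner of global index i under BLOCK-CYCLIC(k) on p processors.

    k = 1 gives CYCLIC; k = ceil(n / p) gives BLOCK.
    """
    return (i // k) % p

def communication_graph(n, p_src, k_src, p_dst, k_dst):
    """Bipartite graph as {(source_proc, dest_proc): element_count}.

    Source and destination processors form the two vertex sets, even when
    they share the same numbering.
    """
    edges = Counter()
    for i in range(n):
        src = block_cyclic_owner(i, k_src, p_src)
        dst = block_cyclic_owner(i, k_dst, p_dst)
        edges[(src, dst)] += 1
    return dict(edges)

# Redistribute 12 elements from BLOCK-CYCLIC(3) on 2 processors
# to BLOCK-CYCLIC(2) on 3 processors.
graph = communication_graph(12, p_src=2, k_src=3, p_dst=3, k_dst=2)
# graph == {(0, 0): 4, (0, 1): 2, (1, 1): 2, (1, 2): 4}
```

A scheduling step would then assign the edges to communication rounds, which is where edge coloring formulations come in; that step is deliberately left out here.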


