A General Architecture for Demand Migration in a Demand-Driven Execution Engine in a Heterogeneous and Distributed Environment

Author(s):  
E. Vassev ◽  
J. Paquet


2018 ◽  
Vol 14 (3) ◽  
pp. 22-43
Author(s):  
Ratsimbazafy Rado ◽  
Omar Boussaid

The data warehousing (DW) area has always motivated a plethora of hard optimization problems that cannot be solved in polynomial time. These optimization problems become more complex, and more interesting, when multiple OLAP queries are involved. In this article, the authors explore the potential of a distributed environment for an established data warehouse optimization problem, the problem of Multiple Query Optimization (MQO). In traditional DWs, materializing views is an optimization technique for this problem: pre-computed joins or frequently asked queries are stored for reuse. In the era of big data, this kind of view materialization is no longer suitable because of the data size. In this article, the authors tackle the problem of MQO on a distributed DW by using multiple small, shared, and easy-to-maintain datasets. The evaluation shows that, compared to the available default execution engine, the authors' approach consumes on average 20% less memory in the Map-scan task and is 12% faster regarding the execution time of interactive and reporting queries from TPC-DS.
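The shared-intermediate idea behind this MQO approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the query model (a query as a set of join pairs) and the table names are invented for the example.

```python
# Hypothetical sketch of multi-query optimization via small shared
# intermediates: join subexpressions that recur across a batch of OLAP
# queries are computed once and reused, instead of materializing full views.

from collections import Counter

def find_shared_subplans(queries):
    """Return the join pairs occurring in more than one query.
    Each query is modeled as a set of (table_a, table_b) join pairs."""
    counts = Counter(pair for q in queries for pair in q)
    return {pair for pair, n in counts.items() if n > 1}

def plan_batch(queries):
    """Split a batch into joins computed once (shared) and per-query joins."""
    shared = find_shared_subplans(queries)
    per_query = [[p for p in q if p not in shared] for q in queries]
    return shared, per_query

queries = [
    {("sales", "date_dim"), ("sales", "item")},
    {("sales", "date_dim"), ("sales", "store")},
    {("sales", "item"), ("returns", "item")},
]
shared, rest = plan_batch(queries)
# ("sales", "date_dim") and ("sales", "item") each occur twice, so they
# would be evaluated once and reused by the queries that need them.
```

A real optimizer would of course weigh the cost of computing a shared intermediate against its reuse benefit; the sketch only shows the detection-and-sharing step.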


2011 ◽  
Vol 21 (02) ◽  
pp. 155-171
Author(s):  
CORAL WALKER ◽  
DASHAN LU ◽  
DAVID W. WALKER

Distributed scientific and engineering computations on service-oriented architectures are often represented as data-driven workflows. Workflows are a convenient abstraction that allows users to compose applications in a visual programming environment and execute them by means of a workflow execution engine. For a large class of scientific applications, web-based portals can provide a user-friendly problem-solving environment that hides the complexities of executing workflow applications in a distributed environment. However, the creation and configuration of an application portal requires considerable expertise in portal technologies, which scientific end-users generally do not have. To address this problem, this paper presents tools for automatically converting a workflow into a fully configured portal, which can then be used to execute the workflow.
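The core idea, deriving a portal page mechanically from a workflow description, can be illustrated with a toy sketch: each declared workflow input becomes a form field. The workflow schema and field names below are invented for illustration; the actual tools target real workflow and portal technologies.

```python
# Toy illustration: generate a minimal HTML submission form from a
# workflow's declared inputs. The schema is hypothetical.

workflow = {
    "name": "image-analysis",
    "inputs": [
        {"id": "threshold", "type": "number"},
        {"id": "dataset_url", "type": "text"},
    ],
}

def portal_form(wf):
    """Render one HTML form field per declared workflow input."""
    fields = "\n".join(
        f'  <label>{i["id"]}: <input name="{i["id"]}" type="{i["type"]}"></label>'
        for i in wf["inputs"]
    )
    return f'<form action="/run/{wf["name"]}" method="post">\n{fields}\n</form>'

print(portal_form(workflow))
```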


2005 ◽  
Vol 1 (03) ◽  
pp. 285-290 ◽  
Author(s):  
F. González-Longatt ◽  
A. Hernandez ◽  
F. Guillen ◽  
C. Fortoul

Author(s):  
Shalin Eliabeth S. ◽  
Sarju S.

Big data privacy preservation is one of the most pressing issues in current industry. Data privacy problems are sometimes never identified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding and publishing the input dataset to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability and execution time. At present, many cloud applications that anonymize big data face the same kinds of problems. To overcome them, a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop, is introduced. For the anonymization, 45,222 records of adult information with 15 attribute values were taken as the input big data. Using multidimensional anonymization in the MapReduce framework, the proposed Two-Phase Top-Down Specialization algorithm was implemented in Hadoop, increasing the efficiency of the big data processing system. Experiments with the TPTDS algorithm in both one-dimensional and multidimensional MapReduce frameworks on Hadoop showed that multidimensional anonymization gives the better result on the input adult dataset. Datasets are generalized in a top-down manner, and the multidimensional MapReduce framework produced the better IGPL values. The anonymization was performed with specialization operations on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter, and decreases the execution time of big data privacy preservation, compared to the existing algorithm. These experimental results should lead to useful applications in distributed environments.
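The top-down specialization step on a taxonomy tree can be sketched in miniature. This is an illustrative single-attribute, single-machine sketch, not the authors' MapReduce implementation: the taxonomy, the record values, and the stopping rule (every group must keep at least k records, a simple k-anonymity check standing in for the IGPL-driven choice) are all assumptions for the example.

```python
# Illustrative top-down specialization for k-anonymity on one attribute:
# start from the most general taxonomy value and specialize a node into
# its children only while every resulting group still has >= k records.

from collections import Counter

TAXONOMY = {"Any": ["Europe", "Asia"],          # hypothetical taxonomy tree
            "Europe": ["France", "Poland"],
            "Asia": ["India", "Japan"]}
PARENT = {c: p for p, cs in TAXONOMY.items() for c in cs}

def lift(value, allowed):
    """Generalize a leaf value up the tree until it lies in `allowed`."""
    while value not in allowed:
        value = PARENT[value]
    return value

def top_down_specialize(records, k):
    """Refine the cut while every group keeps at least k records."""
    cut = {"Any"}
    changed = True
    while changed:
        changed = False
        for node in sorted(cut):
            if node not in TAXONOMY:        # leaf: cannot specialize further
                continue
            trial = (cut - {node}) | set(TAXONOMY[node])
            sizes = Counter(lift(r, trial) for r in records)
            if all(n >= k for n in sizes.values()):
                cut = trial                 # specialization preserved k-anonymity
                changed = True
                break
    return cut

records = ["France"] * 3 + ["Poland"] * 3 + ["India"] * 2 + ["Japan"] * 1
print(top_down_specialize(records, k=3))
# Europe is specialized (each child keeps 3 records), but Asia stays
# general because India/Japan groups would fall below k.
```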


2021 ◽  
Vol 22 (12) ◽  
pp. 6643
Author(s):  
Pawel Jaworski ◽  
Dorota Zyla-Uklejewicz ◽  
Malgorzata Nowaczyk-Cieszewska ◽  
Rafal Donczew ◽  
Thorsten Mielke ◽  
...  

oriC is a region of the bacterial chromosome at which the initiator protein DnaA interacts with specific sequences, leading to DNA unwinding and the initiation of chromosome replication. The general architecture of oriCs is universal; however, the structure of oriC and the mode of orisome assembly differ in distantly related bacteria. In this work, we characterized oriC of Helicobacter pylori, which consists of two DnaA box clusters and a DNA unwinding element (DUE); the latter can be subdivided into a GC-rich region, a DnaA-trio and an AT-rich region. We show that the DnaA-trio submodule is crucial for DNA unwinding, possibly because it enables proper DnaA oligomerization on ssDNA. However, we also observed the reverse effect: DNA unwinding, by enabling subsequent DnaA–ssDNA oligomer formation, stabilized DnaA binding to box ts1. This suggests an interplay between DnaA binding to ssDNA and dsDNA upon DNA unwinding. Further investigation of the ts1 DnaA box revealed that this box, together with the newly identified c-ATP DnaA box in oriC1, constitutes a new class of ATP–DnaA boxes. Indeed, in vitro, ATP–DnaA unwinds H. pylori oriC more efficiently than ADP–DnaA does. Our results expand the understanding of H. pylori orisome formation, indicating another regulatory pathway of H. pylori orisome assembly.


Author(s):  
Mythresh Korupolu ◽  
Srikanth Jannabhatla ◽  
Venkata Surendra Kommineni ◽  
Hemanth Kalyanam ◽  
Vijaykumar Vasantham

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 621
Author(s):  
Giuseppe Psaila ◽  
Paolo Fosci

Internet technology and mobile technology have enabled producing and diffusing massive data sets concerning almost every aspect of day-by-day life. Remarkable examples are social media and apps for volunteered information production, as well as Open Data portals on which public administrations publish authoritative and (often) geo-referenced data sets. In this context, JSON has become the most popular standard for representing and exchanging possibly geo-referenced data sets over the Internet. Analysts wishing to manage, integrate and cross-analyze such data sets need a framework that allows them to access possibly remote storage systems for JSON data sets, and to retrieve and query data sets by means of a unique query language (independent of the specific storage technology) while exploiting possibly remote computational resources (such as cloud servers), working comfortably on the PCs in their offices, more or less unaware of the real location of resources. In this paper, we present the current state of the J-CO Framework, a platform-independent and analyst-oriented software framework to manipulate and cross-analyze possibly geo-tagged JSON data sets. The paper presents the general approach behind the J-CO Framework, illustrating the query language by means of a simple yet non-trivial example of geographical cross-analysis. The paper also presents the novel features introduced by the re-engineered version of the execution engine and the most recent components, i.e., the storage service for large single JSON documents and the user interface that allows analysts to comfortably share data sets and computational resources with other analysts possibly working in different places around the globe. Finally, the paper reports the results of an experimental campaign, which show that the execution engine performs in a more than satisfactory way, demonstrating that our framework can actually be used by analysts to process JSON data sets.
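The kind of geographical cross-analysis the framework targets can be illustrated with a plain-Python sketch: selecting geo-tagged JSON documents that fall inside a bounding box. The J-CO query language itself is not reproduced here, and the document structure and field names below are invented for the example.

```python
# Toy example of a geographical selection over geo-tagged JSON documents:
# keep only the documents whose coordinates fall inside a bounding box.
# Field names ("location", "lon", "lat") are illustrative assumptions.

import json

docs = [
    {"name": "sensor-a", "location": {"lon": 9.66, "lat": 45.69}},
    {"name": "sensor-b", "location": {"lon": 2.35, "lat": 48.86}},
]

def in_bbox(doc, min_lon, min_lat, max_lon, max_lat):
    """True when the document's point lies inside the bounding box."""
    loc = doc.get("location", {})
    return (min_lon <= loc.get("lon", float("inf")) <= max_lon and
            min_lat <= loc.get("lat", float("inf")) <= max_lat)

# Select documents inside a roughly 1-degree box in northern Italy.
selected = [d for d in docs if in_bbox(d, 9.0, 45.0, 10.0, 46.0)]
print(json.dumps(selected))   # only sensor-a falls inside the box
```

In the framework itself such a selection would be one operator in a storage-independent query pipeline; the sketch only conveys what a single geographical filter computes.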


Author(s):  
Ramon Perez ◽  
Jaime Garcia-Reinoso ◽  
Aitor Zabala ◽  
Pablo Serrano ◽  
Albert Banchs

The fifth generation (5G) of mobile networks is designed to accommodate different types of use cases, each with different and stringent requirements and key performance indicators (KPIs). To support the optimization of network performance and the validation of the KPIs, a flexible and efficient monitoring system is needed, one capable of handling multi-site and multi-stakeholder scenarios. Moreover, in the evolution from 5G to 6G, the network is envisioned as a user-driven, distributed cloud computing system whose resource pool is foreseen to integrate the participating users. In this paper, we present a distributed monitoring architecture for Beyond-5G multi-site platforms, where different stakeholders share the resource pool in a distributed environment. By taking advantage of publish-subscribe mechanisms adapted to the Edge, the developed lightweight monitoring solution can manage the large amounts of real-time traffic generated by the applications located in the resource pool. We assess the performance of the implemented paradigm, revealing some interesting insights about the platform, such as the effect of the monitoring-data throughput on performance parameters such as latency and packet loss, or the presence of a saturation effect, due to software limitations, that impacts the performance of the system under specific conditions. In the end, the performance evaluation confirmed that the monitoring platform meets the requirements of the proposed scenarios and is capable of handling similar workloads in real 5G and Beyond-5G scenarios; we then discuss how the architecture could be mapped to these real scenarios.
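The publish-subscribe pattern underlying the monitoring architecture can be shown with a minimal in-process sketch. This toy broker, its topic names, and the KPI samples are invented for illustration; a real deployment would rely on an edge-adapted message broker, not on a Python class.

```python
# Minimal in-process publish-subscribe sketch: monitoring agents publish
# KPI samples to a topic; any number of subscribers receive them without
# the publishers knowing who (or where) the consumers are.

from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to be invoked for each sample on `topic`."""
        self.subscribers[topic].append(callback)

    def publish(self, topic, sample):
        """Deliver a sample to every subscriber of `topic`."""
        for cb in self.subscribers[topic]:
            cb(sample)

broker = Broker()
latencies = []
broker.subscribe("kpi/latency", latencies.append)

# Agents at different sites publish latency samples (hypothetical values).
broker.publish("kpi/latency", {"site": "edge-1", "ms": 4.2})
broker.publish("kpi/latency", {"site": "edge-2", "ms": 6.1})
print(len(latencies))   # 2 samples delivered to the subscriber
```

The decoupling shown here is what makes the pattern attractive for multi-site monitoring: publishers and subscribers only share a topic name, so sites and stakeholders can be added without reconfiguring the producers.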

