Efficient Transfer of Data from RDBMS to HDFS and Conversion to JSON Format

Author(s):  
Dr. C. K. Gomathy

Abstract: Apache Sqoop is mainly used to efficiently transfer large volumes of data between Apache Hadoop and relational databases. It offloads certain tasks, such as ETL (extract, transform, load) processing, from an enterprise data warehouse to Hadoop, where they execute efficiently at much lower cost. We first import a table residing in a MySQL database using Sqoop, a command-line interface application. Because new rows may later be inserted or existing rows updated, the import would otherwise have to be executed again by hand; our project removes this need by defining a Sqoop job that encapsulates the complete set of import commands. After the import, we retrieve the data from Hive using Java JDBC and convert it to JSON format, which organizes the data in an easily accessible way, using the Gson library.

Keywords: Sqoop, JSON, Gson, Maven, JDBC
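
A rough Python equivalent of the retrieval-and-conversion step, using PyHive in place of the paper's Java JDBC client and the standard json module in place of Gson; the Sqoop job in the comment is illustrative, and the connection details and table names are assumptions:

```python
# Assumed setup: a HiveServer2 instance on localhost:10000 holding a table
# populated by a saved Sqoop job along the lines of:
#   sqoop job --create import_employees -- import \
#     --connect jdbc:mysql://localhost/testdb --table employees \
#     --hive-import --incremental append --check-column id
import json
from pyhive import hive  # pip install pyhive

conn = hive.Connection(host="localhost", port=10000, database="default")
cursor = conn.cursor()
cursor.execute("SELECT * FROM employees")

# Hive prefixes result columns with the table name (e.g. "employees.id"),
# so strip the prefix before building per-row dictionaries.
columns = [desc[0].split(".")[-1] for desc in cursor.description]
rows = [dict(zip(columns, row)) for row in cursor.fetchall()]

print(json.dumps(rows, indent=2, default=str))  # organized, easy-to-access JSON
```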

1994, Vol. 05 (05), pp. 805-809
Author(s):  
SALIM G. ANSARI ◽  
PAOLO GIOMMI ◽  
ALBERTO MICOL

On 3 November 1993, ESIS announced its homepage on the World Wide Web (WWW) to the user community. Since then, ESIS has steadily increased its Web support to the astronomical community to include a bibliographic service, the ESIS catalogue documentation and the ESIS Data Browser. More functionality will be added in the near future. All these services share a common ESIS structure that is also used by other ESIS user paradigms such as the ESIS Graphical User Interface (Giommi and Ansari, 1993) and the ESIS Command Line Interface. Following a forms-based paradigm, each ESIS Web application interfaces to the Hypertext Transfer Protocol (HTTP), translating queries to and from the Hypertext Markup Language (HTML) format understood by the NCSA Mosaic interface. In this paper, we discuss the ESIS system and show how each ESIS service works on a World Wide Web client.


Author(s):  
Anderson Chaves Carniel ◽  
Aried de Aguiar Sa ◽  
Vinicius Henrique Porto Brisighello ◽  
Marcela Xavier Ribeiro ◽  
Renato Bueno ◽  
et al.

2018, Vol. 14 (3), pp. 44-68
Author(s):  
Fatma Abdelhedi ◽  
Amal Ait Brahim ◽  
Gilles Zurfluh

Nowadays, most organizations need to improve their decision-making processes using Big Data. To achieve this, they have to store Big Data, perform analyses, and transform the results into useful and valuable information, which raises new challenges in designing and creating data warehouses. Traditionally, creating a data warehouse followed a well-governed process based on relational databases. Big Data has challenged this traditional approach, primarily due to the changing nature of data, and using NoSQL databases has consequently become a necessity for handling Big Data. In this article, the authors show how to create a data warehouse on NoSQL systems. They propose the Object2NoSQL process, which generates column-oriented physical models starting from a UML conceptual model. To ensure efficient automatic transformation, they propose a logical model that exhibits a sufficient degree of independence to enable its mapping to one or more column-oriented platforms. The authors validate their approach with experiments on a case study in the health-care field.
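
A toy sketch of the two-step idea described here, with invented class and attribute names: a platform-independent logical model is derived from a conceptual (UML-like) class, then mapped to one possible column-oriented target (HBase DDL). This illustrates the approach only; it is not the authors' Object2NoSQL implementation:

```python
from dataclasses import dataclass

@dataclass
class UMLClass:              # conceptual level (simplified UML class)
    name: str
    attributes: dict         # attribute name -> thematic group

def to_logical(cls_: UMLClass) -> dict:
    """Logical column-oriented model: one table per class, attributes
    grouped into column families by theme, independent of any platform."""
    families: dict = {}
    for attr, theme in cls_.attributes.items():
        families.setdefault(theme, []).append(attr)
    return {"table": cls_.name.lower(), "families": families}

def to_hbase_ddl(logical: dict) -> str:
    """Physical mapping for one concrete column-oriented platform."""
    cfs = ", ".join(f"'{cf}'" for cf in logical["families"])
    return f"create '{logical['table']}', {cfs}"

patient = UMLClass("Patient", {
    "name": "identity", "birth_date": "identity",
    "diagnosis": "medical", "treatment": "medical",
})
print(to_hbase_ddl(to_logical(patient)))  # create 'patient', 'identity', 'medical'
```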


2021, Vol. 9
Author(s):  
Caio Ribeiro ◽  
Lucas Oliveira ◽  
Romina Batista ◽  
Marcos De Sousa

The use of ultraconserved elements (UCEs) as genetic markers in phylogenomics has become popular and has provided promising results. Although UCE data can be easily obtained from targeted enriched sequencing, the protocol for in silico analysis of UCEs consists of the execution of heterogeneous and complex tools, a challenge for scientists without training in bioinformatics. Developing tools that adopt best practices in research software can lessen this problem by improving the execution of computational experiments, thus promoting better reproducibility. We present UCEasy, an easy-to-install and easy-to-use software package with a simple command-line interface that facilitates the computational analysis of UCEs from sequencing samples, following the best practices of research software. UCEasy is a wrapper that standardises, automates and simplifies the quality control of raw reads, assembly, and extraction and alignment of UCEs, generating at the end a data matrix with different levels of completeness that can be used to infer phylogenetic trees. We demonstrate the functionalities of UCEasy by reproducing the published results of phylogenomic studies of the bird genus Turdus (Aves) and of Adephaga families (Coleoptera), efficiently extracting UCEs from their genomic datasets.
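
The pipeline stages listed above could be driven from a short script; the sketch below is hypothetical, and the subcommand names and flags are assumptions for illustration rather than UCEasy's documented CLI:

```python
# Hypothetical orchestration of the described workflow: quality control of
# raw reads -> assembly -> UCE extraction/alignment -> completeness matrix.
import subprocess

def run(step: list[str]) -> None:
    """Run one pipeline stage, aborting the workflow on the first failure."""
    print("running:", " ".join(step))
    subprocess.run(step, check=True)

reads, out = "raw_reads", "results"
run(["uceasy", "qc", "--input", reads, "--output", f"{out}/clean"])
run(["uceasy", "assembly", "--input", f"{out}/clean", "--output", f"{out}/contigs"])
run(["uceasy", "phylogenomics", "--input", f"{out}/contigs",
     "--percent", "75",  # completeness level of the final data matrix
     "--output", f"{out}/matrix"])
```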


2021
Author(s):  
Fatma Abdelhedi ◽  
Rym Jemmali ◽  
Gilles Zurfluh

Author(s):  
Patrick Moore

As networks have evolved, how they are managed has evolved as well: from manual configuration via a command-line interface (CLI), to script-based automation, and eventually to a template-based approach with workflows that coordinate multiple templates and scripts. The next step in this evolution is the introduction of models to provide a more dynamic capability than is in place today. This chapter discusses three major layers of modelling that should be considered when implementing this approach: device models, focused on the configuration of the hardware itself; service models, focused on the customer- or network-facing services that leverage the hardware-level configuration; and operational models, focused on the people, processes, and tools involved in applying device and service models. This includes the orchestration of activities with other tools, such as operational support systems (OSS) and business support systems (BSS).
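
A schematic sketch of how the first two layers relate, with all class and attribute names invented for illustration: a service model captures customer-facing intent and is rendered into per-device configuration models.

```python
from dataclasses import dataclass

@dataclass
class DeviceModel:            # hardware-level configuration
    hostname: str
    interface: str
    vlan: int

@dataclass
class ServiceModel:           # customer/network-facing intent
    customer: str
    site_a: str
    site_b: str
    vlan: int

    def render(self) -> list[DeviceModel]:
        """Map one service intent onto configuration for each device."""
        return [DeviceModel(self.site_a, "GigabitEthernet0/1", self.vlan),
                DeviceModel(self.site_b, "GigabitEthernet0/1", self.vlan)]

# An operational model would sequence render() output through OSS/BSS
# tooling; here the device-level result is simply printed.
svc = ServiceModel("acme", "pe1.london", "pe2.paris", vlan=120)
for dev in svc.render():
    print(dev)
```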


2008, pp. 2364-2370
Author(s):  
Janet Delve

Data warehousing is now a well-established part of the business and scientific worlds. Until recently, however, data warehouses were restricted to modeling essentially numerical data, examples being sales figures in the business arena (e.g. Wal-Mart's data warehouse) and astronomical data (e.g. SKICAT) in scientific research, with textual data playing a descriptive rather than a central role. The inability of data warehouses to cope with mainly non-numeric data is particularly problematic for humanities research utilizing material such as memoirs and trade directories. Recent innovations have opened up possibilities for non-numeric data warehouses, making them widely accessible to humanities research for the first time. Due to its irregular and complex nature, humanities research data is often difficult to model: manipulating time shifts in a relational database is problematic, as is fitting such data into a normalized data model. History and linguistics are exemplars of areas where relational databases are cumbersome and which would benefit from the greater freedom afforded by data warehouse dimensional modeling.


2019, Vol. 35 (18), pp. 3538-3540
Author(s):  
Mehdi Ali ◽  
Charles Tapley Hoyt ◽  
Daniel Domingo-Fernández ◽  
Jens Lehmann ◽  
Hajira Jabeen

Summary: Knowledge graph embeddings (KGEs) have received significant attention in other domains due to their ability to predict links and create dense representations of graphs' nodes and edges. However, the software ecosystem for applying them to bioinformatics remains limited and inaccessible for users without expertise in programming and machine learning. Therefore, we developed BioKEEN (Biological KnowlEdge EmbeddiNgs) and PyKEEN (Python KnowlEdge EmbeddiNgs) to facilitate their easy use through an interactive command-line interface. Finally, we present a case study in which we used a novel biological pathway mapping resource to predict links that represent pathway crosstalks and hierarchies.

Availability and implementation: BioKEEN and PyKEEN are open-source Python packages publicly available under the MIT License at https://github.com/SmartDataAnalytics/BioKEEN and https://github.com/SmartDataAnalytics/PyKEEN.

Supplementary information: Supplementary data are available at Bioinformatics online.
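
As a starting point, recent releases of PyKEEN also expose a programmatic pipeline alongside the interactive CLI; a minimal sketch follows (the exact API is version-dependent and differs from the 2019 release described here):

```python
from pykeen.pipeline import pipeline

# Train a TransE model on the small built-in "Nations" benchmark graph.
result = pipeline(dataset="Nations", model="TransE")
result.save_to_directory("nations_transe")  # embeddings, metrics, config
```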


2017, Vol. 73 (6), pp. 469-477
Author(s):  
Tom Burnley ◽  
Colin M. Palmer ◽  
Martyn Winn

As part of its remit to provide computational support to the cryo-EM community, the Collaborative Computational Project for Electron cryo-Microscopy (CCP-EM) has produced a software framework which enables easy access to a range of programs and utilities. The resulting software suite incorporates contributions from different collaborators by encapsulating them in Python task wrappers, which are then made accessible via a user-friendly graphical user interface as well as a command-line interface suitable for scripting. The framework includes tools for project and data management. An overview of the design of the framework is given, together with a survey of the functionality at different levels. The current CCP-EM suite has particular strength in the building and refinement of atomic models into cryo-EM reconstructions, which is described in detail.
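
The wrapper pattern described can be illustrated with a generic sketch (these are not CCP-EM's actual classes; all names are invented): one task definition backs both a GUI form and a scriptable command line.

```python
import subprocess
from dataclasses import dataclass

@dataclass
class Task:
    program: str   # external binary being encapsulated
    args: dict     # parameter name -> value

    def command(self) -> list[str]:
        return [self.program, *(f"--{k}={v}" for k, v in self.args.items())]

    def run(self) -> int:
        """Launch the wrapped program; a GUI builds self.args from a form,
        a CLI from argv, but both reuse this single task definition."""
        return subprocess.run(self.command(), check=True).returncode

refine = Task("hypothetical_refine", {"model": "in.pdb", "map": "emd_1234.mrc"})
print(" ".join(refine.command()))
```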

