Engineering Information Into Open Documents

Author(s):  
Chia-Chu Chiang

Documents are well suited for information exchange via the Internet. To ensure that there are no misunderstandings, information embedded in a document needs to be precise and unambiguous. Having a (de facto) standard data model and conceptual information model ensures that the involved parties agree on what the information means. XML (eXtensible Markup Language) has become the de facto standard format for representing information in documents for document exchange. Many techniques have been proposed to create XML documents, including the validation and transformation of XML documents. However, little has been discussed about extracting information from non-XML documents and engineering that information into XML documents. The extraction process can be highly labor-intensive if it is done manually; automated tools would make it more efficient. In this chapter, the author briefly surveys document engineering techniques for XML documents and then presents two techniques to extract data from Windows documents into XML documents. These two techniques have been successfully applied in two industrial projects. The author believes that techniques that automate the extraction of data from non-XML documents into XML formats will enhance the use of XML documents.

Author(s):  
Dr. Manish L Jivtode

Web services are applications that allow devices to communicate over the Internet independently of the underlying technology. They use standardized eXtensible Markup Language (XML) for information exchange. A client or user invokes a web service by sending an XML message and then receives an XML response message. A number of communication protocols for web services use the XML format, such as Web Services Flow Language (WSFL) and Blocks Extensible Exchange Protocol (BEEP). Simple Object Access Protocol (SOAP) and Representational State Transfer (REST) are widely used options for accessing web services. The two are not directly comparable, because SOAP is a communications protocol while REST is a set of architectural principles for data transmission. In this paper, data sizes of 1 KB, 2 KB, 4 KB, 8 KB and 16 KB were each tested for audio and video, and results were obtained for CRUD methods. The encryption and decryption timings, in milliseconds or seconds, were recorded by programming extensibility points of a WCF REST web service in the Azure cloud.


Author(s):  
Sergio Flesca ◽  
Filippo Furfaro ◽  
Sergio Greco ◽  
Ester Zumpano

The World Wide Web is of strategic importance as a global repository for information and a means of communicating and sharing knowledge. Its explosive growth has caused deep changes in all aspects of human life, has been a driving force for the development of modern applications (e.g., Web portals, digital libraries, wrapper generators), and has greatly simplified access to existing sources of information, ranging from traditional DBMSs to semi-structured Web repositories. The adoption of XML (eXtensible Markup Language) by the World Wide Web Consortium (W3C) as the new standard for information exchange among Web applications has led researchers to investigate classical problems in the new environment of repositories containing large amounts of data in XML format.


Author(s):  
Joseph Fong ◽  
Herbert Shiu

Extensible Markup Language (XML) has become a standard for persistent storage and data interchange via the Internet due to its openness, self-descriptiveness and flexibility. This chapter proposes a systematic approach to reverse engineer arbitrary XML documents into their conceptual schema, the Extended DTD Graph, which is a DTD Graph with data semantics. The proposed approach not only determines the structure of the XML document, but also derives candidate data semantics from the XML element instances by treating each XML element instance as a record in a table of a relational database. One application of the derived data semantics is to verify the linkages among elements. Implicit and explicit referential linkages among XML elements are modeled by the parent-child structure and by ID/IDREF(S), respectively. As a result, an arbitrary XML document can be reverse engineered into its conceptual schema in an Extended DTD Graph format.
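
The linkage verification mentioned above can be illustrated with a minimal sketch. It is not the chapter's algorithm: it only checks that every explicit reference points to an existing identifier. Because no DTD is consulted, the attribute names `id` and `ref` are assumptions chosen for illustration, as is the function name `check_references`.

```python
import xml.etree.ElementTree as ET


def check_references(xml_text, id_attr="id", ref_attr="ref"):
    """Return (tag, token) pairs for references that match no identifier.

    id_attr and ref_attr are hypothetical attribute names; a real document
    would declare its ID and IDREF(S) attributes in a DTD.
    """
    root = ET.fromstring(xml_text)

    # Collect every identifier value found in the document.
    ids = {el.get(id_attr) for el in root.iter() if el.get(id_attr) is not None}

    dangling = []
    for el in root.iter():
        ref = el.get(ref_attr)
        if ref is None:
            continue
        # An IDREFS-style value is a whitespace-separated list of tokens.
        for token in ref.split():
            if token not in ids:
                dangling.append((el.tag, token))
    return dangling


if __name__ == "__main__":
    doc = '<lib><author id="a1"/><book ref="a1 a2"/></lib>'
    print(check_references(doc))
```

An empty result means every explicit reference resolves. Implicit parent-child linkages are already guaranteed by the tree structure of a well-formed document.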


2019 ◽  
Vol 9 (24) ◽  
pp. 5358 ◽  
Author(s):  
Shaohua Jiang ◽  
Liping Jiang ◽  
Yunwei Han ◽  
Zheng Wu ◽  
Na Wang

The growing scale and complexity of construction projects place higher demands on the level of collaboration among different stakeholders. How to achieve better information interoperability among multiple disciplines and different software platforms has become a key problem in the collaborative process. openBIM (building information model), as a common approach to information exchange, can meet the needs of information interaction among different software and improve the efficiency and accuracy of collaboration. To the best of our knowledge, there is currently no comprehensive survey of the openBIM approach in the context of the AEC (Architecture, Engineering & Construction) industry. This paper fills the gap and presents a literature review of openBIM. The openBIM-related standards, software platforms, and tools enabling information interoperability are introduced and analyzed comprehensively based on related websites and literature. Furthermore, engineering information interoperability research supported by openBIM is analyzed from the perspectives of information representation, information query, information exchange, information extension, and information integration. Finally, research gaps and future directions are presented based on the analysis of existing research. The systematic analysis of the theory and practice of openBIM in this paper can support its further research and application.


2014 ◽  
Vol 539 ◽  
pp. 251-255
Author(s):  
Zi Li Jiang

Human society has entered the information age, and the information revolution is in full swing. Computing has shifted from numerical calculation to comprehensive problem solving. People are gradually transferring information into computer systems, which creates a need for information management, information engineering, information exchange, and other branches of science. Given the rapidly expanding range of problems to be solved, existing computer functions fall seriously short and lack intelligence. Because of this lack of intelligence, computer science has been unable to cover the full scope of information science. This paper mainly explains the development and application of computers in mobile technology.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11333
Author(s):  
Daniyar Karabayev ◽  
Askhat Molkenov ◽  
Kaiyrgali Yerulanuly ◽  
Ilyas Kabimoldayev ◽  
Asset Daniyarov ◽  
...  

Background: High-throughput sequencing platforms generate a massive amount of high-dimensional genomic datasets that are available for analysis. Modern and user-friendly bioinformatics tools for the analysis and interpretation of genomics data become essential during the analysis of sequencing data. Different standard data types and file formats have been developed to store and analyze sequence and genomics data. Variant Call Format (VCF) is the most widespread genomics file type and standard format containing genomic information and variants of sequenced samples. Results: Existing tools for processing VCF files usually do not have an intuitive graphical interface and instead offer only a command-line interface, which may be challenging to use for the broader biomedical community interested in genomics data analysis. re-Searcher solves this problem by pre-processing VCF files in chunks so as not to overload the computer's RAM. The tool can be used as a standalone, user-friendly, multiplatform GUI application as well as a web application (https://nla-lbsb.nu.edu.kz). The software, including source code, tested VCF files, and additional information, is publicly available in the GitHub repository (https://github.com/LabBandSB/re-Searcher).
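
The idea of processing a VCF file in chunks can be sketched as follows. This is an illustrative sketch and not re-Searcher's implementation: the function name `iter_vcf_chunks` and the default chunk size are assumptions, and the code only splits tab-separated data lines without interpreting VCF fields.

```python
import io
from itertools import islice


def iter_vcf_chunks(handle, chunk_size=1000):
    """Yield lists of parsed data lines, at most chunk_size lines per list.

    Header and meta lines (starting with '#') and blank lines are skipped.
    Only one chunk is held in memory at a time.
    """
    records = (
        line.rstrip("\n").split("\t")
        for line in handle
        if line.strip() and not line.startswith("#")
    )
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    sample = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\n"
        "1\t100\t.\tA\tG\n"
        "1\t200\t.\tC\tT\n"
        "2\t300\t.\tG\tA\n"
    )
    for chunk in iter_vcf_chunks(io.StringIO(sample), chunk_size=2):
        print(len(chunk), "records")
```

Because the generator reads the file handle lazily, memory use is bounded by the chunk size and not by the file size.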


Author(s):  
Samir Mohammad ◽  
Patrick Martin

Extensible Markup Language (XML), which provides a flexible way to define semistructured data, is a de facto standard for information exchange on the World Wide Web. The trend towards storing data in its XML format has led to rapid growth in XML databases and in the need to query them. Indexing plays a key role in improving the execution of a query. In this chapter, the authors give a brief history of the creation and development of the XML data model. They discuss the three main categories of indexes proposed in the literature to handle the XML semistructured data model and provide an evaluation of indexing schemes within these categories. Finally, they discuss limitations and open problems related to the major existing indexing schemes.


Author(s):  
Hadj Mahboubi ◽  
Jérôme Darmont

Since XML (eXtensible Markup Language) (Bray, Paoli, Sperberg-McQueen, Maler & Yergeau, 2004) emerged as a standard for information representation and exchange, storing, indexing, and querying XML documents have become major issues in database research. Query processing and optimization are very important in this context, and indices are data structures that help enhance performance substantially. Though XML indexing concepts are mainly inherited from relational databases, XML indices have numerous specific features of their own. The aim of this chapter is to present an overview of state-of-the-art XML indices and to discuss the main issues, trade-offs, and future trends in XML indexing. Furthermore, since XML is gaining importance for representing business data for analytics (Beyer, Chamberlin, Colby, Özcan, Pirahesh & Xu, 2005), we also present an index we developed specifically for XML data warehouses.

