xml documents Latest Research Papers

XML2HBase: Storing and querying large collections of XML documents using a NoSQL database system

Journal of Parallel and Distributed Computing ◽

10.1016/j.jpdc.2021.11.003 ◽

2021 ◽

Author(s):

Liang Bao ◽

Jin Yang ◽

Chase Q. Wu ◽

Haiyang Qi ◽

Xin Zhang ◽

...

Keyword(s):

Database System ◽

Xml Documents ◽

Nosql Database

Software review: The JATSdecoder package—extract metadata, abstract and sectioned text from NISO-JATS coded XML documents; Insights to PubMed central’s open access database

Scientometrics ◽

10.1007/s11192-021-04162-z ◽

2021 ◽

Author(s):

Ingmar Böschen

Keyword(s):

Open Access ◽

Reference List ◽

Pubmed Central ◽

Xml Documents ◽

Author Identification ◽

Open Access Database ◽

Document Collection ◽

The Subject ◽

The Rich ◽

Selection Of

JATSdecoder is a general toolbox which facilitates text extraction and analytical tasks on NISO-JATS coded XML documents. Its function JATSdecoder() outputs metadata, the abstract, the sectioned text and reference list as easily selectable elements. One of the biggest repositories for open access full texts covering biology and the medical and health sciences is PubMed Central (PMC), with more than 3.2 million files. This report provides an overview of the PMC document collection processed with JATSdecoder(). The development of extracted tags is displayed for the full corpus over time and in greater detail for some meta tags. Possibilities and limitations for text miners working with scientific literature are outlined. The NISO-JATS tags are used quite consistently nowadays and allow a reliable extraction of metadata and text elements. International collaborations are more present than ever. There are obvious errors in the date stamps of some documents. Only about half of all articles from 2020 contain at least one author listed with an author identification code. Since many authors share the same name, the identification of person-related content is problematic, especially for authors with Asian names. JATSdecoder() reliably extracts key metadata and text elements from NISO-JATS coded XML files. When combined with the rich, publicly available content within PMC's database, new monitoring and text mining approaches can be carried out easily. Any selection of article subsets should be carefully performed with in- and exclusion criteria on several NISO-JATS tags, as both the subject and keyword tags are used quite inconsistently.
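To give a rough idea of what extracting metadata from a NISO-JATS file involves, here is a minimal Python sketch. It is not the JATSdecoder package (which is written in R) and does not reproduce its behavior. The sample document, the `extract_metadata` function, and the returned dictionary layout are all hypothetical; only the JATS element names (`article-meta`, `article-title`, `contrib`, `contrib-id`, `abstract`) come from the JATS tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified JATS fragment used only for illustration.
SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group><article-title>An example title</article-title></title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0000-0000-0000</contrib-id>
          <name><surname>Doe</surname><given-names>Jane</given-names></name>
        </contrib>
        <contrib contrib-type="author">
          <name><surname>Roe</surname><given-names>Richard</given-names></name>
        </contrib>
      </contrib-group>
      <abstract><p>Short abstract text.</p></abstract>
    </article-meta>
  </front>
</article>"""


def text_of(elem):
    """Return the whitespace-normalized text inside elem, or '' if elem is None."""
    if elem is None:
        return ""
    return " ".join("".join(elem.itertext()).split())


def extract_metadata(xml_text):
    """Collect title, abstract, and authors (with optional identifier) from JATS XML."""
    root = ET.fromstring(xml_text)
    meta = root.find("./front/article-meta")
    authors = []
    for contrib in meta.findall("./contrib-group/contrib[@contrib-type='author']"):
        name = contrib.find("name")
        if name is None:  # e.g. group authors without a personal name
            continue
        full = f"{text_of(name.find('given-names'))} {text_of(name.find('surname'))}"
        authors.append({
            "name": full.strip(),
            # Authors without an identification code get None here.
            "id": text_of(contrib.find("contrib-id")) or None,
        })
    return {
        "title": text_of(meta.find("./title-group/article-title")),
        "abstract": text_of(meta.find("abstract")),
        "authors": authors,
    }


if __name__ == "__main__":
    print(extract_metadata(SAMPLE))
```

In this sketch the second author has no `contrib-id`, which mirrors the situation the abstract describes: without an identification code, only the name is available to tell authors apart.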

Classification of XML Documents Using Semantic Resources

10.1109/icrami52622.2021.9585995 ◽

2021 ◽

Author(s):

Makhlouf Ledmi ◽

Abdeldjalil Ledmi ◽

Mohammed El Habib Souidi

Keyword(s):

Xml Documents ◽

Semantic Resources

A New Hybrid Approach for Encrypting XML Documents

10.1109/icrito51393.2021.9596067 ◽

2021 ◽

Author(s):

Anil Kumar ◽

Priyanka Dahiya ◽

Garima ◽

Sarvesh Tanwar

Keyword(s):

Hybrid Approach ◽

Xml Documents

Recognize, Annotate, and Visualize Parallel Content Structures in XML Documents

10.1109/jcdl52503.2021.00078 ◽

2021 ◽

Author(s):

Marco Beck ◽

Moritz Schubotz ◽

Vincent Stange ◽

Norman Meuschke ◽

Bela Gipp

Keyword(s):

Xml Documents

The (unspoken) XML gotcha

Proceedings of Balisage: The Markup Conference 2021 ◽

10.4242/balisagevol26.usdin01 ◽

2021 ◽

Author(s):

B. Tommie Usdin

Keyword(s):

Time Cost ◽

Current Version ◽

Technical Debt ◽

Short Life ◽

The Real ◽

Xml Documents ◽

Real Work ◽

Long Time

XML is a platform-neutral way to exchange, share, and manipulate information. But what persuades many to use XML is the claim that XML provides a long-term way to store information, independent of tools (both hardware and software) with their short life spans. Projects spend significant resources on XML setup and then settle into doing the real work, using that XML infrastructure to compile, write, analyze, or whatever it is they do. Until, one day — something doesn’t work. Hardware is retired; software is upgraded; specifications go into new releases. Users get stuck. And when they complain, we respond that “of course that doesn’t work any more, you have been accumulating technical debt for years! It is time to reinvest.” They thought they had committed to a one-time cost, and now we tell them that it is an ongoing expense. If the user had put documents into their favorite spreadsheet, they complain, they could still import them into the current version. How do we answer that complaint? We (the XMLers) think we described the values of XML plainly and fairly. We (the XML users) think that the claim that XML documents last a long time is relying on a specious technicality, and we have been trapped dishonestly. I live on both sides of this: as a user I want to invest in infrastructure once and have it last; as a developer I want to be able to improve my product without the limitations imposed by backwards compatibility. We as a community often complain that not enough people are using XML. If we really want XML use to grow, we need to address the gotcha that too many XML users are feeling.

A Novel Three-Way Merge Algorithm for HTML/XML Documents Using a Hidden Markov Model

Lecture Notes in Networks and Systems - Intelligent Computing ◽

10.1007/978-3-030-80119-9_3 ◽

2021 ◽

pp. 75-101

Author(s):

Nikolaos G. Bakaoukas ◽

Anastasios G. Bakaoukas

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Xml Documents

A Framework to Analyze Business Process Log in XML Format

Turkish Journal of Computer and Mathematics Education (TURCOMAT) ◽

10.17762/turcomat.v12i3.1264 ◽

2021 ◽

Vol 12 (3) ◽

pp. 2623-2630

Author(s):

Ang Jin Sheng et al.

Keyword(s):

Data Mining ◽

Business Process ◽

Accurate Result ◽

Structured Data ◽

Web Pages ◽

Web Searching ◽

General Application ◽

Data Mining Techniques ◽

Xml Documents ◽

The Relationship

XML has numerous uses in a wide variety of web pages and applications. Some common uses of XML include tasks for web publishing, web searching and automation, and general applications such as utilizing, storing, transferring and displaying business process log data. The amount of information expressed in XML has grown rapidly. Much work has been done on sensible approaches to address issues related to the handling and review of XML documents. Mining XML documents offers a way to understand both the structure and the content of XML documents. A common approach capable of analysing XML documents is frequent subtree mining. Frequent subtree mining is one of the data mining techniques that finds the relationship between transactions in a tree structured database. Due to the structure and the content of the XML format, traditional data mining and statistical analysis can hardly be applied to get accurate results. This paper proposes a framework that can flatten tree structured data into flat, structured data while preserving its structure and content. Converting these XML documents into relational structured data allows a range of data mining techniques and statistical tests to be applied to extract more information from the business process log.
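The idea of flattening tree structured data into rows can be illustrated with a small Python sketch. This is not the framework proposed in the paper, whose details are not given in the abstract. The log format, the `flatten` and `events_to_rows` helpers, and the choice of element paths as column names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical business process log; the tag names are invented for this example.
LOG = """<log>
  <trace id="t1">
    <event><name>submit</name><cost>10</cost></event>
    <event><name>approve</name><cost>5</cost></event>
  </trace>
</log>"""


def flatten(elem, path=""):
    """Yield (path, value) pairs for every attribute and text node under elem."""
    here = f"{path}/{elem.tag}"
    for attr, value in elem.attrib.items():
        yield f"{here}/@{attr}", value
    text = (elem.text or "").strip()
    if text:
        yield here, text
    for child in elem:
        yield from flatten(child, here)


def events_to_rows(xml_text):
    """Turn each <event> into one flat row; column names keep the element path."""
    root = ET.fromstring(xml_text)
    rows = []
    for trace in root.iter("trace"):
        for event in trace.iter("event"):
            row = {"trace_id": trace.get("id")}
            row.update(flatten(event))
            rows.append(row)
    return rows


if __name__ == "__main__":
    for row in events_to_rows(LOG):
        print(row)
```

Using the element path as the column name is one simple way to keep structural information in the flat table. Repeated sibling tags inside one event would overwrite each other in this sketch, so a fuller treatment would need positional indices in the path.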

Designing and Developing e-Assessment Delivery System Under IMS QTI ver.2.2 Specification

International Journal of Emerging Technologies in Learning (iJET) ◽

10.3991/ijet.v16i01.16257 ◽

2021 ◽

Vol 16 (01) ◽

pp. 219

Author(s):

Mohammed Boussakuk ◽

Ahmed Bouchboua ◽

Mohammed El Ghazi ◽

Moulhime El Bekkali ◽

Mohammed Fattah

Keyword(s):

Software Engineering ◽

Delivery System ◽

Learning Management Systems ◽

Authoring Tool ◽

Primary Task ◽

Management Systems ◽

Learning Resources ◽

Global Learning ◽

Xml Documents ◽

Online Examination

Using technology to create online examinations is important; more than that, exploring recent research on the standardization of learning and assessment content is of paramount importance. However, there is still a lack of reusable, interoperable and exchangeable learning resources that afford much motivation to instructors. Assessing acquired knowledge is the primary task of any Learning Management System (LMS). Consequently, sharing evaluation contents is a major challenge for software engineering. In this paper, we present the “CleverTesting” assessment authoring tool, focusing on its architecture and its conformance to the Question and Test Interoperability (QTI) specification proposed by IMS Global Learning Consortium. We adopt the QTI specification to create reusable, interoperable and sharable assessment content. Our system has three important components: The first component is “CleverTesting”, a quiz authoring tool that allows building sharable question items. The second is a question bank. The last component is a player for interpreting and parsing QTI XML documents.

A Dual-Index Based Representation for Processing XPath Queries on Very Large XML Documents

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Cloud Computing ◽

10.1007/978-3-030-69992-5_2 ◽

2021 ◽

pp. 18-30

Author(s):

Wei Hao ◽

Kiminori Matsuzaki ◽

Shigeyuki Sato

Keyword(s):

Xml Documents
