machine readable
Recently Published Documents


Total documents: 1013 (five years: 333)
H-index: 24 (five years: 7)

Author(s):  
Kaneeka Vidanage ◽  
Noor Maizura Mohamad Noor ◽  
Rosmayati Mohemad ◽  
Zuriana Abu Bakar

Ontologies are domain-specific conceptualizations that are both human- and machine-readable. Because of this attribute, their applications are not limited to computing domains. Banking, medicine, agriculture, and law are a few of the non-computing domains where ontologies are used very effectively. When creating ontologies for non-computing domains, the involvement of domain specialists such as bankers, lawyers, and farmers becomes vital. Because these specialists are not semantic specialists, specially designed visualization assistance is required for ontology schema verification and sense-making. Existing visualization methods are not fine-tuned for non-technical domain specialists and involve many complexities. In this research, a novel algorithm capable of generating a visualization canvas that is friendlier to domain specialists has been explored. The proposed algorithm and the visualization canvas have been tested in three different domains, and an overall success of 85% has been achieved.


Author(s):  
Erhan Turan ◽  
Umut Orhan

In this study, a novel confidence indexing algorithm is proposed to minimize the human labor needed to check the reliability of synsets automatically extracted from a non-machine-readable monolingual dictionary. The Contemporary Turkish Dictionary of the Turkish Language Association is used as the monolingual dictionary data. First, synonym relations are extracted from the dictionary definitions by traditional text processing methods, and a graph is built in a Lemma-Sense network architecture. After each synonym relation is labeled with a proper confidence index, synonym pairs with the desired confidence indexes are analyzed to detect synsets with a spanning tree-based method. This approach can label synsets with one of three cumulative confidence levels (CL-1, CL-2, and CL-3). According to the confidence levels, the synsets are compared with KeNet, which is the only open-access Turkish WordNet. While most matches with the synsets of KeNet are found at the CL-1 and CL-2 confidence levels, the synsets determined at the CL-3 level reveal errors in the dictionary definitions. This novel approach not only determines the reliability of automatically detected synsets but can also point out errors in the synsets detected from the dictionary.
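The abstract does not spell out the spanning tree-based method, so the following is only a rough sketch of the general idea under stated assumptions: synonym relations are weighted edges between senses, relations below a chosen confidence are dropped, and each tree of the resulting spanning forest is read as one candidate synset. The numeric confidence scale, the function name `group_synsets`, and the toy sense labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def group_synsets(edges, min_confidence):
    """Group senses into candidate synsets via a spanning forest.

    edges: iterable of (sense_a, sense_b, confidence) synonym relations.
    Relations with confidence below min_confidence are ignored.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal-style pass: strongest relations first; an edge joins the
    # forest only when it connects two different trees.
    for a, b, conf in sorted(edges, key=lambda e: -e[2]):
        if conf < min_confidence:
            continue
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Each tree of the forest is one candidate synset.
    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical toy relations: (sense, sense, confidence in [0, 1]).
toy_edges = [
    ("home#1", "house#1", 0.9),
    ("house#1", "dwelling#1", 0.8),
    ("home#2", "family#1", 0.4),
    ("face#1", "visage#1", 0.7),
]

print(group_synsets(toy_edges, min_confidence=0.5))
```

With a threshold of 0.5 the low-confidence pair is left out and two groups remain; lowering the threshold to 0.3 admits it as a third group, which loosely mirrors the cumulative nature of the confidence levels described above.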


AI & Society ◽  
2022 ◽  
Author(s):  
Brenda O’Neill ◽  
Larry Stapleton

Abstract: This paper is a survey of standards being used in the domain of digital cultural heritage, with a focus on the Metadata Encoding and Transmission Standard (METS) created by the Library of Congress in the United States of America. The digitization of cultural heritage requires silo breaking in a number of areas. One area is that of academic disciplines, to enable rich interdisciplinary work. This lays the foundation for opening up the second form of silo: the silos of knowledge, both traditional and born digital, held in individual institutions such as galleries, libraries, archives and museums. Disciplinary silo breaking is the key to unlocking these institutional knowledge silos. Interdisciplinary teams, such as developers and librarians, work together to make the data accessible as open data on the “semantic web”. Description logic is the area of mathematics which underpins many ontology building applications today. Creating these ontologies requires a human–machine symbiosis. Currently in the cultural heritage domain, the institutions’ role is that of provider of this open data to the national aggregator, which in turn can make the data available to the trans-European aggregator known as Europeana. Current ingests to the aggregators are in the form of machine readable cataloguing metadata, which is limited in the richness it provides to disparate object descriptions. METS can provide this richness.


2022 ◽  
Vol 6 (1) ◽  
pp. 3
Author(s):  
Thaddaeus J. Kiker ◽  
Nina Hooper ◽  
Martin Elvis

Abstract: Dozens of exotic materials are found only in meteorites. These “meteorite minerals” are formed in the solar system's cold, long-lived, proto-planetary disk, in the slowly cooling cores of planetesimals, and in high-speed collisions. To the best of our knowledge no recent published work has aggregated information about minerals only found in meteorites in a comprehensive and machine readable manner. Thus, we have compiled a preliminary catalog of 81 known meteorite minerals from the literature to serve as a stepping stone for a future, more extensive effort. We also explore the distribution of these meteorite minerals by meteorite type.


Data Science ◽  
2022 ◽  
pp. 1-42
Author(s):  
Stian Soiland-Reyes ◽  
Peter Sefton ◽  
Mercè Crosas ◽  
Leyla Jael Castro ◽  
Frederik Coppens ◽  
...  

An increasing number of researchers support reproducibility by including pointers to and descriptions of datasets, software and methods in their publications. However, scientific articles may be ambiguous, incomplete and difficult to process by automated systems. In this paper we introduce RO-Crate, an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine readable manner. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices to formally describe metadata in an accessible and practical way for their use in a wide variety of situations. An RO-Crate is a structured archive of all the items that contributed to a research outcome, including their identifiers, provenance, relations and annotations. As a general purpose packaging approach for data and their metadata, RO-Crate is used across multiple areas, including bioinformatics, digital humanities and regulatory sciences. By applying “just enough” Linked Data standards, RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility. An RO-Crate for this article (https://w3id.org/ro/doi/10.5281/zenodo.5146227) is archived at https://doi.org/10.5281/zenodo.5146227.
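To make the packaging idea concrete, here is a minimal sketch of what an RO-Crate metadata file might look like when built with plain Python. It assumes the RO-Crate 1.1 layout, in which a `ro-crate-metadata.json` file holds a JSON-LD `@graph` containing a metadata descriptor and a root dataset entity. The helper name `minimal_ro_crate`, the dataset name, and the file names are hypothetical, and the specification asks for further properties on the root dataset (such as a publication date and a license) that are omitted here for brevity.

```python
import json


def minimal_ro_crate(name, description, files):
    """Build a minimal RO-Crate-style metadata document as a dict.

    files: list of (relative_path, human_readable_label) pairs.
    """
    # One "File" entity per packaged artefact.
    file_entities = [
        {"@id": path, "@type": "File", "name": label}
        for path, label in files
    ]
    graph = [
        # Metadata descriptor: says which entity is the root dataset.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset: the crate itself, listing its parts by reference.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": e["@id"]} for e in file_entities],
        },
    ]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": graph + file_entities,
    }


# Hypothetical example content.
crate = minimal_ro_crate(
    name="Example analysis",
    description="Input data and script for an illustrative analysis.",
    files=[("data.csv", "Input measurements"), ("analyse.py", "Analysis script")],
)

print(json.dumps(crate, indent=2))
```

The printed JSON would be saved as `ro-crate-metadata.json` next to the listed files; the flat `@graph` with cross-references by `@id` is what keeps the description machine readable while staying simple to write by hand.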


2021 ◽  
Author(s):  
Adam J H Newton ◽  
David Chartash ◽  
Steven H Kleinstein ◽  
Robert A McDougal

Objective: The accelerating pace of biomedical publication has made retrieving papers and extracting specific comprehensive scientific information a key challenge. A timely example of such a challenge is to retrieve the subset of papers that report on immune signatures (coherent sets of biomarkers) to understand the immune response mechanisms which drive differential SARS-CoV-2 infection outcomes. A systematic and scalable approach is needed to identify and extract COVID-19 immune signatures in a structured and machine-readable format. Materials and Methods: We used SPECTER embeddings with SVM classifiers to automatically identify papers containing immune signatures. A generic web platform was used to manually screen papers and allow anonymous submission. Results: We demonstrate a classifier that retrieves papers with human COVID-19 immune signatures with a positive predictive value of 86%. Semi-automated queries to the corresponding authors of these publications requesting signature information achieved a 31% response rate. This demonstrates the efficacy of using an SVM classifier with document embeddings of the abstract and title to retrieve papers with scientifically salient information, even when that information is rarely present in the abstract. Additionally, classification based on the embeddings identified the type of immune signature (e.g., gene expression vs. other types of profiling) with a positive predictive value of 74%. Conclusions: Coupling a classifier based on document embeddings with direct author engagement offers a promising pathway to build a semi-structured representation of scientifically relevant information. Through this approach, partially automated literature mining can help rapidly create semi-structured knowledge repositories for automatic analysis of emerging health threats.


Author(s):  
Anupam Agrawal ◽  

The paper describes a method of intrusion detection based on machine learning algorithms. The experiments were conducted on the KDD’99 Cup dataset, which is imbalanced; as a result, recall for some classes is drastically low because the dataset does not contain enough instances of them. For preprocessing, One Hot Encoding and Label Encoding were applied to make the dataset machine readable. The dimensionality of the dataset was reduced using Principal Component Analysis, and classification of the dataset into classes, viz. attack and normal, was done by a Naïve Bayes classifier. Due to the imbalanced nature of the dataset, the focus shifted to recall and overall recall, and the results were compared with other models which have achieved great accuracy. Based on the results, using a self-optimizing loop, the model achieved better geometric mean accuracy.
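Two of the steps named above can be illustrated with a small, dependency-free sketch: one-hot encoding of a categorical feature, and a geometric mean of per-class recall as an imbalance-aware score. This is not the paper's implementation. The abstract does not define its "geometric mean accuracy", so taking the geometric mean of the per-class recalls is an assumption, and the function names and toy values are hypothetical.

```python
import math


def one_hot(values):
    """One-hot encode a list of categorical values.

    Returns the sorted category list and one 0/1 row per input value.
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = [
        [1 if index[v] == i else 0 for i in range(len(categories))]
        for v in values
    ]
    return categories, rows


def per_class_recall(y_true, y_pred):
    """Recall for each class: correctly predicted / actual instances."""
    recalls = {}
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total
    return recalls


def geometric_mean(values):
    """Geometric mean; drops sharply if any single value is low."""
    values = list(values)
    return math.prod(values) ** (1 / len(values))


# Toy categorical feature (hypothetical values).
categories, rows = one_hot(["tcp", "udp", "tcp", "icmp"])
print(categories, rows)

# Toy imbalanced labels: 8 "normal" and 2 "attack" instances.
y_true = ["normal"] * 8 + ["attack"] * 2
y_pred = ["normal"] * 8 + ["normal", "attack"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recalls = per_class_recall(y_true, y_pred)
g_mean = geometric_mean(recalls.values())
print(accuracy, recalls, g_mean)
```

In the toy labels, plain accuracy is 0.9 even though half of the minority class is missed; the per-class recalls (1.0 and 0.5) and their geometric mean of about 0.71 expose that gap, which is the motivation for focusing on recall with an imbalanced dataset.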


2021 ◽  
Author(s):  
Ashleigh Hawkins

Abstract: Mass digitisation and the exponential growth of born-digital archives over the past two decades have resulted in an enormous volume of archives and archival data being available digitally. This has produced a valuable but under-utilised source of large-scale digital data ripe for interrogation by scholars and practitioners in the Digital Humanities. However, current digitisation approaches fall short of the requirements of digital humanists for structured, integrated, interoperable, and interrogable data. Linked Data provides a viable means of producing such data, creating machine-readable archival data suited to analysis using digital humanities research methods. While a growing body of archival scholarship and praxis has explored Linked Data, its potential to open up digitised and born-digital archives to the Digital Humanities is under-examined. This article approaches Archival Linked Data from the perspective of the Digital Humanities, extrapolating from both archival and digital humanities Linked Data scholarship to identify the benefits to digital humanists of the production and provision of access to Archival Linked Data. It will consider some of the current barriers preventing digital humanists from being able to experience the benefits of Archival Linked Data evidenced, and to fully utilise archives which have been made available digitally. The article argues for increased collaboration between the two disciplines, challenges individuals and institutions to engage with Linked Data, and suggests the incorporation of AI and low-barrier tools such as Wikidata into the Linked Data production workflow in order to scale up the production of Archival Linked Data as a means of increasing access to and utilisation of digitised and born-digital archives.


2021 ◽  
pp. 1-10
Author(s):  
María Auxilio Medina Nieto ◽  
Jorge de la Calleja Mora ◽  
Claudia Zepeda Cortés ◽  
Eduardo López Domínguez

This paper describes Onto4AIR2, an ontology to manage theses from open repositories. The ontology fosters unique and formal definitions of concepts from the Mexican repositories domain in English and Spanish. Its goal is to support the construction of machine-readable datasets that are semantically labeled for further consultation in educational organizations. The ontology instances are sample data of theses from the National Repository of Mexico, an initiative promoted by the National Council of Science and Technology. The paper describes advantages derived from the formalisms of the ontology and an assessment technique in which the participants are developers and potential users. Developers followed a competency questions-based approach and determined that the ontology represents questions and answers using its terminology, whereas potential users participated in a satisfaction survey whose results showed a positive perception. At present, the ontology is at the proof-of-concept level.

