machine readable form
Recently Published Documents

Abstract The paper proposes the implementation of the method of optical recognition of technical documentation and the transformation of graphic information into a machine-readable form available for cognitive analysis, which is based on the methods of binarization and alignment of images, text segmentation and recognition. The use of the proposed method will provide a dramatic reduction in the costs of cataloging, checking the completeness and inventory of documentation, as well as an increase in design quality due to the semantic analysis of documentation using a knowledge base that is updated automatically. The article presents the development of the algorithm for optical recognition of a document, preparation of an image for optical recognition of a document, an example of the application of the Sauvola method for binarization of an image, and an analysis of the research results. The proposed implementation allows the text recognition on scanned/photographed documents.

Download Full-text

Towards automation of operando experiments: A case study in contactless conductivity measurements

10.33774/chemrxiv-2021-mh17g ◽

2021 ◽

Author(s):

Peter Kraus ◽

Elisabeth Wolf ◽

Charlotte Prinz ◽

Giulia Bellini ◽

Annette Trunschke ◽

...

Keyword(s):

Best Practice ◽

Conductivity Measurement ◽

Lessons Learned ◽

Main Body ◽

Process Data ◽

Post Process ◽

Set Up ◽

Machine Readable ◽

Machine Readable Form

Automation of experiments is a key component on the path of digitalisation in catalysis and related sciences. Here we present the lessons learned and caveats avoided during the automation of our contactless conductivity measurement set-up, capable of operando measurement of catalytic samples. We briefly discuss the motivation behind the work, the technical groundwork required, and the philosophy guiding our design. The main body of this work is dedicated to the detailing of the implementation of the automation, data structures, as well as the modular data processing pipeline. The open-source toolset developed as part of this work allows us to carry out unattended and reproducible experiments, as well as post-process data according to current best practice. This process is illustrated by implementing two routine sample protocols, one of which was included in the Handbook of Catalysis, providing several case studies showing the benefits of such automation, including increased throughput and higher data quality. The datasets included as part of this work contain catalytic and operando conductivity data, and are self-consistent, annotated with metadata, and are available on a public repository in a machine-readable form. We hope the datasets as well as the tools and workflows developed as part of this work will be an useful guide on the path towards automation and digital catalysis.

Download Full-text

Improving Recommendations for Online Retail Markets Based on Ontology Evolution

Electronics ◽

10.3390/electronics10141650 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1650

Author(s):

Rana Alaa ◽

Mariam Gawish ◽

Manuel Fernández-Veiga

Keyword(s):

Semantic Web ◽

Recommendation System ◽

User Preferences ◽

Personalized Recommendation ◽

Purchase Decisions ◽

Online Retail ◽

Ontology Evolution ◽

The One ◽

Machine Readable ◽

Machine Readable Form

The semantic web is considered to be an extension of the present web. In the semantic web, information is given with well-defined meanings, and thus helps people worldwide to cooperate together and exchange knowledge. The semantic web plays a significant role in describing the contents and services in a machine-readable form. It has been developed based on ontologies, which are deemed the backbone of the semantic web. Ontologies are a key technique with which semantics are annotated, and they provide common comprehensible foundation for resources on the semantic web. The use of semantics and artificial intelligence leads to what is known to be “Smarter Web”, where it will be easy to retrieve what customers want to see on e-commerce platforms, and thus will help users save time and enhance their search for the products they need. The semantic web is used as well as webs 3.0, which helps enhancing systems performance. Previous personalized recommendation methods based on ontologies identify users’ preferences by means of static snapshots of purchase data. However, as the user preferences evolve with time, the one-shot ontology construction is too constrained for capturing individual diverse opinions and users’ preferences evolution over time. This paper will present a novel recommendation system architecture based on ontology evolution, the proposed subsystem architecture for ontology evolution. Furthermore, the paper proposes an ontology building methodology based on a semi-automatic technique as well as development of online retail ontology. Additionally, a recommendation method based on the ontology reasoning is proposed. Based on the proposed method, e-retailers can develop a more convenient product recommendation system to support consumers’ purchase decisions.

Download Full-text

Taking the Pulse of Hospitals’ Response to the New Price Transparency Rule

Medical Care Research and Review ◽

10.1177/10775587211024786 ◽

2021 ◽

pp. 107755872110247

Author(s):

Sayeh Nikpay ◽

Ezra Golberstein ◽

Hannah T. Neprash ◽

Caitlin Carroll ◽

Jean M. Abraham

Keyword(s):

Informed Decision Making ◽

Size Category ◽

Price Transparency ◽

Reporting Requirements ◽

Information Reporting ◽

Readable Form ◽

Nationally Representative ◽

Machine Readable ◽

Machine Readable Form ◽

Complete Reporting

As of January 1, 2021, most U.S. hospitals are required to publish pricing information on their website to promote more informed decision making by consumers regarding their care. In a nationally representative sample of 470 hospitals, we analyzed whether hospitals met price transparency information reporting requirements and the extent to which complete reporting was associated with ownership status, bed size category, system affiliation, and location in a metropolitan area. Fewer than one quarter of sampled hospitals met the price transparency information requirements of the new rule, which include five types of standard charges in machine-readable form and the consumer-shoppable display of 300 shoppable services. Our analyses of hospital reporting by organizational and market attributes revealed limited differences, with some exceptions for nonprofit and system-member hospitals demonstrating greater responsiveness with respect to the consumer-shoppable aspects of the rule.

Download Full-text

Corpus Linguistics and Eighteenth Century Collections Online (ECCO)

Research in Corpus Linguistics ◽

10.32714/ricl.09.01.03 ◽

2021 ◽

Vol 9 (1) ◽

pp. 19-34

Author(s):

Mikko Tolonen ◽

Eetu Mäkelä ◽

Ali Ijaz ◽

Leo Lahti

Keyword(s):

Eighteenth Century ◽

Corpus Linguistics ◽

Character Recognition ◽

Optical Character Recognition ◽

Historical Source ◽

Optical Character ◽

Key Aspects ◽

Machine Readable ◽

Machine Readable Form

Eighteenth Century Collections Online (ECCO) is the most comprehensive dataset available in machine-readable form for eighteenth-century printed texts. It plays a crucial role in studies of eighteenth-century language and it has vast potential for corpus linguistics. At the same time, it is an unbalanced corpus that poses a series of different problems. The aim of this paper is to offer a general overview of ECCO for corpus linguistics by analysing, for example, its publication countries and languages. We will also analyse the role of the substantial number of reprints and new editions in the data, discuss genres and the estimates of Optical Character Recognition (OCR) quality. Our conclusion is that whereas ECCO provides a valuable source for corpus linguistics, scholars need to pay attention to historical source criticism. We have highlighted key aspects that need to be taken into consideration when considering its possible uses.

Download Full-text

Ein Metadatenlabel für das semantische Sensorweb

10.31237/osf.io/fs48a ◽

2020 ◽

Author(s):

Anika Graupner

Keyword(s):

Cloud Computing ◽

World Wide Web ◽

Sensor Network ◽

Linked Data ◽

World Wide ◽

Sensor Data ◽

Readable Form ◽

Machine Readable ◽

Machine Readable Form ◽

New Machine

Deutsch: Das World Wide Web entwickelt sich zu einem semantischen Web. Daten werden mit Hilfe von Ontologien semantisch annotiert, um sie maschinenlesbar bereitzustellen. Zu diesen Daten zählen auch Sensordaten. Da die Anzahl an online zur Verfügung gestellten Sensordaten steigt, fällt es dem Nutzer immer schwerer, geeignete Daten zu finden. Die neuen Sensorbeschreibungen in Form von Ontologien werden in der folgenden Arbeit dazu eingesetzt, dem Nutzer die Suche wieder zu vereinfachen. Die Erfassung von Informationen aus den neuen maschinenlesbaren Beschreibungen kann für den Menschen schwierig sein. Daher wird das von GeoViQua ursprünglich für GEOSS entwickelte GEO Label dazu eingesetzt, auch visuelle Zusammenfassungen der Verfügbarkeit von Informationen in Sensorbeschreibungen bereitzustellen. Das Label besteht aus Facetten, die Auskunft über das Vorhandensein von Informationen wie Herstellerprofil, Benutzerfeedback und Qualitätsinformationen in Metadatendokumenten geben. In der vorliegenden Arbeit wird die 52 North Implementierung der Java GEO Label API um die Generierung von Labels basierend auf der Semantic Sensor Network Ontology (SSNO) und verbundenen Linked Data Ontologien erweitert. Zuvor werden aus diesen Ontologien Informationsquellen für die Facetten des Labels gesucht. Zudem wird die GEO Label API mit einem Serverless Cloud-Computing Anbieter zur Verfügung gestellt, wobei die Frage beantwortet wird, welche Cloud-Ressourcen für die Labelgenerierung notwendig sind. English: The World Wide Web is developing into a semantic web. Data are semantically annotated using ontologies to make them available in a machine-readable form. These data include also sensor data. As the amount of sensor data made available online increases, it becomes more difficult for users to find suitable data. The new sensor descriptions in form of ontologies will be used in the following work to simplify the search for users again. The acquisition of information from the new machine-readable descriptions may be difficult for humans. Therefore, the GEO Label, originally developed by GeoViQua for GEOSS, is used to provide visual summaries of the availability of information in sensor descriptions. The label consists of facets that provide information about the availability of information in metadata documents such as producer profile, user feedback and quality information. In this thesis the 52 North implementation of the Java GEO label API is extended by the generation of labels based on the Semantic Sensor Network Ontology (SSNO) and related Linked Data Ontologies. Prior to this, information sources for the facets of the label are searched for from these ontologies. In addition, the GEO Label API will be made available with a serverless cloud computing provider, whereby the question will be answered which cloud resources are necessary for label generation.

Download Full-text

Feature Extraction Techniques Based on Swarm Intelligence in OCR

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l2480.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 13-19

Keyword(s):

Feature Extraction ◽

Character Recognition ◽

Optical Character Recognition ◽

Feature Vector ◽

Extraction Techniques ◽

Optical Character ◽

Readable Form ◽

Machine Readable ◽

Machine Readable Form ◽

Recognition Technique

Optical Character Recognition is a most recent field in area of pattern recognition and machine learning in last decade. In this article, the suitable techniques are designated for better character recognition in document into machine readable form. It is belonging with Content Based Image Retrieval (CBIR) system, which solve the delinquent of searching images in huge dataset. The recognition technique of handwritten character is not developed efficiently till, because of variations in size, shape, style, slats etc. in writing skill of human being. To overcome such problems, the part of concentration is feature extraction and algorithm that take care of such variation. In this paper independent component analysis is used for extracting features. For feature vector selection particle swarm optimization and firefly algorithms are applied. It is observed that due to distributed neighborhood pixel of an image, the PSO gives better recognition rates.

Download Full-text

Spatio Partitioning of Character Image for Automatic Recognition of Digits

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6650.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 8171-8177

Keyword(s):

Character Recognition ◽

Optical Character Recognition ◽

Fast Response ◽

Software Systems ◽

Decision Tree Classifier ◽

Optical Character ◽

Tree Classifier ◽

Spatially Distributed ◽

Machine Readable ◽

Machine Readable Form

In the running word, there is growing demand for the software systems to recognize characters in computer system when information is scanned through paper documents as we have number of newspapers and books which are in printed format related to different subjects the current capacity to translate paper documents quickly and accurately into machine readable form using optical character recognition technology augments the opportunities in document searching and storing as well as automated documents processing. A fast response in translating large collections of image-based electronic documents into structured electronics documents is still a problem. As an enhancement to the optical character recognition [1] (OCR) technology, I would like to propose a framework that recognize a printed digits in the character image using “spatio partitioning method”. The proposed system is efficiently recognize the digits from 0 to 9 different font size based on the new concept of feature extraction and which is classified under decision tree classifier, efficiency and time complexity of the proposed system also described. Partitioning is based on the pixel distribution of the character image; the pixel distribution describes the patter of the characters that is by spatially distributed foreground pixel.

Download Full-text

Speech Recognition using Multiscale Scattering of Audio Signals and Long Short-Term Memory of Neural Networks

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k2270.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 2955-2961

Keyword(s):

Speech Recognition ◽

Short Term Memory ◽

Numerical Data ◽

Audio Signals ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Lstm Network ◽

Machine Readable ◽

Machine Readable Form

Communication is one of the key elements of interaction. In order to understand the audio language used by humans, machines use different techniques to convert speech to machine readable form called speech recognition. This paper takes one of the most classic examples of the speech recognition domain, the spoken digit’s recognition. The recognition is done with the help of a technique called wavelet scattering that initially extracts useful information from the signals and sends this information further to a Long Short-Term Memory (LSTM) network to classify the signals. A major advantage of using the LSTM is that it overcomes the vanishing gradient problem and this proposed technique can be used in applications like entry of numerical data for blind people. This method provides an increased accuracy than other standard methods that uses Melfrequency Cepstral coefficients (MFFC) and LSTM network to recognize digits. The main objective of this work achieved its primary purpose to validate the efficiency of wavelet scattering technique and LSTM networks for spoken digits’ recognition

Download Full-text

Сlassification culture of the Russia

Scientific and Technical Libraries ◽

10.33186/1027-3689-2019-4-37-52 ◽

2019 ◽

pp. 37-52 ◽

Cited By ~ 1

Author(s):

E. R. Sukiasyan

Keyword(s):

School Libraries ◽

Classification Systems ◽

History Of ◽

Ancient Times ◽

High Level ◽

Complete Translation ◽

Machine Readable ◽

Machine Readable Form ◽

Decimal Classification ◽

Full Medium

Classification culture – a set of achievements of the country in the field of classification systems (CS): the study of their history and theory, the creation of own systems and the development of foreign experience, publications and options, practice of application. It is told about the achievements of Russia: works on the history of library CS, the directions of theoretical studies. For example, he CS history is well studied – from ancient times to the present day. Interesting CS were developed in the 18–19th century (for the largest and some university libraries). Schedules of the foreign CS – complete translation of the Colon Classification and the Dewey Decimal Classification into Russian was published. Published Russian UDC retranslation in 10 volumes. The country has CS, recognized by the international community as the National System of Russia (firstly published in the years 1961–1968 in the 30 books). Contemporary classification practice is distinguished by the presence of variants of schedules – full, medium and abridged, in book and machine-readable form. The experience of developing classification schedules for children's and school libraries is unique. The conclusion is made: Russia certainly has a high level of classification culture.

Download Full-text

machine readable formRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Method of the optical recognition of technical documentation and the transformation of graphic information into machine-readable form for cognitive analysis

Towards automation of operando experiments: A case study in contactless conductivity measurements

Improving Recommendations for Online Retail Markets Based on Ontology Evolution

Taking the Pulse of Hospitals’ Response to the New Price Transparency Rule

Corpus Linguistics and Eighteenth Century Collections Online (ECCO)

Ein Metadatenlabel für das semantische Sensorweb

Feature Extraction Techniques Based on Swarm Intelligence in OCR

Spatio Partitioning of Character Image for Automatic Recognition of Digits

Speech Recognition using Multiscale Scattering of Audio Signals and Long Short-Term Memory of Neural Networks

Сlassification culture of the Russia

machine readable form
Recently Published Documents