Dynamical entropic analysis of scientific concepts

In the present era of information, the problem of effective knowledge retrieval from a collection of scientific documents becomes especially important for continuous scientific progress. The information available in scientific publications traditionally consists of bibliometric metadata and its semantic component such as title, abstract and text. While the former having a machine-readable format usually used for knowledge mapping and pattern recognition, the latter designed for human interpretation and analysis. Only a few studies use full-text analysis, based on carefully selected scientific ontology, to map the actual structure of the scientific knowledge or uncover similarities between documents. Unfortunately, the presence of common (basic) concepts across semantically unrelated documents creates spurious connections between different topics. We revise the known method based on the entropic information-theoretic measure used for selecting basic concepts and propose to analyse the dynamics of Shannon entropy for more rigorous sorting of concepts by their generality.

Download Full-text

CFTR Lifecycle Map—A Systems Medicine Model of CFTR Maturation to Predict Possible Active Compound Combinations

International Journal of Molecular Sciences ◽

10.3390/ijms22147590 ◽

2021 ◽

Vol 22 (14) ◽

pp. 7590

Author(s):

Liza Vinhoven ◽

Frauke Stanke ◽

Sylvia Hafkemeyer ◽

Manuel Manfred Nietert

Keyword(s):

Large Scale ◽

Synergistic Effects ◽

Small Scale ◽

Systems Medicine ◽

Promising Candidate ◽

Cftr Mutations ◽

Machine Readable ◽

High Throughput Screens ◽

Readable Format ◽

Machine Readable Format

Different causative therapeutics for CF patients have been developed. There are still no mutation-specific therapeutics for some patients, especially those with rare CFTR mutations. For this purpose, high-throughput screens have been performed which result in various candidate compounds, with mostly unclear modes of action. In order to elucidate the mechanism of action for promising candidate substances and to be able to predict possible synergistic effects of substance combinations, we used a systems biology approach to create a model of the CFTR maturation pathway in cells in a standardized, human- and machine-readable format. It is composed of a core map, manually curated from small-scale experiments in human cells, and a coarse map including interactors identified in large-scale efforts. The manually curated core map includes 170 different molecular entities and 156 reactions from 221 publications. The coarse map encompasses 1384 unique proteins from four publications. The overlap between the two data sources amounts to 46 proteins. The CFTR Lifecycle Map can be used to support the identification of potential targets inside the cell and elucidate the mode of action for candidate substances. It thereby provides a backbone to structure available data as well as a tool to develop hypotheses regarding novel therapeutics.

Download Full-text

Componential analysis: the sememe and the concept of distinctiveness

The Canadian Journal of Linguistics / La revue canadienne de linguistique ◽

10.1017/s0008413100007519 ◽

1974 ◽

Vol 19 (1) ◽

pp. 1-17 ◽

Cited By ~ 10

Author(s):

Rodney Huddleston

Keyword(s):

Semantic Structure ◽

Sign Theory ◽

Semantic Domain ◽

Componential Analysis ◽

Semantic Component ◽

Separate Treatment ◽

Basic Concepts

The term ‘componential analysis’ is used here to refer to the theory of semantic structure developed by Goodenough (1956, 1965, etc.), Lounsbury (1956, 1964, etc.) and others. Obviously the notion of a semantic component - or ‘feature,’ or whatever other term is applied - is common to a wide variety of semantic theories, and ‘componential analysis’ is sometimes used (e.g. by Lyons 1968) to cover the whole of this wider field; nevertheless the Goodenough-Lounsbury theory is sufficiently unified and different from others to warrant separate treatment.Although the theory is intended to be, and undoubtedly is, much more widely applicable, a great deal of the descriptive work in componential analysis is in the field of kinship terminologies, and the basic concepts of the theory may conveniently be exemplified from this semantic domain. Drawing on the sign theory of Charles Morris, Goodenough distinguishes between the denotatum, designatum and significatum of a word (1965: 286 n.3):

Download Full-text

Optical Music Analysis for Printed Music Score and Handwritten Music Manuscript

Visual Perception of Music Notation ◽

10.4018/978-1-59140-298-5.ch004 ◽

2011 ◽

pp. 108-127 ◽

Cited By ~ 5

Author(s):

Kia Ng

Keyword(s):

Domain Knowledge ◽

Imaging System ◽

Data Representation ◽

Divide And Conquer ◽

Music Score ◽

Segmentation Approach ◽

Printed Music ◽

High Level ◽

Machine Readable ◽

Readable Format

This chapter describes an optical document imaging system to transform paper-based music scores and manuscripts into machine-readable format and a restoration system to touch-up small imperfections (for example broken stave lines and stems), to restore deteriorated master copy for reprinting. The chapter presents a brief background of this field, discusses the main obstacles, and presents the processes involved for printed music scores processing; using a divide-and-conquer approach to sub-segment compound musical symbols (e.g., chords) and inter-connected groups (e.g., beamed quavers) into lower-level graphical primitives (e.g., lines and ellipses) before recognition and reconstruction. This is followed by discussions on the developments of a handwritten manuscripts prototype with a segmentation approach to separate handwritten musical primitives. Issues and approaches for recognition, reconstruction and revalidation using basic music syntax and high-level domain knowledge, and data representation are also presented.

Download Full-text

Internet and Access to Scholarly Publications

Encyclopedia of E-Commerce, E-Government, and Mobile Commerce ◽

10.4018/978-1-59140-799-7.ch105 ◽

2011 ◽

pp. 653-659

Author(s):

Jean-Philippe Rennard

Keyword(s):

Scientific Knowledge ◽

Scientific Progress ◽

The Internet ◽

Digital Networks ◽

Cultural Goods ◽

Scientific Publications ◽

Sir Isaac Newton ◽

Isaac Newton ◽

Scholarly Publications

“If I have seen further it is by standing upon the shoulders of giants.” The famous statement of Sir Isaac Newton demonstrates that the progress of science relies on the dissemination of discoveries and scientific knowledge. Even though scientific progress is not strictly cumulative (Kuhn, 1970), information sharing is the heart of this progress. In the Gutenberg era, researchers had no alternative: Publishers were the only way to reach readers. The development of e-commerce and of digital networks led to the post-Gutenberg era, and offers a powerful alternative that can lead in the long term to a new organization of scientific publications (Harnad, 1999). As well as e-commerce is revolutionizing the distribution of cultural goods (particularly music), the distribution of scientific knowledge through the Internet should contribute to the emergence of a new economic model.

Download Full-text

A Design Tool for Business Process Design and Representation

Selected Readings on Information Technology Management ◽

10.4018/978-1-60566-092-9.ch010 ◽

2010 ◽

pp. 160-177

Author(s):

Roberto Paiano ◽

Anna Lisa Guido

Keyword(s):

Business Process ◽

Process Design ◽

Design Tool ◽

Formal Representation ◽

Middle Point ◽

Business Process Design ◽

Semantic Web Technology ◽

Web Information System ◽

Machine Readable ◽

Readable Format

In this chapter the focus is on business process design as middle point between requirement elicitation and implementation of a Web information system. We face both the problem of the notation to adopt in order to represent in a simple way the business process and the problem of a formal representation, in a machine-readable format, of the design. We adopt Semantic Web technology to represent process and we explain how this technology has been used to reach our goals.

Download Full-text

A Design Tool for Business Process Design and Representation

Electronic Business ◽

10.4018/978-1-60566-056-1.ch029 ◽

2009 ◽

pp. 451-468

Author(s):

Roberto Paiano ◽

Anna Lisa Guido

Keyword(s):

Business Process ◽

Process Design ◽

Design Tool ◽

Formal Representation ◽

Middle Point ◽

Business Process Design ◽

Semantic Web Technology ◽

Web Information System ◽

Machine Readable ◽

Readable Format

Download Full-text

Capturing Expert Arguments from Medical Adjudication Discussions in a Machine-readable Format

Companion Proceedings of The 2019 World Wide Web Conference on - WWW '19 ◽

10.1145/3308560.3317085 ◽

2019 ◽

Cited By ~ 3

Author(s):

Mike Schaekermann ◽

Graeme Beaton ◽

Minahz Habib ◽

Andrew Lim ◽

Kate Larson ◽

...

Keyword(s):

Machine Readable ◽

Readable Format ◽

Machine Readable Format

Download Full-text

IRS Announces Availability of EO Data in Machine-Readable Format

Bruce R Hopkins Nonprofit Counsel ◽

10.1002/npc.30226 ◽

2016 ◽

Vol 33 (8) ◽

pp. 6-6

Keyword(s):

Machine Readable ◽

Readable Format ◽

Machine Readable Format

Download Full-text

THE ROLE OF LITERARY WORKS IN THE FORMATION OF VALUE ORIENTATIONS IN ADOLESCENTS: PSYCHOLOGICAL ASPECT

SOCIETY INTEGRATION EDUCATION Proceedings of the International Scientific Conference ◽

10.17770/sie2019vol1.3948 ◽

2019 ◽

Vol 1 ◽

Author(s):

Yuliia Breus ◽

Roman Kozlov ◽

Tetiana Virchenko

Keyword(s):

Value Orientations ◽

Value System ◽

Literary Works ◽

Information Theoretic ◽

Specific Factors ◽

Sample Characteristics ◽

Basic Concepts ◽

Character Building ◽

Intensive Development

The article is devoted to the problem of developing values among adolescents, along with the significance of literature in this process. Adolescence with all its complexity and contradictions is a period of intensive development of the value system, which influences both character-building and personality in general. The authors of the study have identified fundamental characteristics of values and value perspectives of youth, and reasonably established specific factors in their formation while studying literary works. A teenager perceives life values only in personal, concrete embodiment. For this reason, literary works and their protagonists are rightly considered an effective way of passing on a set of values from generation to generation. The information theoretic unit of the article is complemented by the results of the empiric interdisciplinary research with basic concepts defined, sample characteristics and research methods given, and the outcomes of the experiment presented. Using M. Rokeach’s value list, the authors have defined the value system of modern adolescents, and analyzed the mechanisms of their formation through the curriculum pieces of Ukrainian literature.

Download Full-text

Auto-CORPus: Automated and Consistent Outputs from Research Publications

10.1101/2021.01.08.425887 ◽

2021 ◽

Author(s):

Yan Hu ◽

Shujian Sun ◽

Thomas Rowlands ◽

Tim Beck ◽

Joram Matthias Posma

Keyword(s):

Language Processing ◽

Association Studies ◽

Biomedical Literature ◽

Pubmed Central ◽

Manual Curation ◽

Genome Wide ◽

Ontology Extraction ◽

Table Data ◽

Machine Readable ◽

Readable Format

Motivation: The availability of improved natural language processing (NLP) algorithms and models enable researchers to analyse larger corpora using open source tools. Text mining of biomedical literature is one area for which NLP has been used in recent years with large untapped potential. However, in order to generate corpora that can be analyzed using machine learning NLP algorithms, these need to be standardized. Summarizing data from literature to be stored into databases typically requires manual curation, especially for extracting data from result tables. Results: We present here an automated pipeline that cleans HTML files from biomedical literature. The output is a single JSON file that contains the text for each section, table data in machine-readable format and lists of phenotypes and abbreviations found in the article. We analyzed a total of 2,441 Open Access articles from PubMed Central, from both Genome-Wide and Metabolome-Wide Association Studies, and developed a model to standardize the section headers based on the Information Artifact Ontology. Extraction of table data was developed on PubMed articles and fine-tuned using the equivalent publisher versions. Availability: The Auto-CORPus package is freely available with detailed instructions from Github at https://github.com/jmp111/AutoCORPus/.

Download Full-text