Dynamical entropic analysis of scientific concepts
In the present information era, effective knowledge retrieval from collections of scientific documents is especially important for continued scientific progress. The information available in a scientific publication traditionally consists of bibliometric metadata and a semantic component, such as the title, abstract and text. While the former has a machine-readable format and is usually used for knowledge mapping and pattern recognition, the latter is designed for human interpretation and analysis. Only a few studies use full-text analysis, based on a carefully selected scientific ontology, to map the actual structure of scientific knowledge or to uncover similarities between documents. Unfortunately, the presence of common (basic) concepts across semantically unrelated documents creates spurious connections between different topics. We revise a known method for selecting basic concepts, which is based on an entropic information-theoretic measure, and propose analysing the dynamics of Shannon entropy to sort concepts more rigorously by their generality.