Neural embeddings of scholarly periodicals reveal complex disciplinary organizations

2021, Vol 7 (17), pp. eabb9004
Author(s): Hao Peng, Qing Ke, Ceren Budak, Daniel M. Romero, Yong-Yeol Ahn

Understanding the structure of knowledge domains is one of the foundational challenges in the science of science. Here, we propose a neural embedding technique that leverages the information contained in the citation network to obtain continuous vector representations of scientific periodicals. We demonstrate that our periodical embeddings encode nuanced relationships between periodicals and the complex disciplinary and interdisciplinary structure of science, allowing us to make cross-disciplinary analogies between periodicals. Furthermore, we show that the embeddings capture meaningful “axes” that encompass knowledge domains, such as an axis from “soft” to “hard” sciences or from “social” to “biological” sciences, which allow us to quantitatively ground periodicals on a given dimension. By offering novel quantification in the science of science, our framework may, in turn, facilitate the study of how knowledge is created and organized.
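The idea of grounding a periodical on an axis can be sketched as follows. This is a minimal illustration, not the authors' procedure, which the abstract does not specify: it assumes an axis is the difference between two pole vectors and that a periodical's position is its cosine similarity with that axis. The names `axis_score`, `soft_pole`, `hard_pole`, `journal_x` and `journal_y`, and all vector values, are hypothetical toy data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def axis_score(vec, pole_a, pole_b):
    """Position of `vec` on the axis running from pole_a to pole_b.

    Assumption: the axis is the vector pole_b - pole_a. Positive scores
    lean towards pole_b, negative scores towards pole_a.
    """
    axis = [b - a for a, b in zip(pole_a, pole_b)]
    return cosine(vec, axis)


# Hypothetical 2-dimensional embeddings, for illustration only.
embeddings = {
    "soft_pole": [1.0, 0.0],
    "hard_pole": [0.0, 1.0],
    "journal_x": [0.2, 0.9],
    "journal_y": [0.8, 0.3],
}

for name in ("journal_x", "journal_y"):
    score = axis_score(
        embeddings[name], embeddings["soft_pole"], embeddings["hard_pole"]
    )
    print(name, round(score, 3))
```

In this toy setup `journal_x` receives a positive score (closer to the "hard" pole) and `journal_y` a negative one. Real embeddings would have many more dimensions, and the poles would typically be derived from sets of periodicals rather than single vectors.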

2019, Vol 4 (2), pp. 79-92
Author(s): Zhesi Shen, Fuyou Chen, Liying Yang, Jinshan Wu

Purpose: To investigate the effectiveness of using node2vec on journal citation networks to represent journals as vectors for tasks such as clustering, science mapping, and measuring journal diversity. Design/methodology/approach: Node2vec is applied to a journal citation network to generate journal vector representations. Findings: (1) Journals are clustered based on the node2vec-trained vectors to form a science map. (2) The norm of a journal's vector can be seen as an indicator of the journal's diversity. (3) Using node2vec-trained journal vectors to compute the Rao-Stirling diversity measure leads to a better measure of diversity than using direct citation vectors. Research limitations: All analyses use citation data and focus only on the journal level. Practical implications: Node2vec-trained journal vectors embed rich information about journals, can be used to form a science map, and may generate better values of journal diversity measures. Originality/value: The effectiveness of node2vec in scientometric analysis is tested. Possible indicators for measuring journal diversity are presented.
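For readers unfamiliar with the Rao-Stirling measure, the sketch below computes it from category proportions and category vectors. It assumes the common form D = Σ_{i≠j} p_i · p_j · d_ij, summed over ordered pairs, and uses cosine distance between vectors as d_ij. Both the distance choice and the toy inputs are assumptions for illustration; the paper's exact setup is not given in the abstract.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rao_stirling(proportions, vectors):
    """Rao-Stirling diversity, summed over ordered pairs i != j.

    proportions: dict mapping category -> share of citations (sums to 1)
    vectors:     dict mapping category -> embedding vector
    """
    names = list(proportions)
    total = 0.0
    for i in names:
        for j in names:
            if i != j:
                d_ij = cosine_distance(vectors[i], vectors[j])
                total += proportions[i] * proportions[j] * d_ij
    return total


# Hypothetical example: a journal citing two unrelated categories equally.
shares = {"cat_a": 0.5, "cat_b": 0.5}
unrelated = {"cat_a": [1.0, 0.0], "cat_b": [0.0, 1.0]}
identical = {"cat_a": [1.0, 0.0], "cat_b": [1.0, 0.0]}

print(rao_stirling(shares, unrelated))  # maximally distant categories
print(rao_stirling(shares, identical))  # zero distance, zero diversity
```

Some authors sum over unordered pairs (i < j) instead, which halves the value; the ranking of journals is unaffected by that choice.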


2020, Vol 125 (3), pp. 2915-2954
Author(s): Christin Katharina Kreutz, Premtim Sahitaj, Ralf Schenkel

Identifying important works and assessing the importance of publications in vast scientific corpora are challenging yet common tasks in many research projects. While the influence of citations in finding seminal papers has been analysed thoroughly, citation-based approaches come with several problems. Their impracticality for new publications that have not yet received any citations, area-dependent citation practices and differing reasons for citing are only a few of their drawbacks. Methods relying on more than citations, for example semantic features such as words or topics contained in the publications of a citation network, have received less attention despite providing promising preliminary results. In this work we tackle the issue of classifying publications, together with their referenced and citing papers, as seminal, survey or uninfluential by utilising semantometrics. We use distance measures over words, semantics, topics and publication years of papers in their citation network to engineer features on which we predict the class of a publication. We present the SUSdblp dataset consisting of 1980 labelled entries to provide a means of evaluating this approach. A classification accuracy of up to .9247 was achieved when combining multiple types of features using semantometrics. This is +.1232 compared to the current state of the art (SOTA), which uses binary classification to identify papers from the classes seminal and survey. The utilisation of one-vector representations for the ternary classification task resulted in an accuracy of .949, which is +.1475 compared to the binary SOTA. Classification based on information available at publication time derived with semantometrics resulted in an accuracy of .8152, while an accuracy of .9323 could be achieved when using one-vector representations.


1956, Vol 50 (4), pp. 961-979
Author(s): Harold D. Lasswell

My intention is to consider political science as a discipline and as a profession in relation to the impact of the physical and biological sciences and of engineering upon the life of man. I propose to inquire into the possible reconciliation of man's mastery over Nature with freedom, the overriding goal of policy in our body politic. In the interest of concreteness I shall have something to say about past and potential applications of science in three areas: armament, production, and evolution. It is trite to acknowledge that for years we have lived in the afterglow of a mushroom cloud and in the midst of an arms race of unprecedented gravity. Here I shall support a proposition that may at first evoke some incredulous exclamations. The proposition is that our intellectual tools have been sufficiently sharp to enable political scientists to make a largely correct appraisal of the consequences of unconventional weapons for world politics.


2021
Author(s): Alfredo Silva, Marcelo Mendoza

Word embeddings are vital descriptors of words in unigram representations of documents for many tasks in natural language processing and information retrieval. Representing queries has been one of the most critical challenges in this area because a query consists of only a few terms and has little descriptive capacity. Strategies such as averaging word embeddings can enrich a query's descriptive capacity, since they favor the identification of related terms from the continuous vector representations that characterize these approaches. We propose a data-driven strategy to combine word embeddings. We use IDF combinations of embeddings to represent queries, showing that these representations outperform the average word embeddings recently proposed in the literature. Experimental results on benchmark data show that our proposal performs well, suggesting that data-driven combinations of word embeddings are a promising line of research in ad-hoc information retrieval.
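The contrast between a plain average and an IDF-weighted combination can be sketched as below. The abstract does not state the exact weighting scheme, so this assumes the textbook form idf(t) = log(N / df(t)) and a weighted mean of term vectors. The function names, the three-document corpus and the 2-dimensional vectors are all hypothetical.

```python
import math


def idf(term, documents):
    """Inverse document frequency log(N / df); 0.0 if the term is unseen."""
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df) if df else 0.0


def average_embedding(terms, vectors):
    """Unweighted mean of the vectors of the query terms that have one."""
    known = [t for t in terms if t in vectors]
    if not known:
        raise ValueError("no query term has an embedding")
    dim = len(vectors[known[0]])
    return [sum(vectors[t][k] for t in known) / len(known) for k in range(dim)]


def idf_weighted_embedding(terms, vectors, documents):
    """IDF-weighted mean of term vectors (assumed scheme, see lead-in)."""
    known = [t for t in terms if t in vectors]
    if not known:
        raise ValueError("no query term has an embedding")
    weights = [idf(t, documents) for t in known]
    total = sum(weights)
    if total == 0.0:
        # Every term occurs in every document: fall back to the plain mean.
        return average_embedding(known, vectors)
    dim = len(vectors[known[0]])
    return [
        sum(w * vectors[t][k] for t, w in zip(known, weights)) / total
        for k in range(dim)
    ]


# Hypothetical corpus (documents as term sets) and toy embeddings.
documents = [{"the", "citation", "network"}, {"the", "query"}, {"the", "journal"}]
vectors = {"the": [1.0, 0.0], "citation": [0.0, 1.0]}
query = ["the", "citation"]

plain = average_embedding(query, vectors)
weighted = idf_weighted_embedding(query, vectors, documents)
print(plain, weighted)
```

Because "the" occurs in every document its IDF is zero, so the weighted query vector is determined entirely by "citation", whereas the plain average gives both terms equal influence.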


Author(s): Anna I. Radchenko, Natalia V. Koval

This is the third article in a series of publications on the monopolization level of the scientific periodicals of the National Academy of Sciences (NAS) of Ukraine. It is devoted to the journals of the Section of Physical, Engineering and Mathematical Sciences. The paper completes the first stage of research aimed at revealing the breadth of the authors’ community of the NAS of Ukraine journals during 2015–2017. As in the previous articles of this series, the authors’ community and its monopolization level are estimated by calculating the Herfindahl–Hirschman Index; to this end, the quantitative weighted distribution of affiliations is computed for each journal. The journals of the Section of Social Sciences and Humanities and of the Section of Chemical and Biological Sciences were analyzed in articles previously published in the “Herald of the National Academy of Sciences of Ukraine” (2018, No. 9 and 2019, No. 10). It was shown that the chemical and biological journals of the NAS of Ukraine have the lowest monopolization level: 23% of them are moderately monopolized and 27% are low-monopolized. Meanwhile, only 19% of the journals of the Section of Social Sciences and Humanities have a moderate monopolization level, and the rest have a high one. Journals of the Section of Physical, Engineering and Mathematical Sciences are closer to the chemical and biological ones: most have a high monopolization level (54%), while 17% are moderately monopolized and 25% are low-monopolized.
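As a reminder of how the Herfindahl–Hirschman Index works, the sketch below computes it from per-affiliation author counts for a single journal. It uses the standard definition (the sum of squared shares, with shares expressed as fractions). The affiliation names and counts are hypothetical, and the thresholds the authors use to label a journal as low, moderately or highly monopolized are not stated in the abstract, so none are applied here.

```python
def herfindahl_hirschman(counts):
    """HHI as the sum of squared shares; ranges from 1/n to 1.0.

    counts: dict mapping affiliation -> number of (weighted) authorships
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return sum((c / total) ** 2 for c in counts.values())


# Hypothetical journals: one dominated by a single institute, one balanced.
concentrated = {"Institute A": 8, "Institute B": 1, "Institute C": 1}
balanced = {"Institute A": 1, "Institute B": 1, "Institute C": 1, "Institute D": 1}

print(round(herfindahl_hirschman(concentrated), 2))
print(round(herfindahl_hirschman(balanced), 2))
```

A value near 1 means almost all authors share one affiliation, while a value near 1/n means the n affiliations contribute equally.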


2018, Vol 111 (2), pp. 445
Author(s): Franklyn Da Cruz LIMA, Andressa Juliana Almeida SIMÕES, Isabela Maria Monteiro VIEIRA, Daniel Pereira SILVA, Denise Santos RUZENE

Industrial food production generates a large amount of waste. This waste must be taken to a suitable location where it can be further processed. During industrial processing of pineapple, about 50 % of the mass of the fruit is discarded as residue. Researchers have studied these residues in order to add value to the by-products, reduce disposal costs and guarantee environmental sustainability. This work investigates the development of research on agroindustrial pineapple residues using bibliometric methods, exploring the structure of knowledge in this field over the years according to year of publication, periodical, country, author, area of knowledge, institution, keywords, subject type, and citation analysis. In total, 927 articles were found; after careful analysis and selection, 364 articles remained, of which 82 % were published in the last decade alone. Most studies focused on agricultural and biological sciences. About 1183 authors from 50 different countries contributed to this subject, with India having the largest number of publications. The results of this study, which highlight the different uses for pineapple residues, can provide valuable information for researchers interested in the field of agroindustrial wastes.


Author(s): Stanisław Purgał, Julian Parsert, Cezary Kaliszyk

Applying machine learning to mathematical terms and formulas requires a representation of formulas that is suitable for AI methods. In this paper, we develop an encoding that allows logical properties to be preserved and is additionally reversible. This means that the tree shape of a formula, including all symbols, can be reconstructed from the dense vector representation. We do that by training two decoders: one that extracts the top symbol of the tree and one that extracts embedding vectors of subtrees. The syntactic and semantic logical properties that we aim to preserve include structural formula properties, applicability of natural deduction steps, and even more complex operations like unifiability. We propose datasets that can be used to train these syntactic and semantic properties. We evaluate the viability of the developed encoding across the proposed datasets as well as for the practical theorem proving problem of premise selection in the Mizar corpus.


Author(s): D. C. Brindley, M. McGill

Morphological and cytochemical studies of platelets have reported a surface coat, or glycocalyx, external to the plasma membrane (1). Biochemical analyses have likewise confirmed the highly adsorptive properties of platelets as transporters of coagulation factors (2). However, visualization of the platelet membrane by conventional EM procedures does not reflect this special relationship between the platelet and its plasma environment. By the routine method of alcohol-propylene oxide dehydration for Epon embedding, the lipid bilayer nature of the platelet membrane appears similar to other blood cells (Fig. 1). A new rapid embedding technique using dimethoxypropane (DMP) as dehydrating agent (13) has permitted ultrastructural analyses of the surface features of the platelet-plasma interface. Aliquots of human or rabbit platelet-rich plasma (PRP) were added to equal volumes of 6% glutaraldehyde in Millonig's buffer at 37° for 45 minutes, rinsed in buffer and postfixed in 1% osmium in Millonig's buffer for 45 minutes.

