Identification of argumentative sentences in Russian scientific and popular science texts

Abstract: In this study we analyze the applicability of specific machine learning algorithms to the task of detecting sentences containing argumentation in Russian text. We use a collection of scientific and popular science texts with manually annotated argumentation to evaluate the quality of identifying argumentative sentences in terms of precision, recall, and F-measure. The experiment involves three algorithms: MNB, SVM, and MLP. Texts are represented with the bag-of-words model, and the lemmas of words in the analyzed sentences serve as classification features. Informative features are selected automatically according to the Variance and χ² criteria, combined with weight-based filtering of lemmas (via TF*IDF and EMI). The training set includes around 800 sentences, while the test set contains 180. The MNB algorithm demonstrates the highest F-measure and recall on almost all feature sets (the maximum values are 68.7% and 89%, respectively), while the MLP algorithm shows the best precision for about half of the feature selection variants (the maximum value is 72.5%).
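As a rough illustration of the setup described in this abstract, the sketch below trains a multinomial Naive Bayes (MNB) classifier on bag-of-words lemma lists and computes precision, recall, and F-measure. It is not the authors' implementation: the English toy lemmas, the labels, the Laplace smoothing parameter `alpha`, and all function names are assumptions made for the example, and the feature selection steps (Variance, χ², TF*IDF, EMI) are omitted.

```python
import math
from collections import Counter


def train_mnb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on lists of lemmas (bag of words)."""
    vocab = {tok for doc in docs for tok in doc}
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    n_docs = Counter(labels)
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    model = {"vocab": vocab, "prior": {}, "loglik": {}}
    for c in classes:
        model["prior"][c] = math.log(n_docs[c] / len(docs))
        # Laplace smoothing: alpha is an assumed hyperparameter.
        total = sum(counts[c].values()) + alpha * len(vocab)
        model["loglik"][c] = {
            t: math.log((counts[c][t] + alpha) / total) for t in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; unknown lemmas are skipped."""
    scores = {}
    for c, prior in model["prior"].items():
        scores[c] = prior + sum(
            model["loglik"][c][t] for t in doc if t in model["vocab"]
        )
    return max(scores, key=scores.get)


def precision_recall_f1(gold, pred, positive=1):
    """Precision, recall, and F-measure for the positive (argumentative) class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: 1 = argumentative sentence, 0 = non-argumentative.
train_docs = [
    ["therefore", "result", "confirm"],
    ["because", "data", "show"],
    ["sample", "be", "collect"],
    ["figure", "present", "map"],
]
train_labels = [1, 1, 0, 0]
model = train_mnb(train_docs, train_labels)

test_docs = [["because", "result"], ["sample", "map"]]
predictions = [predict(model, d) for d in test_docs]
print(predictions, precision_recall_f1([1, 0], predictions))
```

In a fuller pipeline, a lemmatizer would produce the token lists and a feature selection step would prune the vocabulary before training.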

How Sure Can We Be about ML Methods-Based Evaluation of Compound Activity: Incorporation of Information about Prediction Uncertainty Using Deep Learning Techniques

Molecules ◽

10.3390/molecules25061452 ◽

2020 ◽

Vol 25 (6) ◽

pp. 1452

Author(s):

Igor Sieradzki ◽

Damian Leśniak ◽

Sabina Podlewska

Keyword(s):

Machine Learning ◽

Training Set ◽

Activity Prediction ◽

Prediction Uncertainty ◽

Screening Experiments ◽

Learning Techniques ◽

Model Training ◽

Short Time ◽

Selection Of

A great variety of computational approaches support drug design, helping in the selection of new, potentially active compounds and in the optimization of their physicochemical and ADMET properties. Machine learning is a group of methods able to evaluate enormous amounts of data in a relatively short time. However, the quality of machine-learning-based prediction depends on the data supplied for model training. In this study, we used deep neural networks for the task of compound activity prediction and developed dropout-based approaches for estimating prediction uncertainty. Several types of analyses were performed: we examined the relationships between the prediction error, similarity to the training set, prediction uncertainty, and the number and standard deviation of activity values. We tested whether incorporating information about prediction uncertainty influences compound ranking based on predicted activity, and we used prediction uncertainty to search for potential errors in the ChEMBL database. The results indicate that incorporating information about the uncertainty of compound activity prediction can be of great help during virtual screening experiments.
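The general idea of a dropout-based uncertainty estimate can be sketched as follows: keep dropout active at prediction time, run many stochastic forward passes, and treat the spread of the outputs as the uncertainty. The abstract does not give the authors' architecture or procedure, so this is only a minimal illustration under stated assumptions: a single linear unit stands in for a deep network, and the weights, inputs, dropout rate `p_drop`, and function names are all hypothetical.

```python
import random
import statistics


def forward_with_dropout(x, weights, bias, p_drop, rng):
    """One stochastic forward pass of a linear unit with dropout on its inputs."""
    keep = 1.0 - p_drop
    total = bias
    for xi, wi in zip(x, weights):
        if rng.random() < keep:
            # Inverted dropout: rescale kept inputs so the expected output is unchanged.
            total += wi * xi / keep
    return total


def mc_dropout_predict(x, weights, bias, p_drop=0.2, n_samples=200, seed=0):
    """Return (mean prediction, standard deviation) over repeated dropout passes."""
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(x, weights, bias, p_drop, rng)
        for _ in range(n_samples)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical descriptor vector and model parameters.
x = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]
bias = 0.1

mean_pred, uncertainty = mc_dropout_predict(x, weights, bias, p_drop=0.5)
print(f"predicted activity ~ {mean_pred:.2f}, uncertainty (std) ~ {uncertainty:.2f}")
```

A ranking that accounts for uncertainty could then, for example, penalize compounds whose standard deviation is large relative to their predicted activity; the specific ranking rule used in the study is not stated in the abstract.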

Speech methods of representation of scientific knowledge as a tool for increasing reader interest to popular science texts (based on the material of online versions of journals)

E3S Web of Conferences ◽

10.1051/e3sconf/202127311029 ◽

2021 ◽

Vol 273 ◽

pp. 11029

Author(s):

Yana Kosyakova

Keyword(s):

Qualitative Analysis ◽

Scientific Knowledge ◽

Popular Science ◽

Science Texts ◽

Continuous Sampling ◽

Science Discourse ◽

Modern Media ◽

Speech Patterns ◽

Selection Of

The purpose of this work is to: 1) identify, study and analyze speech methods of updating scientific knowledge as a tool for influencing the reader's consciousness; 2) identify potential criteria for increasing the audience's interest in the presented scientific knowledge in the aspect of popular science discourse on the example of popular science articles from selected journals for analysis; 3) describe the influencing potential of these speech methods of presenting knowledge to the addressee. Methodology. The influencing potential of media sources that increase the interest of the readership is revealed through a series of studies describing the factors and methods of popularizing scientific knowledge in modern media on the basis of intersecting discourses (social-political, pedagogical, medical, etc.). The research is also based on the method of continuous sampling in the selection of practical material, the method of quantitative and qualitative analysis. The article substantiates the most effective and frequent speech patterns.

Randomized SMILES strings improve the quality of molecular generative models

Journal of Cheminformatics ◽

10.1186/s13321-019-0393-0 ◽

2019 ◽

Vol 11 (1) ◽

Cited By ~ 22

Author(s):

Josep Arús-Pous ◽

Simon Viet Johansson ◽

Oleksii Prykhodko ◽

Esben Jannik Bjerrum ◽

Christian Tyrchan ◽

...

Keyword(s):

Recurrent Neural Networks ◽

Chemical Space ◽

Cell Types ◽

Generative Models ◽

The Other ◽

Probability Models ◽

Training Set ◽

String Representation ◽

Almost All

Abstract: Recurrent Neural Networks (RNNs) trained with a set of molecules represented as unique (canonical) SMILES strings have shown the capacity to create large chemical spaces of valid and meaningful structures. Herein we perform an extensive benchmark on models trained with subsets of GDB-13 of different sizes (1 million, 10,000 and 1000), with different SMILES variants (canonical, randomized and DeepSMILES), with two different recurrent cell types (LSTM and GRU) and with different hyperparameter combinations. To guide the benchmarks, new metrics were developed that define how well a model has generalized the training set. The generated chemical space is evaluated with respect to its uniformity, closedness and completeness. Results show that models that use LSTM cells trained with 1 million randomized SMILES, a non-unique molecular string representation, are able to generalize to larger chemical spaces than the other approaches and represent the target chemical space more accurately. Specifically, a model trained with randomized SMILES was able to generate almost all molecules from GDB-13 with a quasi-uniform probability. Models trained with smaller samples show an even bigger improvement when trained with randomized SMILES. Additionally, models were trained on molecules obtained from ChEMBL, illustrating again that training with randomized SMILES leads to models with a better representation of the drug-like chemical space. Namely, the model trained with randomized SMILES was able to generate at least double the number of unique molecules with the same distribution of properties compared to one trained with canonical SMILES.

REBRANDING PRODUK KERIPIK JAMUR TIRAM UNTUK PENINGKATAN PENJUALAN PADA UMKM SPORAMUSHROOM

Jurnal Pengabdian Kepada Masyarakat MEMBANGUN NEGERI ◽

10.35326/pkm.v4i1.549 ◽

2020 ◽

Vol 4 (1) ◽

pp. 77-83

Author(s):

Andi Iva Mundiyah ◽

Dudi Septiadi ◽

Sharfina Nabila ◽

Ni Made Wirastika Sari ◽

Ni Made Zeamita

Keyword(s):

Pearl Oyster ◽

Oyster Mushroom ◽

Brand Marketing ◽

Medium Enterprise ◽

Oyster Mushrooms ◽

The Right ◽

Almost All ◽

Product Sales ◽

Selection Of

Abstract: The Small-Medium Enterprise (SME) “Sporamushroom”, which processes pearl-oyster mushrooms into pearl-oyster mushroom chips, is located on Jalan Pelita, Makassar City. Pearl-oyster mushrooms are rich in nutrients and have a savory taste and chicken-like texture, so almost everyone likes them. The problem faced by SME “Sporamushroom” lies in the packaging of the mushroom chips, which is unattractive and unable to preserve the quality of the product. In addition, a brand for the mushroom chips had not been determined. The results of the activities carried out indicate the need for assistance and information sharing about the types of packaging for processed chips, so that suitable packaging can be produced, namely aluminum plastic packaging appropriate for processed chip products. From the brand aspect, the selection of the right, easy-to-remember brand has an effect on product sales. The JAMBUL brand was chosen for the pearl-oyster mushroom chips because it is easy to remember and has an appropriate philosophy behind it. Key words: brand, marketing, packaging, SME

PROSES REKRUTMEN POLITIK CALON LEGISLATIF LOKAL DI MEDAN PADA PEMILU 2009

PERSPEKTIF ◽

10.31289/perspektif.v2i1.102 ◽

2016 ◽

Vol 2 (1) ◽

Author(s):

Fernanda Putra Adela

Keyword(s):

Succession Planning ◽

Political Life ◽

Graduate Degree ◽

Legislative Elections ◽

Recruitment Process ◽

Postgraduate Level ◽

The Face ◽

Almost All ◽

Selection Of

This research aimed to examine the recruitment of legislative candidates in more detail by analyzing the processes and mechanisms used by the Prosperous Justice Party (PKS) ahead of the 2009 legislative elections in Medan. The study shows that the quality of PKS candidates is fairly good in terms of education: almost all PKS legislative candidates hold a graduate degree, and some have completed the postgraduate level. PKS has taken part in two general elections since changing its name, so its legislative candidates are reasonably experienced, especially since some of them were also members of the previous legislature. Individually, PKS legislative candidates are not especially popular, but PKS as an institution enjoys high popularity in the community. The PKS candidate recruitment process is closed, which tends to make PKS more exclusive. This stems from efforts to preserve the party's Islamic ideology through good succession planning: the selection of candidates is highly selective, as a form of consistency with the party's ideology, and the candidates are regarded as cadres capable of keeping PKS a da’wah party in the political life of the state.

Use of Machine Learning in the Pattern Finding

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a1237.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 527-531

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Machine Learning Algorithms ◽

Training Set ◽

Learning Tasks ◽

Sample Data ◽

Approximate Result ◽

Generate Model ◽

Pattern Finding

This is the era of Machine Learning and Artificial Intelligence. Machine Learning is a field of scientific study that uses statistical models to predict answers to questions that have never been asked before. Machine Learning algorithms use a large quantity of sample data to generate a model. A larger and higher-quality training set leads to higher accuracy in the approximate result. ML is a highly popular research field and is also helpful in pattern finding, artificial intelligence and data analysis. In this paper we explain the basic concept of Machine Learning and its various types of methods. These methods can be used according to the user's requirements. Machine Learning tasks are divided into various categories. These tasks are accomplished by a computer system without being explicitly programmed.

Research platform for the study of argumentation in popular science discourse

Ontology of Designing ◽

10.18287/2223-9537-2020-10-4-489-502 ◽

2020 ◽

Vol 10 (4) ◽

pp. 489-502

Author(s):

E.A. Sidorova ◽

I.R. Akhmadeeva ◽

Yu.A. Zagorulko ◽

A.S. Sery ◽

...

Keyword(s):

Popular Science ◽

Russian Language ◽

The Body ◽

Science Texts ◽

Science Discourse ◽

Text Corpora ◽

Research Platform ◽

Analytical Tools ◽

Web Platform ◽

Selection Of

The paper discusses a software system designed to support the study of argumentation in Russian-language popular science texts. This system is based on an ontology built on modern principles of argumentation modeling. In particular, this ontology contains formal descriptions of typical reasoning schemes that are used for annotating texts, analyzing the arguments presented in them, and assessing their persuasiveness relative to a given audience. A method of argumentative markup of a text is proposed, which identifies statements and builds an argumentation graph from them using knowledge about typical reasoning schemes. The paper also describes a set of web tools that support the creation of thematic corpora, visualization of the argumentation ontology used, construction of the argumentation graph, identification of argumentation indicators in the texts, and search for various entities in the text corpora in ontology terms. The analytical tools include means for collecting statistical information on the occurrence of typical elements of argumentation in the body of texts, for studying argumentation indicators, and for analyzing the persuasiveness of argumentation. The novelty of the work consists in the development of an original methodology for studying argumentation in popular science discourse, based on the ontology of argumentation and supported by a specialized web platform.

The problem of the quality of popular science texts in the media

Вопросы теории и практики журналистики ◽

10.17150/2308-6203.2016.5(2).233-246 ◽

2016 ◽

Vol 5 (2) ◽

pp. 233-246

Author(s):

Татьяна Фролова ◽

Софья Суворова ◽

Даниил Ильченко ◽

Анастасия Бугаева

Keyword(s):

Popular Science ◽

Science Texts ◽

The Media

Seizure Forecasting and the Preictal State in Canine Epilepsy

International Journal of Neural Systems ◽

10.1142/s0129065716500465 ◽

2016 ◽

Vol 27 (01) ◽

pp. 1650046 ◽

Cited By ~ 15

Author(s):

Yogatheesan Varatharajah ◽

Ravishankar K. Iyer ◽

Brent M. Berry ◽

Gregory A. Worrell ◽

Benjamin H. Brinkmann

Keyword(s):

Machine Learning ◽

Epileptic Seizures ◽

Machine Learning Algorithms ◽

Challenging Problem ◽

Multiple Features ◽

Feature Sets ◽

Machine Learning Methods ◽

Patients With Epilepsy ◽

Preictal State

The ability to predict seizures may enable patients with epilepsy to better manage their medications and activities, potentially reducing side effects and improving quality of life. Forecasting epileptic seizures remains a challenging problem, but machine learning methods using intracranial electroencephalographic (iEEG) measures have shown promise. A machine-learning-based pipeline was developed to process iEEG recordings and generate seizure warnings. Results support the ability to forecast seizures at rates greater than a Poisson random predictor for all feature sets and machine learning algorithms tested. In addition, subject-specific neurophysiological changes in multiple features are reported preceding lead seizures, providing evidence supporting the existence of a distinct and identifiable preictal state.

Selection of formwork systems for arrangement of monolithic columns according to the method of integer rating of laborability and process duration

WAYS TO IMPROVE CONSTRUCTION EFFICIENCY ◽

10.32347/2707-501x.2021.47(1).96-107 ◽

2021 ◽

Vol 1 (47) ◽

pp. 96-107

Author(s):

G. Tonkacheev ◽

V. Tonkacheev ◽

K. Nosach

Keyword(s):

Concrete Columns ◽

Construction Equipment ◽

Monolithic Columns ◽

Cross Sectional ◽

Reasonable Choice ◽

Almost All ◽

Monolithic Structures ◽

Structural Versatility ◽

Selection Of

The relevance of this article is related to the problem of standardizing technological processes that involve construction equipment for concrete work. Almost all construction processes are performed using construction equipment. The arrangement of monolithic structures is accompanied by the installation and dismantling of formwork. The existing standard time norms do not allow a reasonable choice among options for equipping construction processes: almost the same time norms are used for all possible variants of column formwork [1]. The article compares several formwork options for the installation of monolithic reinforced concrete columns of frame structures. To determine the duration and complexity of the process, the method of integer rationing [2] was used, which makes it possible to take into account even minor changes in the formwork. Calculation by this method makes it possible to select the most effective formwork options. The technique is based on an analysis of the number of actions and the responsibility of these actions with respect to the quality of the process and its reliability. Any construction equipment is characterized by structural and technological versatility, which also affects the choice of options under the conditions of the process, so the article analyzes these factors in terms of quality. If the structure contains columns with different cross-sectional dimensions or heights, preference is given to formwork options with greater structural versatility. The material of the article opens a whole direction for further research on construction equipment in other processes.
