Multi-label classification of feedbacks

This work deals with educational text mining, a field of natural language processing applied to education. The objective is to classify the feedback that teachers in online courses give on activities submitted by students, according to the model of Hattie and Timperley (2007), in which feedback may be at the task, process, regulation, praise, or other level. Four multi-label classification methods of the data transformation approach (binary relevance, classifier chains, label powerset and RAkEL-d) are compared using the base algorithms SVM, Random Forest, Logistic Regression and Naive Bayes. The methodology was applied to a case study in which 11,013 feedback messages written in Spanish were collected from the Blackboard learning management system, covering 121 online courses of the Law degree at a public university in Mexico. The results show that the random forest and support vector machine algorithms perform best when combined with the binary relevance and classifier chains transformation methods.
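
As a rough illustration of two of the data transformation approaches named above, the sketch below turns multi-label targets into the single-label problems a base classifier would then be trained on. The label sets are invented for illustration and the function names are hypothetical; this is not the authors' code or data.

```python
# Feedback levels follow the abstract; the example label sets are made up.
LEVELS = ["task", "process", "regulation", "praise", "other"]


def binary_relevance_targets(label_sets, levels=LEVELS):
    """Binary relevance: one independent 0/1 target vector per label."""
    return {lvl: [1 if lvl in s else 0 for s in label_sets] for lvl in levels}


def label_powerset_targets(label_sets):
    """Label powerset: each distinct label combination becomes one class id."""
    class_ids = {}
    targets = []
    for s in label_sets:
        key = tuple(sorted(s))
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # next unused class id
        targets.append(class_ids[key])
    return targets, class_ids


# Hypothetical annotations for four feedback messages.
example_labels = [
    {"task", "praise"},
    {"process"},
    {"task", "praise"},
    {"regulation", "task"},
]

br = binary_relevance_targets(example_labels)
lp_targets, lp_classes = label_powerset_targets(example_labels)
```

Binary relevance yields one binary problem per level and ignores which labels occur together, whereas label powerset yields a single multi-class problem with one class per observed combination. Classifier chains extend binary relevance by passing the predictions for earlier labels as extra features to later classifiers.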

Identification of Heracleum sosnowskyi-Invaded Land Using Earth Remote Sensing Data

Sustainability ◽

10.3390/su12030759 ◽

2020 ◽

Vol 12 (3) ◽

pp. 759

Author(s):

Jūratė Sužiedelytė Visockienė ◽

Eglė Tumelienė ◽

Vida Maliene

Keyword(s):

Remote Sensing ◽

Invasive Plant ◽

Remote Sensing Data ◽

Unsupervised Classification ◽

Economic Losses ◽

Classification Methods ◽

Earth Remote Sensing ◽

Heracleum Sosnowskyi

H. sosnowskyi (Heracleum sosnowskyi) is a plant that is widespread in Lithuania and other countries and causes many problems. The damage caused by the plant is many-sided: it threatens the biodiversity of the land, poses a risk to human health, and causes considerable economic losses. To find effective and comprehensive measures against this invasive plant, it is very important to identify the places and areas where H. sosnowskyi grows, carry out a detailed analysis, and monitor its spread rather than leave the process to chance. In this paper, a remote sensing methodology is proposed to identify territories covered with H. sosnowskyi plants (land classification). Two categories of land cover classification were used: supervised (human-guided) and unsupervised (calculated by software). In the supervised step, the average wavelength of the spectrum of H. sosnowskyi was calculated for the classification of the RGB image; the unsupervised classification by the program was then carried out on this basis. Combining both classification methods in steps gave better results than using either one alone. The application of the authors' proposed methodology is demonstrated in a Lithuanian case study discussed in this paper.

Morphological classification of conspecific birds from closely situated breeding areas – A case study of the Common Nightingale

Ornis Hungarica ◽

10.1515/orhu-2015-0011 ◽

2015 ◽

Vol 23 (2) ◽

pp. 20-30 ◽

Cited By ~ 1

Author(s):

Dávid Kováts ◽

Andrea Harnos

Keyword(s):

Environmental Parameters ◽

Classification Methods ◽

Morphological Comparison ◽

Linear Discriminant ◽

The North ◽

North Eastern ◽

Morphological Differences ◽

Bill Length

In this paper, a complex morphological comparison of four Common Nightingale (Luscinia megarhynchos) groups is presented. In total, 121 territorial nightingales were mist-netted and measured individually in four study areas called ‘Bódva’, ‘Felső-Tisza’, ‘Szatmár-Bereg’ and ‘Bátorliget’ in the north-eastern part of Hungary in 2006–2013. To distinguish groups by morphology, Classification and Regression Trees (CART), Random Forest (RF) and Linear Discriminant Analysis (LDA) methods were used. Comparison of the four studied Common Nightingale groups shows substantial morphological differences in the length of the second, third and fourth primaries (P2, P3, P4), in bill length (BL) and in bill width (BW), while other characteristics showed greater similarities. Based on the results of all the applied classification methods, birds originating from Szatmár-Bereg were clearly distinguishable from the others. The differences in morphology can be explained by interspecific competition or by phenotypic plasticity resulting from changes in ecological and environmental parameters. Our case study highlights the advantageous differences of the classification methods in distinguishing groups with similar morphology and in choosing important variables for classification. In conclusion, broad application of the classification methods RF and CART is highly recommended in comparative ecological studies.

Automatic Classification of Open-Ended Questions: Check-All-That-Apply Questions

Social Science Computer Review ◽

10.1177/0894439319869210 ◽

2019 ◽

pp. 089443931986921 ◽

Cited By ~ 1

Author(s):

Matthias Schonlau ◽

Hyukjun Gweon ◽

Marika Wenemark

Keyword(s):

Learning Algorithm ◽

Civil Disobedience ◽

Automatic Classification ◽

Average Error ◽

Data Sets ◽

Statistical Machine Learning ◽

Binary Relevance ◽

Label Correlations ◽

Classifier Chains

Text data from open-ended questions in surveys are challenging to analyze and are often ignored. Open-ended questions are important, though, because they do not constrain respondents’ answers. Where open-ended questions are necessary, human coders often code the answers manually. When data sets are large, it is impractical or too costly to manually code all answer texts. Instead, text answers can be converted into numerical variables, and a statistical/machine learning algorithm can be trained on a subset of manually coded data. This statistical model is then used to predict the codes of the remainder. We consider open-ended questions where the answers are coded into multiple labels (all-that-apply questions). For example, in the open-ended question in our Happy example, respondents are explicitly told they may list multiple things that make them happy. Algorithms for multilabel data take into account the correlation among the answer codes and may therefore give better prediction results. For example, when giving examples of civil disobedience, respondents talking about “minor nonviolent offenses” were also likely to talk about “crimes.” We compare the performance of two different multilabel algorithms (random k-labelsets [RAKEL], classifier chains [CC]) to the default method of binary relevance (BR), which applies single-label algorithms to each code separately. Performance is evaluated on data from three open-ended questions (Happy, Civil Disobedience, and Immigrant). We found weak bivariate label correlations in the Happy data (90th percentile: 7.6%), and stronger bivariate label correlations in the Civil Disobedience (90th percentile: 17.2%) and Immigrant (90th percentile: 19.2%) data. For the data with stronger correlations, we found that both multilabel methods performed substantially better than BR under 0/1 loss (“at least one label is incorrect”) but had little effect under Hamming loss (average error). For data with weak label correlations, we found no difference in performance between multilabel methods and BR. We conclude that automatic classification of open-ended questions that allow multiple answers may benefit from using multilabel algorithms for 0/1 loss. The degree of correlation among the labels may be a useful prognostic tool.
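
The two evaluation measures contrasted above can behave quite differently on the same predictions. The following minimal sketch, with invented codes rather than the survey data from the paper, computes both for two hypothetical sets of predictions that make the same number of label errors but distribute them differently across answers.

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of answers with at least one incorrect label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of individual label decisions that are incorrect."""
    errors = sum(
        ti != pi
        for t, p in zip(y_true, y_pred)
        for ti, pi in zip(t, p)
    )
    return errors / (len(y_true) * len(y_true[0]))


# Hypothetical codes: 4 answers, 3 possible labels each.
y_true = [(1, 0, 0), (0, 1, 1), (1, 1, 0), (0, 0, 1)]

# Four label errors spread over all four answers.
pred_spread = [(1, 0, 1), (0, 1, 0), (1, 0, 0), (0, 0, 0)]

# Four label errors concentrated in the last two answers.
pred_concentrated = [(1, 0, 0), (0, 1, 1), (0, 0, 0), (1, 0, 0)]

for name, pred in [("spread", pred_spread), ("concentrated", pred_concentrated)]:
    print(name, zero_one_loss(y_true, pred), hamming_loss(y_true, pred))
```

Both prediction sets have the same Hamming loss, yet the concentrated one gets half of the answers entirely right while the spread one gets none right, which is why a method can improve 0/1 loss while leaving Hamming loss nearly unchanged.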

Student Expectations Regarding Online Learning: Implications For Distance Learning Programs

Journal of College Teaching & Learning (TLC) ◽

10.19030/tlc.v1i10.2002 ◽

2004 ◽

Vol 1 (10) ◽

Author(s):

Duncan G. LaBay ◽

Clare L. Comm

Keyword(s):

Distance Learning ◽

Online Courses ◽

Initial Step ◽

Public University ◽

Traditional Classroom ◽

Student Expectations ◽

Traditional Course ◽

Online Coursework ◽

Learning Programs

What are student expectations in a traditional course versus a distance learning course? The authors analyze student course selection and expected outcomes from data collected in an undergraduate marketing course at a public university in the Northeast. Key findings reveal that students generally have a favorable predisposition towards online coursework despite their beliefs that online courses require more work and have lower learning outcomes. Further, this case study provides an initial step in better understanding student expectations in online courses as well as in the traditional classroom.

Ontology-based multi-label classification of economic articles

Computer Science and Information Systems ◽

10.2298/csis100420034v ◽

2011 ◽

Vol 8 (1) ◽

pp. 101-119 ◽

Cited By ~ 24

Author(s):

Sergeja Vogrincic ◽

Zoran Bosnic

Keyword(s):

Good Alternative ◽

Supervised Machine Learning ◽

Classification Models ◽

Classification Methods ◽

Single Class ◽

Ranking Models ◽

Binary Relevance ◽

Test Corpus ◽

Document Categorization

The paper presents an approach to the task of automatic document categorization in the field of economics. Since the documents can be annotated with multiple keywords (labels), we approach this task by applying and evaluating multi-label classification methods of supervised machine learning. We describe forming a test corpus of 1015 economic documents that we automatically classify using a tool which integrates ontology construction with text mining methods. In our experimental work, we evaluate three groups of multi-label classification approaches: transformation to single-class problems, specialized multi-label models, and hierarchical/ranking models. The classification accuracies of all tested classification models indicate that there is potential for using all of the evaluated methods to solve this task. The results show the benefits of the more complex groups of approaches, which exploit dependence between the labels. Single-class naive Bayes classifiers coupled with the binary relevance transformation approach are also a good alternative to these approaches.

Learning IS-A relations from specialized-domain texts with co-occurrence measures

Journal of Computer-Assisted Linguistic Research ◽

10.4995/jclr.2018.9916 ◽

2018 ◽

Vol 2 (1) ◽

pp. 21

Author(s):

Pedro Urena

Keyword(s):

Language Processing ◽

Intelligent Systems ◽

Knowledge Engineering ◽

Classification Problem ◽

Chemical Substance ◽

Distributional Semantics ◽

Input Unit ◽

Feature Extraction Method

Ontology enrichment is a classification problem in which an algorithm categorizes an input conceptual unit in the corresponding node in a target ontology. Conceptual enrichment is of great importance both to Knowledge Engineering and Natural Language Processing, because it helps maximize the efficacy of intelligent systems, making them more adaptable to scenarios where information is produced by means of language. Following previous research on distributional semantics, this paper presents a case study of ontology enrichment using a feature-extraction method which relies on collocational information from corpora. The major advantage of this method is that it can help locate an input unit within its corresponding superordinate node in a taxonomy using a relatively small number of lexical features. In order to evaluate the proposed framework, this paper presents an experiment consisting of the automatic classification of a chemical substance in a taxonomy of toxicology.

An Exploration of Differences Among Community of Inquiry Indicators in Low and High Disenrollment Online Courses

Online Learning ◽

10.24059/olj.v15i2.196 ◽

2011 ◽

Vol 15 (2) ◽

Author(s):

Philip Ice ◽

Angela M. Gibson ◽

Wally Boston ◽

Dave Becher

Keyword(s):

Instructional Design ◽

Student Retention ◽

Student Satisfaction ◽

Community Of Inquiry ◽

Online Courses ◽

Public University ◽

Cognitive Presence ◽

Cognitive Integration ◽

Face To Face ◽

Rapid Pace

Though online enrollments continue to accelerate at a rapid pace, there is significant concern over student retention. With drop rates significantly higher than in face-to-face classes, it is imperative that online providers develop an understanding of the factors that lead students to disenroll. This study examines course-level disenrollment through the lens of student satisfaction with the projection of Teaching, Social and Cognitive Presence. In comparing the highest and lowest disenrollment quartiles of all courses at American Public University, the value of effective Instructional Design and Organization and the initiation of the Triggering Event phase of Cognitive Presence were found to be significant predictors of student satisfaction in the lowest disenrollment quartile. For the highest disenrollment quartile, the lack of follow-through vis-à-vis Facilitation of Discourse and Cognitive Integration was found to be a negative predictor of student satisfaction.

High Achiever's Learning Style: A Case Study of a Student on the President's List at a Public University

SSRN Electronic Journal ◽

10.2139/ssrn.2189177 ◽

2012 ◽

Author(s):

Wirawani Kamarulzaman

Keyword(s):

Learning Style ◽

Public University

Assessment of health/functioning of older adults who consume psychoactive substances

Revista Brasileira de Enfermagem ◽

10.1590/0034-7167-2016-0637 ◽

2018 ◽

Vol 71 (3) ◽

pp. 942-950

Author(s):

Vania Dias Cruz ◽

Silvana Sidney Costa Santos ◽

Jamila Geri Tomaschewski-Barlem ◽

Bárbara Tarouco da Silva ◽

Celmira Lange ◽

...

Keyword(s):

Older Adults ◽

Qualitative Case Study ◽

International Classification Of Functioning ◽

Psychoactive Substances ◽

Systematic Observation ◽

Health Functioning ◽

Final Consideration ◽

Theory Of Complexity

Objective: To assess the health/functioning of the older adult who consumes psychoactive substances through the International Classification of Functioning, Disability and Health, considering the theory of complexity. Method: Qualitative case study, with 11 older adults, held between December 2015 and February 2016 in the state of Rio Grande do Sul, using interviews, documents and non-systematic observation. It was approved by the ethics committee. The analysis followed the propositions of the case study, using the complexity of Morin as theoretical basis. Results: We identified older adults who consider themselves healthy and show alterations - the alterations can be exacerbated by the use of psychoactive substances - of health/functioning expected according to the natural course of aging such as: systemic arterial hypertension; depressive symptoms; dizziness; tinnitus; harmed sleep/rest; and inadequate food and water consumption. Final consideration: The assessment of health/functioning of older adults who use psychoactive substances, guided by complex thinking, exceeds the accuracy limits to risk the understanding of the phenomena in its complexity.

ER rule classifier with an optimization operator recommendation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210629 ◽

2021 ◽

pp. 1-13

Author(s):

Xiaoyan Wang ◽

Jianbin Sun ◽

Qingsong Zhao ◽

Yaqian You ◽

Jiang Jiang

Keyword(s):

Classification Accuracy ◽

Comprehensive Analysis ◽

Small Sample ◽

Empirical Knowledge ◽

Classification Methods ◽

Turbofan Engine ◽

Training Samples ◽

Optimization Operator ◽

Recommendation Strategy

It is difficult for many classic classification methods to consider expert experience and classify small-sample datasets well. The evidential reasoning rule (ER rule) classifier can solve these problems. The ER rule has strong processing and comprehensive analysis abilities for diversified mixed information and can solve problems with expert experience effectively. Moreover, the initial parameters of the classifier constructed based on the ER rule can be set according to empirical knowledge instead of being trained by a large number of samples, which can help the classifier classify small-sample datasets well. However, the initial parameters of the ER rule classifier need to be optimized, and choosing the best optimization algorithm is still a challenge. Considering these problems, the ER rule classifier with an optimization operator recommendation is proposed in this paper. First, the initial ER rule classifier is constructed based on training samples and expert experience. Second, the adjustable parameters are optimized, in which the optimization operator recommendation strategy is applied to select the best algorithm by partial samples, and then experiments with full samples are carried out. Finally, a case study on a turbofan engine degradation simulation dataset is carried out, and the results indicate that the ER rule classifier has a higher classification accuracy than other classic classifiers, which demonstrates the capability and effectiveness of the proposed ER rule classifier with an optimization operator recommendation.
