Hierarchical Temporal Multi-Instance Learning for Video-based Student Learning Engagement Assessment

Video-based automatic assessment of a student's learning engagement on the fly can provide immense value for delivering personalized instructional services, a vehicle particularly important for massive online education. To train such an assessor, a major challenge lies in collecting sufficient labels at the appropriate temporal granularity, since a learner's engagement status may change continuously throughout a study session. Supplying labels at either the frame or the clip level incurs a high annotation cost. To overcome this challenge, this paper proposes a novel hierarchical multiple instance learning (MIL) solution, which requires only labels anchored on full-length videos to learn to assess student engagement at an arbitrary temporal granularity and for an arbitrary duration in a study session. The hierarchical model mainly comprises a bottom module and a top module, dedicated respectively to learning the latent relationship between a clip and its constituent frames and that between a video and its constituent clips, under the training-stage constraint that the average engagement of the local clips equals the video label. To verify the effectiveness of our method, we compare the performance of the proposed approach with that of several state-of-the-art peer solutions through extensive experiments.
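The training-stage constraint described in this abstract can be illustrated with a small sketch. It is not the authors' model: the function names, the mean pooling of frame scores, and the squared penalty are all assumptions chosen only to show how a video-level label can supervise clip-level scores.

```python
from statistics import fmean


def clip_engagement(frame_scores):
    # Bottom level: pool frame-level scores into one clip score.
    # Mean pooling is an assumption made for this sketch.
    return fmean(frame_scores)


def video_constraint_penalty(clips, video_label):
    # Top level: penalize the gap between the mean clip engagement
    # and the single label attached to the full-length video.
    clip_scores = [clip_engagement(frames) for frames in clips]
    return (fmean(clip_scores) - video_label) ** 2


# Three clips of two frames each (hypothetical scores in [0, 1]).
clips = [[0.2, 0.4], [0.8, 1.0], [0.5, 0.7]]
penalty = video_constraint_penalty(clips, video_label=0.6)
```

Here the clip scores average to the video label, so the penalty is essentially zero even though individual clips differ widely, which is what lets clip-level estimates vary within a session.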

Interpreting testing and assessment: A state-of-the-art review

Language Testing ◽

10.1177/02655322211036100 ◽

2021 ◽

pp. 026553222110361

Author(s):

Chao Han

Keyword(s):

State Of The Art ◽

Professional Certification ◽

Automatic Assessment ◽

Prospective Students ◽

Assessment Practice ◽

High Stakes ◽

Future Directions ◽

Assessment Tasks ◽

Interpreter Education ◽

Selection Of

Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the selection of prospective students, the certification of interpreters, and the confirmation/refutation of research hypotheses. However, few reviews exist providing a comprehensive mapping of relevant practice and research. The present article therefore aims to offer a state-of-the-art review, summarizing the existing literature and discovering potential lacunae. In particular, the article first provides an overview of interpreting ability/competence and relevant research, followed by main testing and assessment practice (e.g., assessment tasks, assessment criteria, scoring methods, specificities of scoring operationalization), with a focus on operational diversity and psychometric properties. Second, the review describes a limited yet steadily growing body of empirical research that examines rater-mediated interpreting assessment, and casts light on automatic assessment as an emerging research topic. Third, the review discusses epistemological, psychometric, and practical challenges facing interpreting testers. Finally, it identifies future directions that could address the challenges arising from fast-changing pedagogical, educational, and professional landscapes.

Compact dictionary pair learning and refining based on principal components analysis

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691319500334 ◽

2019 ◽

Vol 17 (05) ◽

pp. 1950033

Author(s):

Yibin Yu ◽

Min Yang ◽

Yulan Zhang ◽

Shifang Yuan

Keyword(s):

Principal Components Analysis ◽

Principal Components ◽

State Of The Art ◽

Great Success ◽

Computation Complexity ◽

Training Stage ◽

Learned Dictionary ◽

Sample Data ◽

Components Analysis ◽

Row Space

Although traditional dictionary learning (DL) methods have achieved great success in pattern recognition and machine learning, they are extremely time-consuming, especially in the training stage. Projective dictionary pair learning (DPL) learns the synthesis dictionary and the analysis dictionary jointly to achieve a fast and accurate classifier. However, because the dictionary pair is initialized as random matrices without using any information from the data samples, it requires many iterations to ensure convergence. In this paper, we propose a novel compact DPL and refining method based on the observation that the eigenvalue curve of the sample data covariance matrix usually decreases very fast, which means we can compact the synthesis dictionary and the analysis dictionary. In the first stage, for each class of data samples, we use principal components analysis (PCA) to retain globally important information and to compact the row space of the synthesis dictionary and the column space of the analysis dictionary. We then refine the learned dictionary pair to achieve a more accurate classifier during compact dictionary pair refining, which combines the orthogonality of PCA with the redundancy of DL. We solve this refining problem entirely in closed form, which significantly reduces the computational complexity. Experimental results on the Extended YaleB database and the AR database show that the proposed method achieves competitive accuracy and low computational complexity compared with other state-of-the-art methods.
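The observation about a fast-decaying eigenvalue curve can be made concrete with a short sketch. It assumes the covariance eigenvalues are already computed and strictly positive in total, and it picks the compact dimension with a cumulative-energy threshold. The function name and the 0.85 threshold are hypothetical; the abstract does not say how the authors choose the dimension.

```python
def components_to_keep(eigenvalues, energy=0.85):
    # Sort the spectrum from largest to smallest eigenvalue.
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    # Return the smallest k whose leading eigenvalues reach the
    # requested share of the total variance.
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= energy:
            return k
    return len(ordered)


# Hypothetical, quickly decaying spectrum for one class.
spectrum = [5.0, 3.0, 1.0, 0.5, 0.5]
k = components_to_keep(spectrum)
```

When the spectrum decays quickly, k is much smaller than the data dimension, which is the sense in which the dictionaries can be compacted.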

Loss-Based Attention for Deep Multiple Instance Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6030 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5742-5749

Author(s):

Xiaoshuang Shi ◽

Fuyong Xing ◽

Yuanpu Xie ◽

Zizhao Zhang ◽

Lei Cui ◽

...

Keyword(s):

State Of The Art ◽

Classification Performance ◽

Multiple Instance Learning ◽

Attention Mechanism ◽

Cross Entropy ◽

General Category ◽

Source Codes ◽

Model Generalization ◽

Fully Connected ◽

Entropy Functions

Although attention mechanisms have been widely used in deep learning for many tasks, they are rarely used to solve multiple instance learning (MIL) problems, where only a general category label is given for the multiple instances contained in one bag. Additionally, previous deep MIL methods first use the attention mechanism to learn instance weights and then employ a fully connected layer to predict the bag label, so that the bag prediction is largely determined by the effectiveness of the learned instance weights. To alleviate this issue, in this paper, we propose a novel loss-based attention mechanism for deep multiple instance learning, which simultaneously learns instance weights, instance predictions, and bag predictions. Specifically, it calculates instance weights based on the loss function, e.g. softmax+cross-entropy, and shares the parameters with the fully connected layer, which produces the instance and bag predictions. Additionally, a regularization term consisting of learned weights and cross-entropy functions is used to boost the recall of instances, and a consistency cost is used to smooth the training process of neural networks and thereby boost the model's generalization performance. Extensive experiments on multiple types of benchmark databases demonstrate that the proposed attention mechanism is a general, effective and efficient framework, which can achieve superior bag and image classification performance over other state-of-the-art MIL methods, while obtaining higher instance precision and recall than previous attention mechanisms. Source code is available at https://github.com/xsshi2015/Loss-Attention.
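The idea of sharing parameters between instance weighting and prediction can be illustrated with a deliberately simplified sketch. This is a generic attention-pooling example for a binary bag label, not the formulation from the paper: the weight vector `w`, the softmax over instance logits, and all function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values):
    # Subtract the maximum for numerical stability.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def bag_prediction(bag, w):
    # One shared weight vector produces the instance logits.
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for x in bag]
    # Attention weights and instance probabilities both come from
    # the same logits, so they share parameters.
    weights = softmax(logits)
    instance_probs = [sigmoid(z) for z in logits]
    # The bag probability is the attention-weighted instance probability.
    bag_prob = sum(a * p for a, p in zip(weights, instance_probs))
    return weights, instance_probs, bag_prob


weights, instance_probs, bag_prob = bag_prediction(
    [[1.0, 0.0], [0.0, 1.0]], w=[2.0, -2.0]
)
```

In this toy bag the first instance receives almost all of the attention, so the bag probability stays close to that instance's own probability.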

Multiple-Instance Learning via an RBF Kernel-Based Extreme Learning Machine

Journal of Intelligent Systems ◽

10.1515/jisys-2015-0011 ◽

2017 ◽

Vol 26 (1) ◽

pp. 185-195 ◽

Cited By ~ 3

Author(s):

Jie Wang ◽

Liangjian Cai ◽

Xin Zhao

Keyword(s):

Extreme Learning Machine ◽

State Of The Art ◽

Multiple Instance Learning ◽

Training Data ◽

Data Sets ◽

Rbf Kernel ◽

Kernel Extreme Learning Machine ◽

Learning Machine ◽

Hidden Layer ◽

Instance Space

As we are usually confronted with a large instance space for real-world data sets, it is important to develop a useful and efficient multiple-instance learning (MIL) algorithm. MIL, where training data are prepared in the form of labeled bags rather than labeled instances, is a variant of supervised learning. This paper presents a novel MIL algorithm for an extreme learning machine called MI-ELM. A radial basis kernel extreme learning machine is adapted to approach the MIL problem, using the Hausdorff distance to measure the distance between bags. The clusters in the hidden layer are composed of bags that are randomly generated. Because we do not need to tune the parameters for the hidden layer, MI-ELM can learn very fast. The experimental results on classification and multiple-instance regression data sets demonstrate that MI-ELM is useful and efficient compared with state-of-the-art algorithms.
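A distance between bags, and a radial basis kernel built on it, can be sketched as follows. The sketch assumes the classical maximal Hausdorff distance over Euclidean instance distances; the abstract does not state which Hausdorff variant MI-ELM uses, and the `gamma` parameter and function names are hypothetical.

```python
import math


def directed_hausdorff(bag_a, bag_b):
    # For each instance in bag_a, find its nearest instance in bag_b,
    # then take the worst (largest) of those nearest distances.
    return max(min(math.dist(a, b) for b in bag_b) for a in bag_a)


def hausdorff(bag_a, bag_b):
    # Symmetric Hausdorff distance: the larger of the two directions.
    return max(directed_hausdorff(bag_a, bag_b),
               directed_hausdorff(bag_b, bag_a))


def rbf_bag_kernel(bag_a, bag_b, gamma=0.5):
    # Radial basis kernel evaluated on the bag-level distance.
    return math.exp(-gamma * hausdorff(bag_a, bag_b) ** 2)


bag_a = [(0.0, 0.0), (1.0, 0.0)]
bag_b = [(0.0, 0.0), (4.0, 0.0)]
distance = hausdorff(bag_a, bag_b)  # 3.0
```

The instance (4, 0) in the second bag lies 3 units from its nearest neighbour in the first bag, and that worst case sets the bag distance.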

Detection of Malpractice in E-exams by Head Pose and Gaze Estimation

International Journal of Emerging Technologies in Learning (iJET) ◽

10.3991/ijet.v16i08.15995 ◽

2021 ◽

Vol 16 (08) ◽

pp. 47

Author(s):

Chirag S Indi ◽

Varun Pritham ◽

Vasundhara Acharya ◽

Krishna Prakasha

Keyword(s):

Online Education ◽

State Of The Art ◽

Eye Gaze ◽

Visual Aids ◽

Head Pose ◽

Machine Learn ◽

Hybrid Classifier ◽

Use Of Technology ◽

System Use ◽

Transmission Quality

Examination malpractice is a deliberate wrongdoing contrary to official examination rules, designed to place a candidate at an unfair advantage or disadvantage. The proposed system presents a new use of technology to identify malpractice in e-exams, which is essential given the growth of online education. Current solutions to this problem either require entirely manual labor or have various vulnerabilities that an examinee can exploit. The proposed application encompasses an end-to-end system that, with the help of visual aids, assists an examiner/evaluator in deciding whether a student has passed an online exam without any probable attempt at malpractice or cheating. The system works by categorizing the student's VFOA (visual focus of attention) data, capturing head pose estimates and eye gaze estimates using state-of-the-art machine learning techniques. The system only requires the student (test-taker) to have a functioning internet connection along with a webcam to transmit the feed. The examiner is alerted when the student's VFOA wavers from the screen more than X times, where X is a predefined threshold. If this threshold X is crossed, the application saves the data recorded while the student's VFOA was off the screen and sends it to the examiner, who manually checks and marks whether the student's action was an attempt at malpractice or just a momentary lapse in concentration. The system uses a hybrid approach with two different classifiers: one is used when gaze values are being read successfully; when gaze reading fails (for reasons such as transmission quality or glare from the student's spectacles), the model falls back to the default classifier, which reads only the head pose values. The classifiers produce the attention metric, which is used to map the student's VFOA and check the likelihood of malpractice. The model has achieved an accuracy of 96.04 percent in classifying the attention metric.
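The fallback logic and the threshold X described in this abstract can be sketched in a few lines. Everything concrete here is an assumption: the two "classifiers" are stand-in angle rules rather than trained models, the 20-degree limit is invented, and the frame format of (head yaw, gaze yaw or None) is hypothetical.

```python
def on_screen(head_yaw, gaze_yaw=None, limit=20.0):
    # Stand-in for the hybrid classifier: use head pose plus gaze when a
    # gaze reading exists, otherwise fall back to head pose alone.
    if gaze_yaw is not None:
        return abs(head_yaw + gaze_yaw) <= limit
    return abs(head_yaw) <= limit


def count_off_screen_events(frames):
    # Count transitions from on-screen to off-screen, so one long glance
    # away is a single event rather than one event per frame.
    events = 0
    was_on = True
    for head_yaw, gaze_yaw in frames:
        now_on = on_screen(head_yaw, gaze_yaw)
        if was_on and not now_on:
            events += 1
        was_on = now_on
    return events


def should_alert(frames, x):
    # Alert the examiner only when the count exceeds the threshold X.
    return count_off_screen_events(frames) > x


# (head yaw, gaze yaw) per frame; None marks a failed gaze reading.
frames = [(0, 0), (30, None), (5, 2), (10, 25), (12, 30), (0, None)]
events = count_off_screen_events(frames)
```

Counting transitions instead of frames is a design choice of this sketch; the abstract only says that the number of times the VFOA wavers is compared with X.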

Background Suppression Network for Weakly-Supervised Temporal Action Localization

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6793 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11320-11327 ◽

Cited By ~ 4

Author(s):

Pilhyeon Lee ◽

Youngjung Uh ◽

Hyeran Byun

Keyword(s):

State Of The Art ◽

Background Suppression ◽

Not Given ◽

Training Strategy ◽

Training Stage ◽

Action Localization ◽

Localization Performance ◽

Weakly Supervised ◽

Branch Weight ◽

Temporal Action

Weakly-supervised temporal action localization is a very challenging problem because frame-wise labels are not given in the training stage while the only hint is video-level labels: whether each video contains action frames of interest. Previous methods aggregate frame-level class scores to produce video-level prediction and learn from video-level action labels. This formulation does not fully model the problem in that background frames are forced to be misclassified as action classes to predict video-level labels accurately. In this paper, we design Background Suppression Network (BaS-Net) which introduces an auxiliary class for background and has a two-branch weight-sharing architecture with an asymmetrical training strategy. This enables BaS-Net to suppress activations from background frames to improve localization performance. Extensive experiments demonstrate the effectiveness of BaS-Net and its superiority over the state-of-the-art methods on the most popular benchmarks – THUMOS'14 and ActivityNet. Our code and the trained model are available at https://github.com/Pilhyeon/BaSNet-pytorch.
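The aggregation step mentioned in this abstract, in which frame-level class scores are pooled into a video-level prediction, can be sketched as below. The last score column stands for the auxiliary background class. The top-k mean pooling and the function name are assumptions for illustration; the abstract does not specify the aggregation rule, and the sketch omits the two-branch architecture entirely.

```python
def video_level_scores(frame_scores, k):
    # frame_scores: one list of class scores per frame, with the last
    # column reserved for the auxiliary background class.
    num_classes = len(frame_scores[0])
    pooled = []
    for c in range(num_classes):
        # Average the k highest frame scores for this class.
        top = sorted((frame[c] for frame in frame_scores), reverse=True)[:k]
        pooled.append(sum(top) / len(top))
    return pooled


# Four frames, columns = [action, background] (hypothetical scores).
frame_scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
scores = video_level_scores(frame_scores, k=2)
```

With an explicit background column, frames that contain no action can score highly as background instead of being forced into an action class.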

Improving Generalization via Attribute Selection on Out-of-the-Box Data

Neural Computation ◽

10.1162/neco_a_01256 ◽

2020 ◽

Vol 32 (2) ◽

pp. 485-514

Author(s):

Xiaofeng Xu ◽

Ivor W. Tsang ◽

Chuancai Liu

Keyword(s):

Test Data ◽

State Of The Art ◽

Attribute Selection ◽

Generalization Error ◽

Generalization Capability ◽

Training Stage ◽

Unseen Data ◽

Negative Impacts ◽

Unseen Objects ◽

Pseudo Data

Zero-shot learning (ZSL) aims to recognize unseen objects (test classes) given some other seen objects (training classes) by sharing information of attributes between different objects. Attributes are artificially annotated for objects and treated equally in recent ZSL tasks. However, some inferior attributes with poor predictability or poor discriminability may have negative impacts on the ZSL system performance. This letter first derives a generalization error bound for ZSL tasks. Our theoretical analysis verifies that selecting the subset of key attributes can improve the generalization performance of the original ZSL model, which uses all the attributes. Unfortunately, previous attribute selection methods have been conducted based on the seen data, and their selected attributes have poor generalization capability to the unseen data, which is unavailable in the training stage of ZSL tasks. Inspired by learning from pseudo-relevance feedback, this letter introduces out-of-the-box data—pseudo-data generated by an attribute-guided generative model—to mimic the unseen data. We then present an iterative attribute selection (IAS) strategy that iteratively selects key attributes based on the out-of-the-box data. Since the distribution of the generated out-of-the-box data is similar to that of the test data, the key attributes selected by IAS can be effectively generalized to test data. Extensive experiments demonstrate that IAS can significantly improve existing attribute-based ZSL methods and achieve state-of-the-art performance.

Weakly Supervised Sentiment-Specific Region Discovery for VSA

The Computer Journal ◽

10.1093/comjnl/bxaa112 ◽

2020 ◽

Author(s):

Luoyang Xue ◽

Ang Xu ◽

Qirong Mao ◽

Lijian Gao ◽

Jie Chen

Keyword(s):

Sentiment Analysis ◽

State Of The Art ◽

Multiple Instance Learning ◽

Local Information ◽

Specific Region ◽

Convolution Kernels ◽

Weakly Supervised ◽

Candidate Regions ◽

Region Discovery ◽

Object Level

Local information contributes significantly to visual sentiment analysis (VSA). Recent studies on local region discovery require manual annotation of region locations. Affective local information learning and automatic discovery of sentiment-specific regions remain challenges in VSA. In this paper, we propose an end-to-end VSA method for weakly supervised sentiment-specific region discovery. Our method contains two branches: an automatic sentiment-specific region discovery branch and a sentiment analysis branch. In the sentiment-specific region discovery branch, a region proposal network with multiple convolution kernels is proposed to generate candidate affective regions. Then, we design a multiple instance learning (MIL) loss to remove redundant and noisy candidate regions. Finally, the sentiment analysis branch integrates both holistic information and the localized information obtained in the first branch, by feature map coupling, for final sentiment classification. Our method automatically discovers sentiment-specific regions under the constraint of the MIL loss function, without object-level labels. Quantitative and qualitative evaluations on four benchmark affective datasets demonstrate that our proposed method outperforms the state-of-the-art methods.

Patch Based Multiple Instance Learning Algorithm for Object Tracking

Computational Intelligence and Neuroscience ◽

10.1155/2017/2426475 ◽

2017 ◽

Vol 2017 ◽

pp. 1-7 ◽

Cited By ~ 4

Author(s):

Zhenjie Wang ◽

Lijia Wang ◽

Hua Zhang

Keyword(s):

Object Tracking ◽

Real Time ◽

Learning Algorithm ◽

State Of The Art ◽

Multiple Instance Learning ◽

Partial Occlusion ◽

Learning Rates ◽

Classification Score ◽

Object Based ◽

Illumination Changes

To deal with the problems of illumination changes, pose variations, and serious partial occlusion, a patch-based multiple instance learning (P-MIL) algorithm is proposed. The algorithm divides an object into many blocks. Then, the online MIL algorithm is applied to each block to obtain a strong classifier. The algorithm takes into account both the average classification score and the classification scores of all the blocks when detecting the object. In particular, compared with the whole-object-based MIL algorithm, the P-MIL algorithm detects the object according to the unoccluded patches when partial occlusion occurs. After the object is detected, the learning rates for updating the weak classifiers' parameters are adaptively tuned. This classifier updating strategy avoids overupdating and underupdating the parameters. Finally, the proposed method is compared with other state-of-the-art algorithms on several classical videos. The experimental results illustrate that the proposed method performs well, especially in cases of illumination changes, pose variations, and partial occlusion. Moreover, the algorithm achieves real-time object tracking.

Zoonotic Diseases Identified by the State-Of-The-Art Zoe Project Research in Romania

Bulletin of University of Agricultural Sciences and Veterinary Medicine Cluj-Napoca Veterinary Medicine ◽

10.15835/buasvmcn-vm:001817 ◽

2018 ◽

Vol 75 (1) ◽

pp. 15

Author(s):

Cintia COLIBABA ◽

Anca Cristina COLIBAB ◽

Irina GHEORGHIU ◽

Stefan COLIBABA ◽

Ovidiu URSA ◽

...

Keyword(s):

Online Education ◽

State Of The Art ◽

Zoonotic Diseases ◽

The State ◽

Point Of View ◽

Local Context ◽

Online Course ◽

Infectious Agents ◽

Art Research ◽

Disease Present

The article is based on the ZOE project, Zoonoses Online Education (2016-1-RO01-KA203-024732), funded by Erasmus+. The project aims to create open digital educational resources in the field of veterinary medicine based on developing innovative guidelines on zoonotic diseases. The article looks at the main findings of the project's state-of-the-art research in Romania on the 5-10 most encountered and identified zoonoses in the last 10 years at national level. The project research has evaluated the medical literature on zoonotic diseases in the veterinary field and highlighted a variety of diseases that cover the spectrum of infectious agents and display a variety of transmission patterns in different environments, identifying ways of intervention in the local context. This article looks at the second most widespread disease present in the state-of-the-art research, Leishmaniasis. The analysis will be incorporated into a guide and an open online course on the main infectious diseases transmitted from non-human animals to humans, including videos capturing zoonoses bio-manipulation in simulation centres. It will also be part of a guide and an open online course on medical communication, including the zoonoses videos processed from a linguistic, cultural and communication point of view, available in six languages.
