Learning fine-grained estimation of physiological states from coarse-grained labels by distribution restoration

2020
Vol 10 (1)
Author(s):
Zengyi Qin
Jiansheng Chen
Zhenyu Jiang
Xumin Yu
Chunhua Hu
...

Abstract: Due to its importance in clinical science, the estimation of physiological states (e.g., the severity of pathological tremor) has attracted growing interest in the machine learning community. Although a physiological state is a continuous variable, its continuity is lost when it is quantized into a few discrete classes during recording and labeling. This discreteness introduces a misalignment between the true value and its label, so the labels are imprecise and coarse-grained. Most previous work ignored this inaccuracy and trained machine learning algorithms directly on the coarse labels, so their predictions are also coarse-grained. In this work, we propose to learn a precise, fine-grained estimation of physiological states from these coarse-grained ground truths. Building on a mathematically rigorous proof, we use the imprecise labels to restore the probability distribution of the precise labels in an approximately order-preserving fashion; a deep neural network then learns from this distribution and provides fine-grained estimates. We demonstrate the effectiveness of our approach in assessing pathological tremor in Parkinson’s disease and in estimating systolic blood pressure from bioelectrical signals.
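
The misalignment between a continuous state and its quantized label can be illustrated with a minimal sketch. The 0 to 4 severity scale, the five classes, and the function names are assumptions made for illustration only; the abstract does not give the scale used, and this is not the authors' restoration method.

```python
import math

def quantize(severity, n_levels=5, max_severity=4.0):
    """Map a continuous severity in [0, max_severity] to the nearest class index.

    The scale and the number of classes are hypothetical.
    """
    step = max_severity / (n_levels - 1)
    # floor(x + 0.5) rounds half up, avoiding Python's round-half-to-even.
    level = math.floor(severity / step + 0.5)
    return min(max(level, 0), n_levels - 1)

def label_error(severity, n_levels=5, max_severity=4.0):
    """Absolute gap between the true value and the value its coarse label stands for."""
    step = max_severity / (n_levels - 1)
    return abs(severity - quantize(severity, n_levels, max_severity) * step)

# Two clearly different true states collapse onto the same coarse label.
same_label = quantize(1.6) == quantize(2.4)
```

With evenly spaced classes the gap never exceeds half the class spacing, which is the information a model trained directly on the labels cannot see.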

2018
Vol 18 (3-4)
pp. 623-637
Author(s):
ARINDAM MITRA
CHITTA BARAL

Abstract: Over the years the Artificial Intelligence (AI) community has produced several datasets that have given machine learning algorithms the opportunity to learn various skills across various domains. However, a subclass of these algorithms that aims to learn logic programs, namely the Inductive Logic Programming algorithms, has often failed at the task due to the vastness of these datasets. This has limited the usability of knowledge representation and reasoning techniques in the development of AI systems. In this research, we try to address this scalability issue for algorithms that learn answer set programs. We present a sound and complete algorithm that takes its input in a slightly different manner and performs an efficient and more user-controlled search for a solution. We show via experiments that our algorithm can learn from two popular datasets from the machine learning community, namely bAbI (a question answering dataset) and MNIST (a dataset for handwritten digit recognition), which to the best of our knowledge was not previously possible. The system is publicly available at https://goo.gl/KdWAcV.


2020
pp. 1-26
Author(s):
Joshua Eykens
Raf Guns
Tim C.E. Engels

We compare two supervised machine learning algorithms, Multinomial Naïve Bayes and Gradient Boosting, for classifying social science articles using textual data. The high level of granularity of the classification scheme and the possibility that multiple categories are assigned to a document make this task challenging. To collect the training data, we query three discipline-specific thesauri to retrieve articles corresponding to specialties in the classification. The resulting dataset consists of 113,909 records and covers 245 specialties, aggregated into 31 subdisciplines from three disciplines. Experts were consulted to validate the thesauri-based classification. The resulting multi-label dataset is used to train the machine learning algorithms in different configurations. We deploy a multi-label classifier chaining model, which allows an arbitrary number of categories to be assigned to each document. The best results are obtained with Gradient Boosting. The approach does not rely on citation data, so it can be applied in settings where such information is not available. We conclude that fine-grained text-based classification of social sciences publications at a subdisciplinary level is a hard task, for humans and machines alike. A combination of human expertise and machine learning is suggested as a way forward to improve the classification of social sciences documents.


Author(s):  
Argelia B. Urbina Nájera
Jorge De la Calleja

Abstract: In this paper, we present a method for improving the academic tutoring process in higher education. The method automatically identifies the main skills of tutors using decision trees, one of the algorithms most widely used in the machine learning community for solving real-world problems with high accuracy. In our study, the decision tree algorithm was able to identify the skills and personal affinities between students and tutors. Experiments were carried out on a data set of 277 students and 19 tutors, selected by simple random sampling and voluntary participation, respectively. Preliminary results show that the most important attributes for tutors are communication, self-direction and digital skills. We also introduce a tutoring process in which the tutor assignment is based on these attributes, on the assumption that this can help strengthen the student skills demanded by today's society. Likewise, the decision tree obtained can be used to create clusters of tutors and clusters of students based on their personal abilities and affinities using other machine learning algorithms. Applying the suggested tutoring process could make it possible to view tutoring on its own terms, without linking it to academic performance or school dropout.


2021
Vol 8 (1)
pp. 205395172110175
Author(s):
Hendrik Heuer
Juliane Jarke
Andreas Breiter

Machine learning has become a key component of contemporary information systems. Unlike prior information systems explicitly programmed in formal languages, ML systems infer rules from data. This paper shows what this difference means for the critical analysis of socio-technical systems based on machine learning. To provide a foundation for future critical analysis of machine learning-based systems, we engage with how the term is framed and constructed in self-education resources. For this, we analyze machine learning tutorials, an important information source for self-learners and a key tool for the formation of the practices of the machine learning community. Our analysis identifies canonical examples of machine learning as well as important misconceptions and problematic framings. Our results show that machine learning is presented as being universally applicable and that the application of machine learning without special expertise is actively encouraged. Explanations of machine learning algorithms are missing or strongly limited. Meanwhile, the importance of data is vastly understated. This has implications for the manifestation of (new) social inequalities through machine learning-based systems.


2021
Vol 248
pp. 281-289
Author(s):
Vladislav Pelikh
Valery Salov
Alexander Burdonov
Nikita Lukyanov

The paper develops a model of baddeleyite recovery from the dump products of an apatite-baddeleyite processing plant using centrifugal concentrators. The work is relevant because it provides new knowledge on optimizing the technological parameters of centrifugal concentrators that use Knelson CVD (continuous variable discharge) technology, in particular the frequency of valve opening and the length of time the valves remain open. The purpose of the research was to assess the applicability of CVD technology to the treatment of various dump products of the processing plant and to build a model of the dependencies between the concentrate and tailings yields and the adjustable parameters, which will allow preliminary calculations of the efficiency of implementing this technology at processing plants. The research objects are the middling and main separation tailings of the coarse-grained stream and the combined product of main and recleaner separation tailings of the fine-grained stream. The study uses general methods of mathematical statistics: regression analysis, which aims to build statistically significant models describing the dependence of a particular variable on a set of regressors, and the group method of data handling, whose main idea is to build a set of models of a given class and choose the optimal one among them. The authors proposed an algorithm for processing the experimental results based on classical regression analysis and formulated an original criterion for model selection. Models of the dependencies between the concentrate and tailings yields and the adjustable parameters were built, which made it possible to establish a relationship between the concentrate yield and the valve opening time, as well as a relationship between the tailings yield and the G-force of the installation.


2019
pp. 016555151987182
Author(s):
Abinash Pujahari
Dilip Singh Sisodia

Clickbaits are online articles with deliberately misleading titles designed to lure more readers into opening the intended web page. Clickbaits are used to tempt visitors to click on a particular link, either to monetise the landing page or to spread false news for sensationalisation. The presence of clickbaits on a news aggregator portal may lead to an unpleasant experience for readers. Automatic detection of clickbait headlines among news headlines has been a challenging issue for the machine learning community. Many methods for preventing clickbait articles have been proposed in the recent past. However, the recent techniques available for detecting clickbaits are not very robust. This article proposes a hybrid categorisation technique for separating clickbait and non-clickbait articles by integrating different features, sentence structure and clustering. During preliminary categorisation, the headlines are separated using 11 features. After that, the headlines are recategorised using sentence formality and syntactic similarity measures. In the last phase, the headlines are again recategorised by applying clustering using word vector similarity based on the t-distributed stochastic neighbour embedding (t-SNE) approach. After categorisation of these headlines, machine learning models are applied to the dataset to evaluate machine learning algorithms. The experimental results indicate that the proposed hybrid model is more robust, reliable and efficient than any individual categorisation technique on the dataset we have used.


2020
Vol 39 (4)
pp. 5687-5698
Author(s):
Chunfeng Guo

There are currently few studies on stress in athletes, so it is impossible to provide effective stadium guidance for them. To address this, this study uses machine learning algorithms to identify athletes’ pre-game emotions. The research data are collected through a survey, the physiological parameters of athletes under stress are obtained experimentally, and these parameters are processed with a machine learning algorithm. To improve the efficiency of data processing, the study improves the traditional machine learning algorithm by combining the particle optimization algorithm with the support vector machine, which enables effective recognition of the athlete’s physiological state. The performance of the improved algorithm is then compared experimentally with that of the traditional algorithm, and the test results are analyzed. Finally, the effectiveness of the proposed algorithm is examined through an example analysis. The results show that the proposed algorithm performs better than the traditional algorithm, has practical significance, and can provide a theoretical reference for subsequent related research.
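
The abstract does not say how the particle optimization algorithm is coupled to the support vector machine. One common arrangement, assumed here, is particle swarm optimization (PSO) searching over SVM hyperparameters to minimize a validation error. The sketch below shows only the PSO loop; `validation_error` is a hypothetical stand-in objective (a smooth bowl), not a trained SVM, and the swarm settings are conventional defaults rather than values from the paper.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimize `objective` over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # best position seen by each particle
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # stay in the box
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def validation_error(params):
    """Hypothetical stand-in for an SVM validation error over two hyperparameters."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best_params, best_error = pso_minimize(validation_error, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a real pipeline the stand-in objective would be replaced by a function that trains an SVM with the candidate hyperparameters and returns its cross-validated error.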


2018
Vol 30 (10)
pp. 2805-2832
Author(s):
Richard M. Golden

Although the number of artificial neural network and machine learning architectures is growing at an exponential pace, more attention needs to be paid to theoretical guarantees of asymptotic convergence for novel, nonlinear, high-dimensional adaptive learning algorithms. When properly understood, such guarantees can guide the algorithm development and evaluation process and provide theoretical validation for a particular algorithm design. For many decades, the machine learning community has widely recognized the importance of stochastic approximation theory as a powerful tool for identifying explicit convergence conditions for adaptive learning machines. However, the verification of such conditions is challenging for multidisciplinary researchers not working in the area of stochastic approximation theory. For this reason, this letter presents a new stochastic approximation theorem for both passive and reactive learning environments with assumptions that are easily verifiable. The theorem is widely applicable to the analysis and design of important machine learning algorithms including deep learning algorithms with multiple strict local minimizers, Monte Carlo expectation-maximization algorithms, contrastive divergence learning in Markov fields, and policy gradient reinforcement learning.
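
For readers unfamiliar with stochastic approximation, the classical Robbins-Monro iteration gives a feel for what a convergence condition looks like. The sketch below is that textbook scheme, not the new theorem described in the letter; the function name and the choice of step size 1/t are illustrative assumptions. The step sizes 1/t satisfy the classical conditions that their sum diverges while the sum of their squares converges.

```python
def robbins_monro_mean(samples):
    """Estimate a mean with the update x <- x + a_t * (sample - x), a_t = 1/t."""
    estimate = 0.0
    for t, sample in enumerate(samples, start=1):
        step = 1.0 / t                      # decreasing step size a_t
        estimate += step * (sample - estimate)
    return estimate

running = robbins_monro_mean([2.0, 4.0, 9.0])
```

With this particular step size the iterate after t samples coincides with the running average of those samples, which is why it converges to the true mean for independent, identically distributed data with finite variance.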


2018
Vol 8 (3)
pp. 159-171
Author(s):
Max W. Y. Lam

Abstract: There is a growing interest in applying machine learning algorithms to real-world examples by explicitly deriving models based on probabilistic reasoning. Sports analytics, which is favoured mostly by the statistics community and less discussed in the machine learning community, is our focus in this paper. Specifically, we model two-team sports for the purpose of one-match-ahead forecasting. We present a pioneering modeling approach based on stacked Bayesian regressions, in which the winning probability can be calculated analytically. Because of its regression flexibility and high standard of performance, Sparse Spectrum Gaussian Process Regression (SSGPR), an improved algorithm for the standard Gaussian Process Regression (GPR), was used to solve the Bayesian regression tasks, resulting in a novel predictive model called TLGProb. For evaluation, TLGProb was applied to a popular sports event, the National Basketball Association (NBA). TLGProb correctly predicted 85.28% of the matches in the NBA 2014/2015 regular season, surpassing the existing predictive models for the NBA.
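
One way a winning probability can be obtained in closed form is sketched below. It assumes, purely for illustration, that a regression model outputs an independent Gaussian predictive distribution for each team's score; the abstract does not state that TLGProb is built this way, and the function and parameter names are hypothetical. Under that assumption the score difference is itself Gaussian, so the probability that team A outscores team B is a normal CDF value.

```python
import math

def win_probability(mean_a, std_a, mean_b, std_b):
    """P(score_a > score_b) for independent Gaussian score predictions."""
    diff_mean = mean_a - mean_b
    diff_std = math.sqrt(std_a ** 2 + std_b ** 2)   # variances add for a difference
    # Standard normal CDF evaluated at diff_mean / diff_std, written with erf.
    return 0.5 * (1.0 + math.erf(diff_mean / (diff_std * math.sqrt(2.0))))

even_match = win_probability(100.0, 5.0, 100.0, 5.0)
favourite = win_probability(105.0, 3.0, 100.0, 4.0)
```

Evenly matched predictions give a probability of one half, and the probability moves toward one as the gap in predicted means grows relative to the combined uncertainty.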

