Improving the Efficacy of Deep-Learning Models for Heart Beat Detection on Heterogeneous Datasets

2021 ◽  
Vol 8 (12) ◽  
pp. 193
Author(s):  
Andrea Bizzego ◽  
Giulio Gabrieli ◽  
Michelle Jin Yee Neoh ◽  
Gianluca Esposito

Deep learning (DL) has greatly contributed to bioelectric signal processing, in particular to the extraction of physiological markers. However, the efficacy and applicability of the results proposed in the literature are often constrained to the population represented by the data used to train the models. In this study, we investigate the issues related to applying a DL model on heterogeneous datasets. By focusing on heart beat detection from electrocardiogram (ECG) signals, we show that the performance of a model trained on data from healthy subjects decreases when applied to patients with cardiac conditions and to signals collected with different devices. We then evaluate the use of transfer learning (TL) to adapt the model to the different datasets, and show that the classification performance is improved, even with datasets with a small sample size. These results suggest that a greater effort should be made towards the generalizability of DL models applied on bioelectric signals, in particular by retrieving more representative datasets.

Information ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 234 ◽  
Author(s):  
Sumet Mehta ◽  
Xiangjun Shen ◽  
Jiangping Gou ◽  
Dejiao Niu

The K-nearest neighbour classifier is a very effective and simple non-parametric technique in pattern classification; however, it only considers the distance closeness, but not the geometrical placement of the k neighbors. Also, its classification performance is highly influenced by the neighborhood size k and existing outliers. In this paper, we propose a new local mean based k-harmonic nearest centroid neighbor (LMKHNCN) classifier in order to consider both distance-based proximity and the spatial distribution of the k neighbors. In our method, the k nearest centroid neighbors in each class are found first; these are used to form k different local mean vectors, whose harmonic mean distance to the query sample is then computed. Lastly, the query sample is assigned to the class with minimum harmonic mean distance. The experimental results based on twenty-six real-world datasets show that the proposed LMKHNCN classifier achieves lower error rates, particularly in small sample-size situations, and that it is less sensitive to parameter k when compared to the four related KNN-based classifiers.
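
The steps named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean distance, the greedy centroid-neighbor search, and the small `eps` guard against division by zero are all assumptions made for the example.

```python
import math


def _dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid_neighbors(query, points, k):
    """Greedily pick up to k points whose running centroid stays closest to the query."""
    remaining = list(points)
    chosen = []
    total = [0.0] * len(query)
    for _ in range(min(k, len(remaining))):
        n = len(chosen) + 1
        # Candidate centroid = (sum of already chosen points + candidate) / n.
        best = min(
            remaining,
            key=lambda p: _dist([(t + v) / n for t, v in zip(total, p)], query),
        )
        remaining.remove(best)
        chosen.append(best)
        total = [t + v for t, v in zip(total, best)]
    return chosen


def lmkhncn_predict(query, samples, labels, k=3, eps=1e-12):
    """Assign the query to the class with the smallest harmonic mean distance."""
    best_label, best_score = None, math.inf
    for label in sorted(set(labels)):
        class_points = [s for s, l in zip(samples, labels) if l == label]
        ncn = nearest_centroid_neighbors(query, class_points, k)
        total = [0.0] * len(query)
        inverse_sum = 0.0
        for j, point in enumerate(ncn, start=1):
            # The j-th local mean vector is the mean of the first j centroid neighbors.
            total = [t + v for t, v in zip(total, point)]
            local_mean = [t / j for t in total]
            inverse_sum += 1.0 / (_dist(local_mean, query) + eps)
        harmonic = len(ncn) / inverse_sum
        if harmonic < best_score:
            best_label, best_score = label, harmonic
    return best_label
```

Because the second and later centroid neighbors are chosen by the position of the running centroid rather than by raw distance, points on opposite sides of the query tend to be preferred over a tight one-sided cluster, which is the "geometrical placement" the abstract refers to.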


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Jing Zhang ◽  
Guang Lu ◽  
Jiaquan Li ◽  
Chuanwen Li

Mining useful knowledge from high-dimensional data is a hot research topic. Efficient and effective sample classification and feature selection are challenging tasks due to the high dimensionality and small sample size of microarray data. Feature selection is necessary in the process of constructing the model to reduce time and space consumption. Therefore, a feature selection model based on prior knowledge and rough set is proposed. Pathway knowledge is used to select feature subsets, and rough set based on intersection neighborhood is then used to select important features in each subset, since it can select features without redundancy and deal with numerical features directly. In order to improve the diversity among base classifiers and the efficiency of classification, it is necessary to select a subset of the base classifiers. Classifiers are grouped into several clusters by k-means clustering using the proposed combination distance of Kappa-based diversity and accuracy. The base classifier with the best classification performance in each cluster is selected to generate the final ensemble model. Experimental results on three Arabidopsis thaliana stress response datasets showed that the proposed method achieved better classification performance than existing ensemble models.
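
The abstract does not define the combination distance, so the following sketch covers only the ingredient it names explicitly: a Kappa statistic between two base classifiers. It computes Cohen's kappa from the predictions of two classifiers on the same samples; the function name and the handling of the degenerate single-class case are assumptions for illustration, and how the paper combines this value with accuracy is not reproduced here.

```python
from collections import Counter


def cohen_kappa(pred_a, pred_b):
    """Cohen's kappa between two classifiers' predictions on the same samples."""
    if len(pred_a) != len(pred_b) or not pred_a:
        raise ValueError("prediction lists must be non-empty and equally long")
    n = len(pred_a)
    # Observed agreement: fraction of samples on which both classifiers agree.
    observed = sum(a == b for a, b in zip(pred_a, pred_b)) / n
    # Expected agreement by chance, from each classifier's label frequencies.
    freq_a, freq_b = Counter(pred_a), Counter(pred_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both classifiers always predict the same single class
    return (observed - expected) / (1.0 - expected)
```

A kappa near 1 means two classifiers make nearly the same predictions, so a lower kappa indicates a more diverse pair; grouping similar classifiers and keeping one per group is one way to preserve that diversity.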


2016 ◽  
Vol 2016 ◽  
pp. 1-10
Author(s):  
Zhicheng Lu ◽  
Zhizheng Liang

Linear discriminant analysis has been widely studied in data mining and pattern recognition. However, when performing the eigen-decomposition on the matrix pair (within-class scatter matrix and between-class scatter matrix) in some cases, one can find that there exist some degenerated eigenvalues, thereby resulting in indistinguishability of information from the eigen-subspace corresponding to some degenerated eigenvalue. In order to address this problem, we revisit linear discriminant analysis in this paper and propose a stable and effective algorithm for linear discriminant analysis in terms of an optimization criterion. By discussing the properties of the optimization criterion, we find that the eigenvectors in some eigen-subspaces may be indistinguishable if the degenerated eigenvalue occurs. Inspired by the idea of the maximum margin criterion (MMC), we embed MMC into the eigen-subspace corresponding to the degenerated eigenvalue to exploit the discriminability of the eigenvectors in the eigen-subspace. Since the proposed algorithm can deal with the degenerated case of eigenvalues, it not only handles the small-sample-size problem but also enables us to select projection vectors from the null space of the between-class scatter matrix. Extensive experiments on several face image and microarray data sets are conducted to evaluate the proposed algorithm in terms of the classification performance, and experimental results show that our method has smaller standard deviations than other methods in most cases.
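
For orientation, the quantities this abstract refers to can be written in their standard textbook forms. The notation below (C classes, class means \mu_c, global mean \mu, class sizes n_c, projection matrix W) is assumed for illustration and is not taken from the paper, whose own optimization criterion may differ in normalization or constraints.

```latex
% Within-class and between-class scatter matrices
S_w = \sum_{c=1}^{C} \sum_{x_i \in \mathcal{X}_c} (x_i - \mu_c)(x_i - \mu_c)^{\top},
\qquad
S_b = \sum_{c=1}^{C} n_c \, (\mu_c - \mu)(\mu - \mu_c)^{\top} \cdot (-1)
    = \sum_{c=1}^{C} n_c \, (\mu_c - \mu)(\mu_c - \mu)^{\top}

% LDA as a generalized eigenproblem on the matrix pair (S_b, S_w)
S_b \, w = \lambda \, S_w \, w

% Maximum margin criterion (MMC)
J(W) = \operatorname{tr}\!\left( W^{\top} (S_b - S_w) \, W \right)
```

When several eigenvectors share the same eigenvalue \lambda, the generalized eigenproblem alone cannot rank them, which is the degenerate case the abstract describes; MMC needs no inverse of S_w, so it remains usable when S_w is singular, as in small-sample-size settings.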


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5863 ◽  
Author(s):  
Annica Kristoffersson ◽  
Jiaying Du ◽  
Maria Ehn

Sensor-based fall risk assessment (SFRA) utilizes wearable sensors for monitoring individuals’ motions in fall risk assessment tasks. Previous SFRA reviews recommend methodological improvements to better support the use of SFRA in clinical practice. This systematic review aimed to investigate the existing evidence of SFRA (discriminative capability, classification performance) and methodological factors (study design, samples, sensor features, and model validation) contributing to the risk of bias. The review was conducted according to recommended guidelines and 33 of 389 screened records were eligible for inclusion. Evidence of SFRA was identified: several sensor features and three classification models differed significantly between groups with different fall risk (mostly fallers/non-fallers). Moreover, classification performance corresponding to AUCs of at least 0.74 and/or accuracies of at least 84% was obtained from sensor features in six studies and from classification models in seven studies. Specificity was at least as high as sensitivity among studies reporting both values. Insufficient use of prospective design, small sample size, low in-sample inclusion of participants with elevated fall risk, a large number of features with a low degree of consensus on which to use, and limited use of recommended model validation methods were identified in the included studies. Hence, future SFRA research should further reduce risk of bias by continuously improving methodology.


2020 ◽  
Vol 492 (4) ◽  
pp. 5377-5390 ◽  
Author(s):  
Shengda Luo ◽  
Alex P Leung ◽  
C Y Hui ◽  
K L Li

We have investigated a number of factors that can have significant impacts on the classification performance of gamma-ray sources detected by Fermi Large Area Telescope (LAT) with machine learning techniques. First, we show that a framework of automatic feature selection can construct a simple model with a small set of features that yields better performance than previous results. Secondly, because of the small sample size of the training/test sets of certain gamma-ray source classes, nested re-sampling and cross-validations are suggested for quantifying the statistical fluctuations of the quoted accuracy. We have also constructed a test set by cross-matching the identified active galactic nuclei (AGNs) and the pulsars (PSRs) in the Fermi-LAT 8-yr point source catalogue (4FGL) with those unidentified sources in the previous 3rd Fermi-LAT Source Catalog (3FGL). Using this cross-matched set, we show that some features used for building a classification model with the identified sources can suffer from the problem of covariate shift, which can be a result of various observational effects. This can possibly hamper the actual performance when one applies such a model in classifying unidentified sources. Using our framework, both AGN/PSR and young pulsar (YNG)/millisecond pulsar (MSP) classifiers are automatically updated with the new features and the enlarged training samples in the 4FGL catalogue incorporated. Using a two-layer model with these updated classifiers, we have selected 20 promising MSP candidates with confidence scores > 98 per cent from the unidentified sources in the 4FGL catalogue that can provide inputs for a multiwavelength identification campaign.
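
The idea of nested cross-validation mentioned in this abstract can be illustrated with a small standard-library sketch. Everything here is an assumption made for the example: the helper names, the toy one-dimensional k-nearest-neighbour model, and the fold counts. It is not the authors' pipeline, which works on Fermi-LAT catalogue features with its own models.

```python
import random
from statistics import mean, stdev


def k_folds(indices, k):
    """Split a list of indices into k roughly equal folds."""
    return [indices[i::k] for i in range(k)]


def accuracy(predict, X, y, idx):
    """Fraction of the samples in idx that the predictor labels correctly."""
    return sum(predict(X[i]) == y[i] for i in idx) / len(idx)


def fit_knn(X, y, idx, k):
    """Return a 1-D k-nearest-neighbour predictor trained on the rows in idx."""
    def predict(x):
        nearest = sorted(idx, key=lambda i: abs(X[i] - x))[:k]
        votes = [y[i] for i in nearest]
        return max(set(votes), key=votes.count)
    return predict


def nested_cv(X, y, grid, outer_k=5, inner_k=3, seed=0):
    """Return the mean and standard deviation of the outer-fold accuracies."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    outer = k_folds(indices, outer_k)
    outer_scores = []
    for f, test_idx in enumerate(outer):
        train_idx = [i for g, fold in enumerate(outer) if g != f for i in fold]
        inner = k_folds(train_idx, inner_k)

        def inner_score(k):
            # Inner loop: score a hyperparameter using training data only.
            scores = []
            for h, val_idx in enumerate(inner):
                fit_idx = [i for g, fold in enumerate(inner) if g != h for i in fold]
                scores.append(accuracy(fit_knn(X, y, fit_idx, k), X, y, val_idx))
            return mean(scores)

        best_k = max(grid, key=inner_score)
        # Outer loop: evaluate the tuned model on a fold it has never seen.
        model = fit_knn(X, y, train_idx, best_k)
        outer_scores.append(accuracy(model, X, y, test_idx))
    return mean(outer_scores), stdev(outer_scores)
```

The spread of the outer-fold scores gives a rough measure of how much a quoted accuracy fluctuates with the particular split, which matters most when some classes contain only a handful of samples.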


Author(s):  
Rohit Keshari ◽  
Soumyadeep Ghosh ◽  
Saheb Chhabra ◽  
Mayank Vatsa ◽  
Richa Singh

2021 ◽  
Vol 40 (1) ◽  
pp. 685-702
Author(s):  
Huiru Wang ◽  
Zhijian Zhou

In the Rough margin-based ν-Twin Support Vector Machine (Rν-TSVM) algorithm, rough set theory is introduced. Rν-TSVM gives different penalties to the corresponding misclassified samples according to their positions, so it avoids the overfitting problem to some extent. When the input data is a tensor, Rν-TSVM cannot handle it directly and may not utilize the data information effectively. Therefore, we propose a novel classifier based on tensor data, termed Rough margin-based ν-Twin Support Tensor Machine (Rν-TSTM). Similar to Rν-TSVM, Rν-TSTM constructs a rough lower margin, a rough upper margin and a rough boundary in tensor space. Rν-TSTM not only retains the superiority of Rν-TSVM, but also has its unique advantages. Firstly, the data topology is retained more efficiently by the direct use of tensor representation. Secondly, it has better classification performance compared to other classification algorithms. Thirdly, it can avoid the overfitting problem to a great extent. Lastly, it is more suitable for high dimensional and small sample size problems. To solve the corresponding optimization problem in Rν-TSTM, we adopt the alternating iteration method in which the parameters corresponding to the hyperplanes are estimated by solving a series of Rν-TSVM optimization problems. The efficiency and superiority of the proposed method are demonstrated by computational experiments.


2021 ◽  
pp. 1-12
Author(s):  
Alexander Cohan ◽  
Jake Schuster ◽  
Jose Fernandez

Predicting athlete injury risk has been a holy grail in sports medicine with little progress to date due to a variety of factors such as small sample sizes, significantly imbalanced data, and inadequate statistical approaches. Modeling approaches which are not able to account for the multiple interactions across factors can be misleading. We address the small sample size by collecting longitudinal data of NBA player injuries using publicly available data sources and develop a state-of-the-art deep learning model, METIC, to predict future injuries based on past injuries, game activity, and player statistics. We evaluate model performance using metrics appropriate for imbalanced data and find that METIC performs significantly better than other traditional machine learning approaches. METIC uses feature learning to create interactive features which become meaningful in combination with each other. METIC can be used by practitioners and front offices to improve athlete management and reduce injury incidence, potentially saving sports teams millions in revenue due to reduced athlete injuries.


2021 ◽  
Author(s):  
Jacob Johnson ◽  
Kaneel Senevirathne ◽  
Lawrence Ngo

In this work, we report the results of a deep-learning based liver lesion detection algorithm. While several liver lesion segmentation and classification algorithms have been developed, none of the previous work has focused on detecting suspicious liver lesions. Furthermore, their generalizability remains a pitfall due to their small sample size and sample homogeneity. Here, we developed and validated a highly generalizable deep-learning algorithm for detection of suspicious liver lesions. The algorithm was trained and tested on a diverse dataset containing CT exams from over 2,000 hospital sites in the United States. Our final model achieved an AUROC of 0.84 with a specificity of 0.99 while maintaining a sensitivity of 0.33.

