Training data sets for TensorFlow models from TeleEcho data.

2020 ◽  
Author(s):  
Anil Kumar Bheemaiah

Abstract: Data streams are persisted and visualized for a practice of biofeedback-based therapy, with the option of @edge decision support for premium services, in the form of on-demand telemedical services, CDS-based decision support services, and integrated services such as Amazon Pharmacy.

Keywords: digital medicine, CDS HL7 webhooks, biofeedback, LSL streams, AWS S3, Wolfram Cloud, feature extraction functions, visualization of filters.

What: Extraction of data by data mining from hyperscale tele-echo data repositories, to create training data sets for a specific thread of TensorFlow model templates for transfer learning, with deployment of pre-trained networks using TensorFlow Lite. Pre-trained models are evaluated for prediction accuracy in integrated feature space and classification fitness models, for scalable deployment.

How: We consider the use of TensorFlow models, trained on an EC2 P3 instance using GPU computing on SageMaker, with a dedicated thread for the purpose. We consider the creation of the following: a MUSE 2 headset pipeline for PPG, gyroscope, and accelerometer data for breath and heart diagnostics, built with a Python script and a 1D tensor model. (alexandrebarachant n.d.; "tf.nn.conv1d | TensorFlow Core r2.0" n.d.; "tf.keras.layers.Conv1D | TensorFlow Core r2.0" n.d.; "Tensorflow - Math behind 1D Convolution with Advanced Examples in TF | Tensorflow Tutorial" n.d.; Lee 2018)

Why: Digital medicine is accessible to the mental wellness community through an EEG wearable such as the MUSE 2, whose PPG and accelerometer data can be mined with a 1D-convolution classifier network to detect any anomalies that require telemedicine.
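The 1D-convolution classifier referenced above can be sketched without TensorFlow; the function below is a minimal NumPy stand-in for `tf.nn.conv1d` applied to a synthetic PPG-like trace (the signal, kernel, and stride values are illustrative assumptions, not from the abstract):

```python
import numpy as np

def conv1d(signal, kernel, stride=1):
    """Valid-mode 1D convolution, the core operation behind tf.nn.conv1d."""
    k = len(kernel)
    out_len = (len(signal) - k) // stride + 1
    return np.array([np.dot(signal[i * stride:i * stride + k], kernel)
                     for i in range(out_len)])

# Hypothetical PPG-like trace: a slow oscillation (pulse) plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 256)
ppg = np.sin(t) + 0.05 * rng.standard_normal(256)

# A moving-average kernel stands in for a learned filter.
features = conv1d(ppg, np.ones(8) / 8.0, stride=2)  # shape (125,)
```

A trained Conv1D layer would learn many such kernels in parallel; stacking their outputs yields the feature maps a classification head operates on.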

2021 ◽  
Vol 16 (1) ◽  
pp. 1-24
Author(s):  
Yaojin Lin ◽  
Qinghua Hu ◽  
Jinghua Liu ◽  
Xingquan Zhu ◽  
Xindong Wu

In multi-label learning, label correlations commonly exist in the data. Such correlation not only provides useful information, but also imposes significant challenges for multi-label learning. Recently, label-specific feature embedding has been proposed to explore label-specific features from the training data, and uses features highly customized to the multi-label set for learning. While such feature embedding methods have demonstrated good performance, the creation of the feature embedding space is based on a single label only, without considering label correlations in the data. In this article, we propose to combine multiple label-specific feature spaces, using label correlation, for multi-label learning. The proposed algorithm, multi-label-specific feature space ensemble (MULFE), takes into consideration label-specific features, label correlation, and a weighted ensemble principle to form a learning framework. By conducting clustering analysis on each label's negative and positive instances, MULFE first creates features customized to each label. After that, MULFE utilizes the label correlation to optimize the margin distribution of the base classifiers induced by the related label-specific feature spaces. By combining multiple label-specific features, label-correlation-based weighting, and ensemble learning, MULFE achieves the maximum-margin multi-label classification goal through the underlying optimization framework. Empirical studies on 10 public data sets demonstrate the effectiveness of MULFE.
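A minimal sketch of the first MULFE step, label-specific feature construction. MULFE clusters each label's positive and negative instances into several centroids; the simplification below uses a single centroid per side, and all data and names are hypothetical:

```python
import numpy as np

def label_specific_features(X, y):
    """Represent each instance by its distances to the positive-class and
    negative-class centroids of one label (the paper clusters each side into
    several centroids; one per side keeps the sketch short)."""
    pos_c = X[y == 1].mean(axis=0)
    neg_c = X[y == 0].mean(axis=0)
    return np.stack([np.linalg.norm(X - pos_c, axis=1),
                     np.linalg.norm(X - neg_c, axis=1)], axis=1)

# Toy single-label view of a multi-label problem.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y = np.array([1, 1, 0, 0])
F = label_specific_features(X, y)  # one such space is built per label
```

Repeating this per label and weighting the resulting base classifiers by label correlation gives the ensemble the abstract describes.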


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1573
Author(s):  
Loris Nanni ◽  
Giovanni Minchio ◽  
Sheryl Brahnam ◽  
Gianluca Maguolo ◽  
Alessandra Lumini

Traditionally, classifiers are trained to predict patterns within a feature space. The image classification system presented here trains classifiers to predict patterns within a vector space by combining the dissimilarity spaces generated by a large set of Siamese Neural Networks (SNNs). A set of centroids from the patterns in the training data sets is calculated with supervised k-means clustering. The centroids are used to generate the dissimilarity space via the Siamese networks. The vector space descriptors are extracted by projecting patterns onto the similarity spaces, and SVMs classify an image by its dissimilarity vector. The versatility of the proposed approach in image classification is demonstrated by evaluating the system on different types of images across two domains: two medical data sets and two animal audio data sets with vocalizations represented as images (spectrograms). Results show that the proposed system performs competitively against the best-performing methods in the literature, obtaining state-of-the-art performance on one of the medical data sets, and does so without ad hoc optimization of the clustering methods on the tested data sets.
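The dissimilarity-space construction can be illustrated compactly. The sketch below maps patterns to their distances from a set of centroids; in the paper the dissimilarity comes from trained Siamese networks and the descriptors feed an SVM, while here a plain Euclidean measure stands in (names and data are illustrative):

```python
import numpy as np

def dissimilarity_space(X, centroids, dissim):
    """Project each pattern onto its vector of dissimilarities to the centroids."""
    return np.array([[dissim(x, c) for c in centroids] for x in X])

def euclid(a, b):
    # Stand-in for a learned Siamese dissimilarity d(a, b).
    return float(np.linalg.norm(a - b))

X = np.array([[0.0], [1.0], [10.0]])
centroids = [np.array([0.0]), np.array([10.0])]
D = dissimilarity_space(X, centroids, euclid)  # descriptors an SVM would consume
```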


2020 ◽  
Vol 10 (7) ◽  
pp. 1486-1493
Author(s):  
Jianjun Sun

The rehabilitation of armless or footless patients is of great importance. One choice for this issue is to use an electroencephalograph (EEG) brain-computer interface to help the patients communicate with the outside world. Classifying the EEG signals generated from mental activity is one of the most important enabling technologies. However, existing classification methods often suffer from overfitting, caused by small training data sets combined with a high-dimensional feature space. Fuzzy inference can imitate human judgement, effectively dealing with uncertainty and small-sample learning problems. Besides, biclustering has shown excellent performance in constructing rule bases. This paper proposes a novel biclustering-based fuzzy inference method for EEG classification. It can be divided into five steps. The first step is generating features with the common spatial pattern. The second step is searching for local coherent patterns with column-nearly-constant biclustering. The third step is to transform the patterns into if-then rules with a column-averaging and majority-voting strategy. The fourth step is to employ Mamdani fuzzy inference to map the input feature vector into decimals. Finally, particle swarm optimization is utilized to generate the optimal threshold for linear classification. Experiments on several commonly used data sets show that the proposed method has advantages over competitors in terms of classification accuracy.
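The fourth step, Mamdani inference mapping a feature value to a decimal, can be sketched with two illustrative triangular rules and centroid defuzzification (the rule shapes and the final threshold are assumptions; the paper derives its rules from biclusters and its threshold via particle swarm optimization):

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet at a and c, peak at b."""
    return np.maximum(0.0, np.minimum((x - a) / (b - a), (c - x) / (c - b)))

def mamdani_scalar(x):
    """Map a scalar feature in [0, 1] to a decimal via two illustrative rules:
    IF x is LOW THEN output is near 0.2; IF x is HIGH THEN output is near 0.8."""
    w_low = tri(x, -0.5, 0.0, 0.6)    # firing strength of the LOW rule
    w_high = tri(x, 0.4, 1.0, 1.5)    # firing strength of the HIGH rule
    u = np.linspace(0.0, 1.0, 101)
    agg = np.maximum(np.minimum(w_low, tri(u, 0.0, 0.2, 0.4)),
                     np.minimum(w_high, tri(u, 0.6, 0.8, 1.0)))
    return float((u * agg).sum() / agg.sum())  # centroid defuzzification

decision = mamdani_scalar(0.9) > 0.5  # 0.5 stands in for the PSO-found threshold
```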


Author(s):  
Ruslan Babudzhan ◽  
Konstantyn Isaienkov ◽  
Danilo Krasiy ◽  
Oleksii Vodka ◽  
Ivan Zadorozhny ◽  
...  

The paper investigates the relationship between the vibration acceleration of bearings and their operational state. To determine these dependencies, a test bench was built and 112 experiments were carried out with different bearings: 100 bearings that developed an internal defect during operation and 12 bearings without a defect. From the obtained records, a dataset was formed, which was used to build classifiers. The dataset is freely available. A method for classifying new and used bearings was proposed, which consists in searching for dependencies and regularities of the signal using descriptive functions: statistical, entropy, fractal dimensions, and others. In addition to processing the signal itself, the frequency domain of the bearing operation signal was also used to complement the feature space. The paper considered the possibility of generalizing the classification for its application on those signals that were not obtained in the course of laboratory experiments. An extraneous dataset was found in the public domain. This dataset was used to determine how accurate a classifier was when it was trained and tested on significantly different signals. Training and validation were carried out using the bootstrapping method to mitigate the effect of randomness, given the small amount of training data available. To estimate the quality of the classifiers, the F1-measure was used as the main metric due to the imbalance of the data sets. The following supervised machine learning methods were chosen as classifier models: logistic regression, support vector machine, random forest, and K nearest neighbors. The results are presented in the form of plots of density distribution and diagrams.
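The descriptive functions mentioned (statistical, entropy, and a frequency-domain complement) can be sketched as a feature extractor for one vibration record; the feature subset and bin counts below are illustrative choices, not those of the paper:

```python
import numpy as np

def describe_signal(x):
    """Descriptive features of one vibration record (a subset of those named)."""
    x = np.asarray(x, dtype=float)
    feats = {
        "mean": x.mean(),
        "std": x.std(),
        "rms": np.sqrt((x ** 2).mean()),
        "kurtosis": ((x - x.mean()) ** 4).mean() / x.var() ** 2,
        "crest": np.abs(x).max() / np.sqrt((x ** 2).mean()),
    }
    # Shannon entropy of the amplitude histogram (32 bins is an arbitrary choice).
    counts, _ = np.histogram(x, bins=32)
    p = counts[counts > 0] / counts.sum()
    feats["entropy"] = float(-(p * np.log2(p)).sum())
    # Dominant bin of the real FFT, complementing the time-domain features.
    feats["peak_freq_bin"] = int(np.abs(np.fft.rfft(x - x.mean())).argmax())
    return feats
```

Concatenating such per-record features across all experiments yields the tabular dataset the classifiers are trained on.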


2019 ◽  
Vol 5 ◽  
pp. e242
Author(s):  
Hyukjun Gweon ◽  
Matthias Schonlau ◽  
Stefan H. Steiner

Multi-label classification is a type of supervised learning where an instance may belong to multiple labels simultaneously. Predicting each label independently has been criticized for not exploiting any correlation between labels. In this article we propose a novel approach, Nearest Labelset using Double Distances (NLDD), that predicts the labelset observed in the training data that minimizes a weighted sum of the distances in both the feature space and the label space to the new instance. The weights specify the relative tradeoff between the two distances. The weights are estimated from a binomial regression of the number of misclassified labels as a function of the two distances. Model parameters are estimated by maximum likelihood. NLDD only considers labelsets observed in the training data, thus implicitly taking into account label dependencies. Experiments on benchmark multi-label data sets show that the proposed method on average outperforms other well-known approaches in terms of 0/1 loss and multi-label accuracy, and ranks second on the F-measure (after a method called ECC) and on Hamming loss (after a method called RF-PCT).
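The core NLDD prediction rule can be sketched as follows. The paper estimates the distance weights by binomial regression and obtains label-space estimates from a base classifier; this sketch fixes the weights and substitutes the nearest neighbour's labels (all names and data are hypothetical):

```python
import numpy as np

def nldd_predict(x, X_train, Y_train, w_feat=0.5, w_label=0.5):
    """Return the observed training labelset minimising a weighted sum of
    feature-space and label-space distances to the new instance."""
    d_feat = np.linalg.norm(X_train - x, axis=1)
    # Stand-in label-space estimate: labels of the nearest feature-space
    # neighbour (the paper uses per-label probability estimates instead).
    y_hat = Y_train[d_feat.argmin()].astype(float)
    d_label = np.linalg.norm(Y_train - y_hat, axis=1)
    return Y_train[(w_feat * d_feat + w_label * d_label).argmin()]

X = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
pred = nldd_predict(np.array([0.1, 0.1]), X, Y)
```

Because only observed labelsets are candidates, label dependencies are respected without being modeled explicitly.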


Author(s):  
Gustavo Camps-Valls ◽  
Manel Martínez-Ramón ◽  
José Luis Rojo-Álvarez

Machine learning experienced great advances in the eighties and nineties due to active research in artificial neural networks and adaptive systems. These tools have demonstrated good results in many real applications, since neither a priori knowledge about the distribution of the available data nor the relationships among the independent variables need be assumed. Overfitting due to reduced training data sets is controlled by means of a regularized functional which minimizes the complexity of the machine. Working with high-dimensional input spaces is no longer a problem thanks to the use of kernel methods. Such methods also provide us with new ways to interpret the classification or estimation results. Kernel methods are emerging and innovative techniques that are based on first mapping the data from the original input feature space to a kernel feature space of higher dimensionality, and then solving a linear problem in that space. These methods allow us to geometrically design (and interpret) learning algorithms in the kernel space (which is nonlinearly related to the input space), thus combining statistics and geometry in an effective way. This theoretical elegance is also matched by their practical performance.
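The "map to a kernel space, then solve a linear problem" recipe can be made concrete with kernel ridge regression, a minimal sketch assuming an RBF kernel; the regularization term lam plays the role of the regularized functional controlling machine complexity:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix of the RBF kernel between row-sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ridge_fit(X, y, gamma=1.0, lam=1e-6):
    # The learning problem is linear in the kernel space: one regularized solve.
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```

The nonlinearity lives entirely in the kernel; the solver never touches the high-dimensional feature space explicitly.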


2007 ◽  
Vol 19 (7) ◽  
pp. 1919-1938 ◽  
Author(s):  
Jooyoung Park ◽  
Daesung Kang ◽  
Jongho Kim ◽  
James T. Kwok ◽  
Ivor W. Tsang

The support vector data description (SVDD) is one of the best-known one-class support vector learning methods, in which one uses balls defined in the feature space to distinguish a set of normal data from all other possible abnormal objects. The major concern of this letter is to extend the main idea of SVDD to pattern denoising. Combining the geodesic projection to the spherical decision boundary resulting from the SVDD with a solution of the preimage problem, we propose a new method for pattern denoising. We first solve the SVDD for the training data, and then, for each noisy test pattern, obtain its denoised feature by moving its feature vector along the geodesic on the manifold to the nearest decision boundary of the SVDD ball. Finally, we find the location of the denoised pattern by obtaining the preimage of the denoised feature. The applicability of the proposed method is illustrated by a number of toy and real-world data sets.
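The SVDD ball can be approximated, for illustration only, by centering it at the kernel-space mean of the training data rather than solving the SVDD quadratic program; the kernel-trick expansion of the squared feature-space distance is what makes this computable without ever forming feature vectors:

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    return np.exp(-gamma * ((a - b) ** 2).sum())

def kernel_ball(X, gamma=0.5):
    """Squared feature-space distance to the kernel-space mean of X:
    ||phi(z) - c||^2 = k(z,z) - (2/n) sum_i k(z,x_i) + (1/n^2) sum_ij k(x_i,x_j).
    (A simplification: SVDD proper finds the minimal enclosing ball via QP.)"""
    kxx = np.mean([[rbf(xi, xj, gamma) for xj in X] for xi in X])
    def sq_dist(z):
        return rbf(z, z, gamma) - 2.0 * np.mean([rbf(z, x, gamma) for x in X]) + kxx
    return sq_dist
```

Patterns whose distance exceeds the ball radius are the ones the letter's method projects back to the boundary before solving the preimage problem.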


Author(s):  
YUE JIANG ◽  
BOJAN CUKIC ◽  
TIM MENZIES ◽  
JIE LIN

The identification of fault-prone modules has a significant impact on software quality assurance. In addition to prediction accuracy, one of the most important goals is to detect fault-prone modules as early as possible in the development lifecycle. Requirements, design, and code metrics have been successfully used for predicting fault-prone modules. In this paper, we investigate the benefits of the incremental development of software fault prediction models. We compare the performance of these models as the volume of data and their life-cycle origin (design, code, or their combination) evolve during project development. We analyze 14 data sets from publicly available software engineering data repositories. These data sets offer both design and code metrics. Using a number of modeling techniques and statistical significance tests, we confirm that increasing the volume of training data improves model performance. Further, models built from code metrics typically outperform those built using design metrics only. However, both types of models prove to be useful, as they can be constructed in different phases of the life cycle. Code-based models can be used to increase the effectiveness of assigning verification and validation activities late in the development life cycle. We also conclude that models that utilize a combination of design- and code-level metrics outperform models which use either metric set exclusively.


2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Jiangyuan Mei ◽  
Jian Hou ◽  
Jicheng Chen ◽  
Hamid Reza Karimi

Large data sets classification is widely used in many industrial applications. It is a challenging task to classify large data sets efficiently, accurately, and robustly, as large data sets always contain numerous instances with a high-dimensional feature space. In order to deal with this problem, in this paper we present an online Logdet divergence based metric learning (LDML) model by making use of the power of metric learning. We first generate a Mahalanobis matrix by learning the training data with the LDML model. Meanwhile, we propose a compressed representation for the high-dimensional Mahalanobis matrix to reduce the computation complexity in each iteration. The final Mahalanobis matrix obtained this way measures the distances between instances accurately and serves as the basis of classifiers, for example, the k-nearest neighbors classifier. Experiments on benchmark data sets demonstrate that the proposed algorithm compares favorably with the state-of-the-art methods.
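Once a Mahalanobis matrix M is available (however learned), it plugs into a k-nearest-neighbors classifier as below; this sketch takes M as given rather than running the Logdet-divergence updates of the paper:

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance induced by the (positive-definite) matrix M."""
    d = x - y
    return float(d @ M @ d)

def knn_predict(x, X_train, y_train, M, k=3):
    """Majority vote among the k training instances nearest to x under M."""
    d = np.array([mahalanobis_sq(x, xi, M) for xi in X_train])
    nearest = d.argsort()[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[counts.argmax()]
```

With M equal to the identity this reduces to ordinary Euclidean k-NN; the learned M reweights and correlates feature dimensions so that the same classifier separates instances more accurately.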


AI Magazine ◽  
2019 ◽  
Vol 40 (3) ◽  
pp. 41-57
Author(s):  
Manisha Mishra ◽  
Pujitha Mannaru ◽  
David Sidoti ◽  
Adam Bienkowski ◽  
Lingyi Zhang ◽  
...  

A synergy between AI and the Internet of Things (IoT) will significantly improve sense-making, situational awareness, proactivity, and collaboration. However, the key challenge is to identify the underlying context within which humans interact with smart machines. Knowledge of the context facilitates proactive allocation among members of a human–smart machine (agent) collective that balances autonomy with human interaction, without displacing humans from their supervisory role of ensuring that the system goals are achievable. In this article, we address four research questions as a means of advancing toward proactive autonomy: how to represent the interdependencies among the key elements of a hybrid team; how to rapidly identify and characterize critical contextual elements that require adaptation over time; how to allocate system tasks among machines and agents for superior performance; and how to enhance the performance of machine counterparts to provide intelligent and proactive courses of action while considering the cognitive states of human operators. The answers to these four questions help us to illustrate the integration of AI and IoT applied to the maritime domain, where we define context as an evolving multidimensional feature space for heterogeneous search, routing, and resource allocation in uncertain environments via proactive decision support systems.

