On data collection time by an electronic nose

Author(s):  
Piotr Borowik ◽  
Leszek Adamowicz ◽  
Rafał Tarakowski ◽  
Krzysztof Siwek ◽  
Tomasz Grzywacz

We use electronic nose data from odor measurements to build machine learning classification models. The analysis focuses on determining the measurement time that gives the best model performance. We observe that the most valuable information for classification is available in data collected at the beginning of the adsorption phase and the beginning of the desorption phase of the measurement. We demonstrate that complex features extracted from the sensors’ responses give better classification performance than using only the raw sensor response values, normalized by baseline, as features. We use a group shuffling cross-validation approach to determine the reported models’ average accuracy and standard deviation.
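The group shuffling cross-validation mentioned above can be illustrated with a minimal sketch. It assumes that several measurements share a group label (for example, the physical sample they came from) and that a whole group must land on one side of each split. The function name `group_shuffle_split`, its parameters, and the toy group labels are hypothetical and not taken from the paper.

```python
import random

def group_shuffle_split(groups, n_splits=5, test_fraction=0.25, seed=0):
    """Yield (train_idx, test_idx) pairs; every group lands wholly on one side."""
    rng = random.Random(seed)
    unique_groups = sorted(set(groups))
    # Hold out at least one group per split.
    n_test = max(1, round(test_fraction * len(unique_groups)))
    for _ in range(n_splits):
        test_groups = set(rng.sample(unique_groups, n_test))
        train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
        test_idx = [i for i, g in enumerate(groups) if g in test_groups]
        yield train_idx, test_idx

# Hypothetical data: 8 measurements taken from 4 samples (groups).
groups = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
splits = list(group_shuffle_split(groups, n_splits=3))
```

A model would be fitted on `train_idx` and scored on `test_idx` for each split, and the mean and standard deviation of those scores would then be reported.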

Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5868
Author(s):  
Piotr Borowik ◽  
Leszek Adamowicz ◽  
Rafał Tarakowski ◽  
Przemysław Wacławik ◽  
Tomasz Oszako ◽  
...  

Electronic noses can be applied as a rapid, cost-effective option for several applications. This paper presents the results of measurements of samples of two pathogenic fungi, Fusarium oxysporum and Rhizoctonia solani, performed using two constructions of a low-cost electronic nose. The first electronic nose used six non-specific Figaro Inc. metal oxide gas sensors. The second one used ten sensors from only two models (TGS 2602 and TGS 2603) operating at different heater voltages. Sets of features describing the shapes of the measurement curves of the sensors’ responses when exposed to the odours were extracted. Machine learning classification models using the logistic regression method were created. We demonstrated the possibility of applying the low-cost electronic nose data to differentiate between the two studied species of fungi with acceptable accuracy. Improved classification performance could be obtained, mainly for measurements using TGS 2603 sensors operating at different voltage conditions.


Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5672
Author(s):  
Vahid Tavakkoli ◽  
Kabeh Mohsenzadegan ◽  
Kyandoghere Kyamakya

The core objective of this paper is to develop and validate a comprehensive visual sensing concept for robustly classifying house types. Previous studies show that this type of classification is difficult, and most classifier models in the related literature have shown relatively low performance. To find a suitable model, several similar classification models based on convolutional neural networks were explored. We found that extracting better and more complex features results in a significant improvement in accuracy-related performance. A new model that takes this finding into consideration was therefore developed, tested and validated. The model is benchmarked against selected state-of-the-art classification models relevant to house classification. The test results obtained in this benchmarking demonstrate the effectiveness and superiority of the developed deep-learning model. Overall, our model reaches classification performance figures (accuracy, precision, etc.) that are at least 8% higher than those of the previous state-of-the-art methods included in the benchmarking, which is a very significant margin in ranges above 90%.


Symmetry ◽  
2019 ◽  
Vol 12 (1) ◽  
pp. 8
Author(s):  
Jing Chen ◽  
Jun Feng ◽  
Xia Sun ◽  
Yang Liu

Sentiment classification of forum posts in massive open online courses is essential for educators to make interventions and for instructors to improve learning performance. A lack of monitoring of learners’ sentiments may lead to high course dropout rates. Recently, deep learning has emerged as an outstanding machine learning technique for sentiment classification, which extracts complex features automatically with rich representation capabilities. However, deep neural networks always rely on a large amount of labeled data for supervised training. Constructing large-scale labeled training datasets for sentiment classification is very laborious and time consuming. To address this problem, this paper proposes a co-training, semi-supervised deep learning model for sentiment classification, leveraging limited labeled data and massive unlabeled data simultaneously to achieve performance comparable to that of methods trained on massive labeled data. To satisfy the two-view condition of co-training, we encoded texts into vectors independently from the views of word embedding and character-based embedding, considering words’ external and internal information. To improve classification performance with limited data, we propose a double-check sample selection strategy that selects samples with high confidence to augment the training set iteratively. In addition, we propose a mixed loss function both considering the labeled data with asymmetric and unlabeled data. Our proposed method achieved an 89.73% average accuracy and a 93.55% average F1-score, about 2.77% and 3.2% higher than baseline methods. Experimental results demonstrate the effectiveness of the proposed model trained on limited labeled data, which performs much better than those trained on massive labeled data.
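The abstract does not spell out the double-check selection rule, so the sketch below shows only one plausible reading of it. An unlabeled sample is kept when the word-view and character-view classifiers predict the same label and both are confident. The function `select_confident`, the 0.9 threshold, and the probability values are all assumptions made for illustration.

```python
def select_confident(probs_word, probs_char, threshold=0.9):
    """Return (index, label) pairs where both views agree with high confidence."""
    selected = []
    for i, (pw, pc) in enumerate(zip(probs_word, probs_char)):
        # Predicted label of each view = index of its largest probability.
        label_w = max(range(len(pw)), key=pw.__getitem__)
        label_c = max(range(len(pc)), key=pc.__getitem__)
        if label_w == label_c and min(pw[label_w], pc[label_c]) >= threshold:
            selected.append((i, label_w))
    return selected

# Hypothetical class probabilities for four unlabeled posts, one list per view.
probs_word = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92], [0.97, 0.03]]
probs_char = [[0.93, 0.07], [0.30, 0.70], [0.05, 0.95], [0.55, 0.45]]
pseudo_labeled = select_confident(probs_word, probs_char)
```

In a co-training loop, the selected samples and their pseudo-labels would be added to the training set before the next round of training.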


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Willem B. Bruin ◽  
Luke Taylor ◽  
Rajat M. Thomas ◽  
Jonathan P. Shock ◽  
...  

No diagnostic biomarkers are available for obsessive-compulsive disorder (OCD). Here, we aimed to identify magnetic resonance imaging (MRI) biomarkers for OCD, using 46 data sets with 2304 OCD patients and 2068 healthy controls from the ENIGMA consortium. We performed machine learning analysis of regional measures of cortical thickness, surface area and subcortical volume and tested classification performance using cross-validation. Classification performance for OCD vs. controls using the complete sample with different classifiers and cross-validation strategies was poor. When models were validated on data from other sites, model performance did not exceed chance-level. In contrast, fair classification performance was achieved when patients were grouped according to their medication status. These results indicate that medication use is associated with substantial differences in brain anatomy that are widely distributed, and indicate that clinical heterogeneity contributes to the poor performance of structural MRI as a disease marker.
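Validation on data from other sites is commonly done with a leave-one-site-out scheme, sketched below. Whether the study used exactly this scheme is an assumption, and the function `leave_one_site_out` and the site labels are hypothetical.

```python
def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx), holding out one site at a time."""
    for held_out in sorted(set(sites)):
        train_idx = [i for i, s in enumerate(sites) if s != held_out]
        test_idx = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical site label for each of six subjects.
sites = ["A", "A", "B", "C", "C", "C"]
folds = list(leave_one_site_out(sites))
```

Because no subject from the held-out site is seen during training, the score reflects how well a model transfers to an unseen scanner and population.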


Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6823
Author(s):  
Arijit Das ◽  
Indrajit Saha ◽  
Rafał Scherer

In recent years, hyperspectral images (HSIs) have attracted considerable attention in computer vision (CV) due to their wide utility in remote sensing. Unlike images with three or fewer channels, HSIs have a large number of spectral bands. Recent works demonstrate the use of modern deep learning based CV techniques such as convolutional neural networks (CNNs) for analyzing HSIs. CNNs have receptive fields (RFs) with learnable weights, which are trained to extract useful features from images. In this work, a novel multi-receptive CNN module called GhoMR is proposed for HSI classification. GhoMR utilizes blocks containing several RFs, extracting features in a residual fashion. Each RF extracts features which are used by other RFs to extract more complex features in a hierarchical manner. However, the higher the number of RFs, the greater the number of associated weights, and thus the heavier the network. Most complex architectures suffer from this shortcoming. To tackle this, the recently proposed Ghost module is used as the basic building unit. Ghost modules address the feature redundancy in CNNs by extracting only limited features and performing cheap transformations on them, thus reducing the overall number of parameters in the network. To test the discriminative potential of GhoMR, a simple network called GhoMR-Net is constructed using GhoMR modules, and experiments are performed on three public HSI data sets: Indian Pines, University of Pavia, and Salinas Scene. The classification performance is measured using three metrics: overall accuracy (OA), Kappa coefficient (Kappa), and average accuracy (AA). Comparisons with ten state-of-the-art architectures are shown to further demonstrate the effectiveness of the method. Although lightweight, the proposed GhoMR-Net provides comparable or better performance than other networks. The PyTorch code for this study is made available at the iamarijit/GhoMR GitHub repository.
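The three metrics named above have standard definitions in terms of a confusion matrix, sketched below in plain Python. The function name `hsi_metrics` and the toy matrix are hypothetical, and the sketch assumes every class has at least one sample.

```python
def hsi_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total
    # Average accuracy: mean of the per-class recalls.
    aa = sum(confusion[i][i] / sum(confusion[i]) for i in range(n)) / n
    # Cohen's kappa: agreement corrected for chance agreement.
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(sum(confusion[k]) * col_sums[k] for k in range(n)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return oa, aa, kappa

# Hypothetical 3-class confusion matrix.
confusion = [[50, 0, 0], [5, 40, 5], [0, 10, 40]]
oa, aa, kappa = hsi_metrics(confusion)
```

OA is dominated by the large classes, while AA weights every class equally, which is why HSI papers usually report both alongside Kappa.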


2008 ◽  
Vol 2008 ◽  
pp. 1-7 ◽  
Author(s):  
Joerg D. Wichard ◽  
Henning Cammann ◽  
Carsten Stephan ◽  
Thomas Tolxdorff

We investigate the performance of different classification models and their ability to recognize prostate cancer at an early stage. We build ensembles of classification models in order to increase the classification performance. We measure the performance of our models in an extensive cross-validation procedure and compare different classification models. The datasets come from clinical examinations, and some of the classification models are already in use to support urologists in their clinical work.
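The abstract does not say how the ensemble members are combined. One common option, shown here purely as an assumption, is to average the members' predicted probabilities and threshold the result. The function `ensemble_probability` and the numbers are hypothetical.

```python
from statistics import mean

def ensemble_probability(member_probs):
    """Average per-sample positive-class probabilities across ensemble members."""
    return [mean(sample) for sample in zip(*member_probs)]

# Hypothetical outputs of three models for three patients.
member_probs = [
    [0.9, 0.2, 0.6],
    [0.8, 0.4, 0.3],
    [0.7, 0.3, 0.9],
]
avg = ensemble_probability(member_probs)
decisions = [p >= 0.5 for p in avg]
```

Averaging reduces the variance of the individual models, so an ensemble can be more accurate than its members when their errors are not strongly correlated.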


2019 ◽  
Author(s):  
Willem B. Bruin ◽  
Luke Taylor ◽  
Rajat M. Thomas ◽  
Jonathan P Shock ◽  
Paul Zhutovsky ◽  
...  

Objective: No diagnostic biomarkers are available for obsessive-compulsive disorder (OCD). Magnetic resonance imaging (MRI) studies have provided evidence for structural abnormalities in distinct brain regions, but effect sizes are small and have limited clinical relevance. To investigate whether individual patients can be distinguished from healthy controls, we performed multivariate analysis of structural neuroimaging data from the ENIGMA-OCD consortium. Method: We included 46 data sets with neuroimaging and clinical data from adult (≥18 years) and pediatric (<18 years) samples. T1 images from 2,304 OCD patients and 2,068 healthy controls were analyzed using standardized processing to extract regional measures of cortical thickness, surface area and subcortical volume. Machine learning classification performance was tested using cross-validation, and possible effects of clinical variables were investigated by stratification. Results: Classification performance for OCD versus controls using the complete sample with different classifiers and cross-validation strategies was poor (AUC 0.57 (standard deviation (SD) = 0.02; Pcorr = 0.19) to 0.62 (SD = 0.03; Pcorr < .001)). When models were validated on completely new data from other sites, model performance did not exceed chance-level (AUC 0.51 (SD = 0.11; Pcorr > .99) to 0.54 (SD = 0.08; Pcorr > .99)). In contrast, good classification performance (>0.8 AUC) was achieved within subgroups of patients split according to their medication status. Conclusions: Parcellated structural MRI data do not enable good distinction between patients with OCD and controls. However, classifying subgroups of patients based on medication status enables good identification at the individual subject level. This underlines the need for longitudinal studies on the short- and long-term effects of medication on brain structure.
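The AUC values reported here can be read as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control, with 0.5 corresponding to chance. A minimal sketch of that pairwise definition follows. The function `roc_auc` and the labels and scores are hypothetical and not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = patient, 0 = control) and classifier scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.4]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration. Rank-based implementations are preferable for data sets of the size described above.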


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Annachiara Tinivella ◽  
Luca Pinzi ◽  
Giulio Rastelli

The development of selective inhibitors of the clinically relevant human Carbonic Anhydrase (hCA) isoforms IX and XII has become a major topic in drug research, due to their deregulation in several types of cancer. Indeed, the selective inhibition of these two isoforms, especially with respect to the homeostatic isoform II, holds great promise to develop anticancer drugs with limited side effects. Therefore, the development of in silico models able to predict the activity and selectivity against the desired isoform(s) is of central interest. In this work, we have developed a series of machine learning classification models, trained on high confidence data extracted from ChEMBL, able to predict the activity and selectivity profiles of ligands for human Carbonic Anhydrase isoforms II, IX and XII. The training datasets were built with a procedure that made use of flexible bioactivity thresholds to obtain well-balanced active and inactive classes. We used multiple algorithms and sampling sizes to finally select activity models able to classify active or inactive molecules with excellent performance. Remarkably, the results herein reported turned out to be better than those obtained by models built with the classic approach of selecting an a priori activity threshold. The sequential application of such validated models enables virtual screening to be performed in a fast and more reliable way to predict the activity and selectivity profiles against the investigated isoforms.
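The abstract does not describe how the flexible thresholds were chosen. The sketch below illustrates the general idea only: among several candidate activity cut-offs, pick the one that leaves the active and inactive classes closest in size. The function `pick_balanced_threshold`, the candidate grid, and the activity values are hypothetical and are not the authors' procedure.

```python
def pick_balanced_threshold(activities, candidates):
    """Return the candidate threshold whose active/inactive counts differ least."""
    def imbalance(threshold):
        active = sum(a >= threshold for a in activities)
        inactive = len(activities) - active
        return abs(active - inactive)
    return min(candidates, key=imbalance)

# Hypothetical activity values (e.g. on a pKi-like scale) and candidate cut-offs.
activities = [5.1, 5.8, 6.2, 6.9, 7.4, 7.7, 8.0, 8.6]
candidates = [6.0, 6.5, 7.0, 7.5, 8.0]
threshold = pick_balanced_threshold(activities, candidates)
```

With a fixed a priori cut-off such as 6.0, this toy set would contain six actives and two inactives. The data-driven choice yields four of each.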

