Sensory Descriptor Analysis of Whisky Lexicons through the Use of Deep Learning

This paper is concerned with extracting relevant terms from a text corpus on whisk(e)y. “Relevant” terms are usually contextually defined in their domain of use. Arguably, every domain has a specialized vocabulary used for describing things. For example, the field of Sensory Science, a sub-field of Food Science, investigates human responses to food products and differentiates “descriptive” terms for flavors from “ordinary”, non-descriptive language. Within the field, descriptors are generated through Descriptive Analysis, a method wherein a human panel of experts tastes multiple food products and defines descriptors. This process is both time-consuming and expensive. However, one could leverage existing data to identify and build a flavor language automatically. For example, there are thousands of professional and semi-professional reviews of whisk(e)y published on the internet, providing abundant descriptors interspersed with non-descriptive language. The aim, then, is to be able to automatically identify descriptive terms in unstructured reviews for later use in product flavor characterization. We created two systems to perform this task. The first is an interactive visual tool that can be used to tag examples of descriptive terms from thousands of whisky reviews. This creates a training dataset that we use to perform transfer learning using GloVe word embeddings and a Long Short-Term Memory deep learning model architecture. The result is a model that can accurately identify descriptors within a corpus of whisky review texts with a train/test accuracy of 99% and precision, recall, and F1-scores of 0.99. We tested for overfitting by comparing the training and validation loss for divergence. Our results show that the language structure for descriptive terms can be programmatically learned.

Improving Sentiment Analysis using Hybrid Deep Learning Model

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190328200012 ◽

2020 ◽

Vol 13 (4) ◽

pp. 627-640 ◽

Cited By ~ 1

Author(s):

Avinash Chandra Pandey ◽

Dharmveer Singh Rajpoot

Keyword(s):

Neural Network ◽

Deep Learning ◽

Sentiment Analysis ◽

Classification Accuracy ◽

Short Term Memory ◽

Computational Cost ◽

Extraction Process ◽

Learning Model ◽

Sentiment Classification ◽

Deep Learning Model

Background: Sentiment analysis is the contextual mining of text to determine users' viewpoints on sentiment-bearing topics commonly found on social networking websites. Twitter is one such site, where people express their opinions on any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. Manual feature extraction is a complicated task since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representation competency than the traditional methods. Objective: The main aim of this paper is to enhance sentiment classification accuracy and to reduce computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory neural network is introduced. Results: The proposed sentiment classification method achieves the highest accuracy on most of the datasets. Further, statistical analysis validates the efficacy of the proposed method. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can be further enhanced by tuning the hyperparameters of deep learning models.

Application of Rough and Fuzzy Set Theory for Prediction of Stochastic Wind Speed Data Using Long Short-Term Memory

Atmosphere ◽

10.3390/atmos12070924 ◽

2021 ◽

Vol 12 (7) ◽

pp. 924

Author(s):

Moslem Imani ◽

Hoda Fakour ◽

Wen-Hau Lan ◽

Huan-Chin Kao ◽

Chi Ming Lee ◽

...

Keyword(s):

Deep Learning ◽

Wind Speed ◽

Set Theory ◽

Rough Set ◽

Fuzzy Set Theory ◽

Short Term Memory ◽

Learning Model ◽

Short Term ◽

Long Short Term Memory ◽

Deep Learning Model

Despite the great significance of precisely forecasting wind speed for the development of new and clean energy technology and stable grid operators, the stochasticity of wind speed makes prediction a complex and challenging task. Accurate short-term wind power forecasting is crucial for improving the security and economic performance of power grids. In this paper, a deep learning model (Long Short-Term Memory, LSTM) is proposed for wind speed prediction. Because wind speed time series are nonlinear and stochastic, the mutual information (MI) approach was used to find the best subset of the data by maximizing the joint MI between the subset and the target output. To enhance accuracy and reduce input characteristics and data uncertainties, rough set and interval type-2 fuzzy set theory are combined in the proposed deep learning model. Wind speed data from an international airport station in Bandar-Abbas City, on the southern coast of Iran, was used as the original input dataset for the optimized deep learning model. Based on the statistical results, the rough set LSTM (RST-LSTM) model showed better prediction accuracy than the fuzzy and original LSTM models, as well as traditional neural networks, with the lowest error for training and testing datasets over different time horizons. The suggested model can support the optimization of the control approach and the smooth operation of the power system. The results confirm the superior capabilities of deep learning techniques for wind speed forecasting, which could also inspire new applications in meteorological assessment.
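The MI-based subset selection mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the inputs have already been discretized (continuous wind speeds would need binning first), uses a simple plug-in MI estimate, and picks the subset greedily. The function names `mutual_information` and `greedy_select` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def greedy_select(features, target, k):
    """Greedily pick k feature names whose joint MI with the target is largest."""
    chosen = []
    for _ in range(k):
        def joint_mi(name):
            # Treat the candidate subset as one tuple-valued variable.
            cols = [features[f] for f in chosen + [name]]
            return mutual_information(list(zip(*cols)), target)

        best = max((f for f in features if f not in chosen), key=joint_mi)
        chosen.append(best)
    return chosen


# Toy, made-up data: the target depends on "a" and "b" only; "c" is irrelevant.
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "c": [0, 0, 0, 0, 1, 1, 1, 1],
}
target = [a + 2 * b for a, b in zip(features["a"], features["b"])]
selected = greedy_select(features, target, k=2)
print(selected)
```

On this toy data the irrelevant input "c" carries zero MI with the target, so the greedy search keeps "a" and "b". Greedy search is a common approximation; an exhaustive search over subsets would be needed to guarantee the maximum joint MI.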

A deep learning based approach for prediction of Chlamydomonas reinhardtii phosphorylation sites

Scientific Reports ◽

10.1038/s41598-021-91840-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Niraj Thapa ◽

Meenal Chaudhari ◽

Anthony A. Iannetta ◽

Clarence White ◽

Kaushik Roy ◽

...

Keyword(s):

Deep Learning ◽

Chlamydomonas Reinhardtii ◽

Protein Phosphorylation ◽

Short Term Memory ◽

Phosphorylation Site ◽

Specific Protein ◽

Training Dataset ◽

Phosphorylation Sites ◽

Site Prediction ◽

Model Combining

Abstract Protein phosphorylation, which is one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas reinhardtii, a model algal phototroph. An ensemble model combining convolutional neural networks and long short-term memory (LSTM) achieves the best performance in predicting phosphorylation sites in C. reinhardtii. Dubbed Chlamy-EnPhosSite, the model's best measured AUC and MCC are 0.90 and 0.64, respectively, for a combined dataset of serine (S) and threonine (T) in independent testing, higher than those measures for other predictors. When applied to the entire C. reinhardtii proteome (totaling 1,809,304 S and T sites), Chlamy-EnPhosSite yielded 499,411 phosphorylated sites with a cut-off value of 0.5 and 237,949 phosphorylated sites with a cut-off value of 0.7. These predictions were compared to an experimental dataset of phosphosites identified by liquid chromatography-tandem mass spectrometry (LC–MS/MS) in a blinded study, and approximately 89.69% of 2,663 C. reinhardtii S and T phosphorylation sites were successfully predicted by Chlamy-EnPhosSite at a probability cut-off of 0.5 and 76.83% of sites were successfully identified at a more stringent 0.7 cut-off. Interestingly, Chlamy-EnPhosSite also successfully predicted experimentally confirmed phosphorylation sites in a protein sequence (e.g., RPS6 S245) which did not appear in the training dataset, highlighting prediction accuracy and the power of leveraging predictions to identify biologically relevant PTM sites. These results demonstrate that our method represents a robust and complementary technique for high-throughput phosphorylation site prediction in C. reinhardtii. It has potential to serve as a useful tool to the community. Chlamy-EnPhosSite will contribute to the understanding of how protein phosphorylation influences various biological processes in this important model microalga.
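The effect of the probability cut-off can be sketched as follows. The helper `call_sites`, the toy scores, and the choice of `>=` rather than `>` at the cut-off are assumptions for illustration only; the proteome-wide counts are the ones quoted in the abstract.

```python
def call_sites(probabilities, cutoff):
    """Return indices of candidate sites whose predicted probability meets the cut-off."""
    return [i for i, p in enumerate(probabilities) if p >= cutoff]


# Made-up scores for six hypothetical S/T sites.
scores = [0.12, 0.55, 0.71, 0.49, 0.93, 0.68]
lenient = call_sites(scores, 0.5)  # more sites called
strict = call_sites(scores, 0.7)   # fewer, higher-confidence sites

# Share of the proteome called phosphorylated, from the counts in the abstract.
TOTAL_SITES = 1_809_304
share_at_05 = 100 * 499_411 / TOTAL_SITES
share_at_07 = 100 * 237_949 / TOTAL_SITES
print(f"cut-off 0.5: {share_at_05:.1f}% of sites; cut-off 0.7: {share_at_07:.1f}% of sites")
```

Raising the cut-off from 0.5 to 0.7 roughly halves the share of sites called phosphorylated (about 27.6% versus 13.2%), which matches the drop in recovered experimental sites reported in the abstract.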

PlncRNA-HDeep: plant long noncoding RNA prediction using hybrid deep learning based on two encoding styles

BMC Bioinformatics ◽

10.1186/s12859-020-03870-2 ◽

2021 ◽

Vol 22 (S3) ◽

Author(s):

Jun Meng ◽

Qiang Kang ◽

Zheng Chang ◽

Yushi Luan

Keyword(s):

Deep Learning ◽

Noncoding Rna ◽

Nearest Neighbor ◽

Short Term Memory ◽

Biological Activities ◽

Support Vector ◽

Multiple Perspectives ◽

K Nearest Neighbor ◽

Rna Sequences ◽

Deep Learning Model

Abstract Background Long noncoding RNAs (lncRNAs) play an important role in regulating biological activities and their prediction is significant for exploring biological processes. Long short-term memory (LSTM) and convolutional neural network (CNN) can automatically extract and learn the abstract information from the encoded RNA sequences to avoid complex feature engineering. An ensemble model learns the information from multiple perspectives and shows better performance than a single model. It is feasible and interesting to treat the RNA sequence as a sentence and as an image to train LSTM and CNN, respectively, and then to hybridize the trained models to predict lncRNAs. To date, there are various predictors for lncRNAs, but few of them have been proposed for plants. A reliable and powerful predictor for plant lncRNAs is necessary. Results To boost the performance of predicting lncRNAs, this paper proposes a hybrid deep learning model based on two encoding styles (PlncRNA-HDeep), which does not require prior knowledge and only uses RNA sequences to train the models for predicting plant lncRNAs. It not only learns the diversified information from RNA sequences encoded by p-nucleotide and one-hot encodings, but also takes advantage of lncRNA-LSTM, proposed in our previous study, and CNN. The parameters are adjusted and three hybrid strategies are tested to maximize its performance. Experimental results show that PlncRNA-HDeep is more effective than lncRNA-LSTM and CNN and obtains 97.9% sensitivity, 95.1% precision, 96.5% accuracy and 96.5% F1 score on the Zea mays dataset, which are better than those of several shallow machine learning methods (support vector machine, random forest, k-nearest neighbor, decision tree, naive Bayes and logistic regression) and some existing tools (CNCI, PLEK, CPC2, LncADeep and lncRNAnet). Conclusions PlncRNA-HDeep is feasible and obtains credible predictive results. It may also provide valuable references for other related research.
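As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall (sensitivity). The sketch below recomputes it from the two rounded values quoted in the abstract; the function name `f1_score` is a local helper, not a library call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract for the Zea mays dataset.
reported_precision = 0.951
reported_sensitivity = 0.979  # sensitivity is another name for recall

f1 = f1_score(reported_precision, reported_sensitivity)
print(f"F1 = {100 * f1:.1f}%")
```

The recomputed value rounds to 96.5%, in agreement with the F1 score stated in the abstract.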

Machine learning applied on chest x-ray can aid in the diagnosis of COVID-19: a first experience from Lombardy, Italy

European Radiology Experimental ◽

10.1186/s41747-020-00203-z ◽

2021 ◽

Vol 5 (1) ◽

Cited By ~ 1

Author(s):

Isabella Castiglioni ◽

Davide Ippolito ◽

Matteo Interlenghi ◽

Caterina Beatrice Monti ◽

Christian Salvatore ◽

...

Keyword(s):

Deep Learning ◽

Area Under The Curve ◽

Clinical Information ◽

Training Dataset ◽

Independent Dataset ◽

X Ray ◽

Learning Classifier ◽

Chest X Ray ◽

Polymerase Chain ◽

Deep Learning Model

Abstract Background We aimed to train and test a deep learning classifier to support the diagnosis of coronavirus disease 2019 (COVID-19) using chest x-ray (CXR) on a cohort of subjects from two hospitals in Lombardy, Italy. Methods We used for training and validation an ensemble of ten convolutional neural networks (CNNs) with mainly bedside CXRs of 250 COVID-19 and 250 non-COVID-19 subjects from two hospitals (Centres 1 and 2). We then tested this system on bedside CXRs of an independent group of 110 patients (74 COVID-19, 36 non-COVID-19) from one of the two hospitals. A retrospective reading was performed by two radiologists in the absence of any clinical information, with the aim of differentiating COVID-19 from non-COVID-19 patients. Real-time polymerase chain reaction served as the reference standard. Results At 10-fold cross-validation, our deep learning model classified COVID-19 and non-COVID-19 patients with 0.78 sensitivity (95% confidence interval [CI] 0.74–0.81), 0.82 specificity (95% CI 0.78–0.85), and 0.89 area under the curve (AUC) (95% CI 0.86–0.91). For the independent dataset, deep learning showed 0.80 sensitivity (95% CI 0.72–0.86) (59/74), 0.81 specificity (29/36) (95% CI 0.73–0.87), and 0.81 AUC (95% CI 0.73–0.87). Radiologists’ reading obtained 0.63 sensitivity (95% CI 0.52–0.74) and 0.78 specificity (95% CI 0.61–0.90) in Centre 1 and 0.64 sensitivity (95% CI 0.52–0.74) and 0.86 specificity (95% CI 0.71–0.95) in Centre 2. Conclusions This preliminary experience based on ten CNNs trained on a limited training dataset shows an interesting potential of deep learning for COVID-19 diagnosis. The tool is being trained with new CXRs to further increase its performance.
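The sensitivity and specificity on the independent dataset follow directly from the counts given in parentheses in the abstract (59 of 74 COVID-19 patients and 29 of 36 non-COVID-19 patients correctly classified). A minimal sketch of that arithmetic, with hypothetical helper names:

```python
def sensitivity(true_positives, total_positives):
    """Fraction of diseased patients correctly flagged."""
    return true_positives / total_positives


def specificity(true_negatives, total_negatives):
    """Fraction of non-diseased patients correctly cleared."""
    return true_negatives / total_negatives


# Counts quoted in the abstract for the independent test group of 110 patients.
sens = sensitivity(59, 74)
spec = specificity(29, 36)
overall_accuracy = (59 + 29) / (74 + 36)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={overall_accuracy:.2f}")
```

The overall accuracy is not stated in the abstract; it is derived here from the same counts and comes to 0.80.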

Surrounding Vehicles’ Contribution to Car-Following Models: Deep-Learning-Based Analysis

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/03611981211018693 ◽

2021 ◽

pp. 036119812110186

Author(s):

Saeed Vasebi ◽

Yeganeh M. Hayeri ◽

Peter J. Jin

Keyword(s):

Deep Learning ◽

Traffic Flow ◽

Short Term Memory ◽

Data Availability ◽

Car Following ◽

Preceding Vehicle ◽

Long Short Term Memory ◽

The Right ◽

Mean Square Errors ◽

Deep Learning Model

Relatively recent increased computational power and extensive traffic data availability have provided a unique opportunity to re-investigate drivers’ car-following (CF) behavior. Classic CF models assume drivers’ behavior is only influenced by their preceding vehicle. Recent studies have indicated that considering surrounding vehicles’ information (e.g., multiple preceding vehicles) could affect CF models’ performance. An in-depth investigation of surrounding vehicles’ contribution to CF modeling performance has not been reported in the literature. This study uses a deep-learning model with long short-term memory (LSTM) to investigate to what extent considering surrounding vehicles could improve CF models’ performance. This investigation helps to select the right inputs for traffic flow modeling. Five CF models are compared in this study (i.e., classic, multi-anticipative, adjacent-lanes, following-vehicle, and all-surrounding-vehicles CF models). Performance of the CF models is compared in relation to accuracy, stability, and smoothness of traffic flow. The CF models are trained, validated, and tested by a large publicly available dataset. The average mean square errors (MSEs) for the classic, multi-anticipative, adjacent-lanes, following-vehicle, and all-surrounding-vehicles CF models are 1.58 × 10⁻³, 1.54 × 10⁻³, 1.56 × 10⁻³, 1.61 × 10⁻³, and 1.73 × 10⁻³, respectively. However, the results show insignificant performance differences between the classic CF model and multi-anticipative model or adjacent-lanes model in relation to accuracy, stability, or smoothness. The following-vehicle CF model shows similar performance to the multi-anticipative model. The all-surrounding-vehicles CF model underperformed all the other models.
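To put the reported MSEs in perspective, the sketch below defines the mean square error and expresses each model's reported value as a percentage change relative to the classic CF model. The dictionary simply restates the figures from the abstract; the `mse` helper and variable names are illustrative and not taken from the study.

```python
def mse(predicted, observed):
    """Mean square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


# Average MSEs quoted in the abstract.
reported_mse = {
    "classic": 1.58e-3,
    "multi-anticipative": 1.54e-3,
    "adjacent-lanes": 1.56e-3,
    "following-vehicle": 1.61e-3,
    "all-surrounding-vehicles": 1.73e-3,
}

baseline = reported_mse["classic"]
relative_change_pct = {
    name: 100 * (value - baseline) / baseline for name, value in reported_mse.items()
}
for name, change in relative_change_pct.items():
    print(f"{name}: {change:+.1f}% vs classic")
```

The first three alternatives stay within about 2.5% of the classic model, whereas the all-surrounding-vehicles model is roughly 9.5% worse, consistent with the abstract's conclusion.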

NIMG-08. PREDICTION OF LOWER-GRADE GLIOMA MOLECULAR SUBTYPES USING DEEP LEARNING

Neuro-Oncology ◽

10.1093/neuonc/noaa215.621 ◽

2020 ◽

Vol 22 (Supplement_2) ◽

pp. ii148-ii148

Author(s):

Yoshihiro Muragaki ◽

Yutaka Matsui ◽

Takashi Maruyama ◽

Masayuki Nitta ◽

Taiichi Saito ◽

...

Keyword(s):

Deep Learning ◽

Cross Validation ◽

Molecular Subtype ◽

Learning Model ◽

Group Classification ◽

Training Dataset ◽

Lower Grade ◽

Test Dataset ◽

Ct Data ◽

Deep Learning Model

Abstract INTRODUCTION It is useful to know the molecular subtype of lower-grade gliomas (LGG) when deciding on a treatment strategy. This study aims to diagnose this preoperatively. METHODS A deep learning model was developed to predict the 3-group molecular subtype using multimodal data including magnetic resonance imaging (MRI), positron emission tomography (PET), and computed tomography (CT). The performance was evaluated using leave-one-out cross validation with a dataset containing information from 217 LGG patients. RESULTS The model performed best when the dataset contained MRI, PET, and CT data. The model could predict the molecular subtype with an accuracy of 96.6% for the training dataset and 68.7% for the test dataset. The model achieved test accuracies of 58.5%, 60.4%, and 59.4% when the dataset contained only MRI, MRI and PET, and MRI and CT data, respectively. The conventional method used to predict mutations in the isocitrate dehydrogenase (IDH) gene and the codeletion of chromosome arms 1p and 19q (1p/19q) sequentially had an overall accuracy of 65.9%. This is 2.8 percentage points lower than the proposed method, which predicts the 3-group molecular subtype directly. CONCLUSIONS AND FUTURE PERSPECTIVE A deep learning model was developed to diagnose the molecular subtype preoperatively based on multi-modality data in order to predict the 3-group classification directly. Cross-validation showed that the proposed model had an overall accuracy of 68.7% for the test dataset. This is the first model to double the expected value for a 3-group classification problem when predicting the LGG molecular subtype. We plan to apply heat map and/or segmentation techniques to increase prediction accuracy.
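The accuracy comparisons in the abstract can be checked with a few lines of arithmetic. The sketch assumes that the "expected value" for a 3-group problem means the chance accuracy of a uniform random guess (one in three); the abstract does not define the term, so this reading is an assumption.

```python
N_CLASSES = 3
chance_accuracy = 100 / N_CLASSES  # assumed meaning of "expected value", in percent

proposed_accuracy = 68.7    # test accuracy of the proposed model, from the abstract
sequential_accuracy = 65.9  # conventional sequential IDH then 1p/19q method

gap_points = proposed_accuracy - sequential_accuracy         # percentage points
relative_gain_pct = 100 * gap_points / sequential_accuracy   # relative improvement
doubles_chance = proposed_accuracy > 2 * chance_accuracy

print(f"gap: {gap_points:.1f} points, relative gain: {relative_gain_pct:.1f}%")
print(f"exceeds twice chance ({2 * chance_accuracy:.1f}%): {doubles_chance}")
```

The distinction matters: the gap is 2.8 percentage points, which corresponds to a relative improvement of about 4.2% over the sequential method.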

AOHDL: Archimedes Optimization Based Hybrid Deep Learning Model for Soybean Plant Disease Classification

10.21203/rs.3.rs-302084/v1 ◽

2021 ◽

Author(s):

J. Annrose ◽

N. Herald Anantha Rufus ◽

C. R. Edwin Selva Rex ◽

D. Godwin Immanuel

Keyword(s):

Deep Learning ◽

Optimization Algorithm ◽

Short Term Memory ◽

Learning Model ◽

Disease Classification ◽

Training Data ◽

Machine Learning Techniques ◽

Soybean Plant ◽

Lower Accuracy ◽

Deep Learning Model

Abstract Bean, botanically called Phaseolus vulgaris L., belongs to the Fabaceae family. During bean disease identification, unnecessary economic losses occur due to delayed treatment, incorrect treatment, and lack of knowledge. Existing deep learning and machine learning techniques face several issues, such as high computational complexity, higher cost associated with the training data, longer execution time, noise, feature dimensionality, lower accuracy, and low speed. To tackle these problems, we propose a hybrid deep learning model with an Archimedes optimization algorithm (HDL-AOA) for bean disease classification. In this work, there are five bean classes, of which one is a healthy class whereas the remaining four classes indicate different diseases, namely Bean halo blight, Pythium diseases, Rhizoctonia root rot, and Anthracnose abnormalities, acquired from the Soybean (Large) Data Set. The hybrid deep learning technique is the combination of wavelet packet decomposition (WPD) and long short-term memory (LSTM). Initially, the WPD decomposes the input images into four sub-series. For these sub-series, four LSTM networks were developed. During bean disease classification, an Archimedes optimization algorithm (AOA) enhances the classification accuracy for multiple single LSTM networks. The HDL-AOA model for bean disease classification is implemented in MATLAB. The proposed model achieves lower MAPE than other existing methods. Finally, the proposed HDL-AOA model achieves excellent classification results on evaluation measures such as accuracy, specificity, sensitivity, precision, recall, and F-score.

Soybean Plant Disease Classification using Archimedes Optimization Algorithm based Hybrid Deep Learning Model

10.21203/rs.3.rs-281525/v1 ◽

2021 ◽

Author(s):

J. Annrose ◽

N. Herald Anantha Rufus ◽

C. R. Edwin Selva Rex ◽

D. Godwin Immanuel

Keyword(s):

Deep Learning ◽

Optimization Algorithm ◽

Short Term Memory ◽

Learning Model ◽

Disease Classification ◽

Training Data ◽

Machine Learning Techniques ◽

Soybean Plant ◽

Lower Accuracy ◽

Deep Learning Model

Spatio-temporal deep learning model for distortion classification in laparoscopic video

F1000Research ◽

10.12688/f1000research.72980.1 ◽

2021 ◽

Vol 10 ◽

pp. 1010

Author(s):

Nouar AlDahoul ◽

Hezerul Abdul Karim ◽

Abdulaziz Saleh Ba Wazir ◽

Myles Joshua Toledo Tan ◽

Mohammad Faizal Ahmad Fauzi

Keyword(s):

Deep Learning ◽

Large Scale ◽

Spatial Information ◽

Short Term Memory ◽

Learning Model ◽

Motion Blur ◽

Video Frame ◽

Video Enhancement ◽

Deep Learning Model ◽

Laparoscopic Videos

Background: Laparoscopy is a surgery performed in the abdomen without making large incisions in the skin and with the aid of a video camera, resulting in laparoscopic videos. The laparoscopic video is prone to various distortions such as noise, smoke, uneven illumination, defocus blur, and motion blur. One of the main components in the feedback loop of video enhancement systems is distortion identification, which automatically classifies the distortions affecting the videos and selects the video enhancement algorithm accordingly. This paper aims to address the laparoscopic video distortion identification problem by developing fast and accurate multi-label distortion classification using a deep learning model. Current deep learning solutions based on convolutional neural networks (CNNs) can address laparoscopic video distortion classification, but they learn only spatial information. Methods: In this paper, utilization of both spatial and temporal features in a CNN-long short-term memory (CNN-LSTM) model is proposed as a novel solution to enhance the classification. First, pre-trained ResNet50 CNN was used to extract spatial features from each video frame by transferring representation from large-scale natural images to laparoscopic images. Next, LSTM was utilized to consider the temporal relation between the features extracted from the laparoscopic video frames to produce multi-label categories. A novel laparoscopic video dataset proposed in the ICIP2020 challenge was used for training and evaluation of the proposed method. Results: The experiments conducted show that the proposed CNN-LSTM outperforms the existing solutions in terms of accuracy (85%), and F1-score (94.2%). Additionally, the proposed distortion identification model is able to run in real-time with low inference time (0.15 sec). Conclusions: The proposed CNN-LSTM model is a feasible solution to be utilized in laparoscopic videos for distortion identification.
