The Storage Period Discrimination of Bolete Mushrooms Based on Deep Learning Methods Combined With Two-Dimensional Correlation Spectroscopy and Integrative Two-Dimensional Correlation Spectroscopy

2021 ◽  
Vol 12 ◽  
Author(s):  
Jian-E Dong ◽  
Ji Zhang ◽  
Tao Li ◽  
Yuan-Zhong Wang

Boletes are favored by consumers because of their delicious taste and high nutritional value. However, as the storage period increases, their fruiting bodies grow microorganisms and produce substances harmful to the human body. Therefore, the storage period of boletes must be identified to ensure their quality. In this article, two-dimensional correlation spectroscopy (2DCOS) images are used directly for deep learning modeling, transforming a complex spectral data analysis task into a simple digital image processing problem. We collected 2,018 bolete samples. After laboratory cleaning, drying, grinding, and tablet compression, their Fourier transform mid-infrared (FT-MIR) spectra were acquired. From these, we generated 18,162 spectral images belonging to nine data sets: the synchronous 2DCOS, asynchronous 2DCOS, and integrative 2DCOS (i2DCOS) spectra of the 1,750–400, 1,450–1,000, and 1,150–1,000 cm⁻¹ bands. For these data sets, we established nine deep residual convolutional neural network (ResNet) models to identify the storage period of boletes. The results show that the synchronous 2DCOS model on the 1,750–400 cm⁻¹ band achieves 100% accuracy on the training, test, and external validation sets with a loss value close to zero, making it the best model. The synchronous 2DCOS model on the 1,150–1,000 cm⁻¹ band comes next; these two models have the high accuracy and generalization ability needed to identify the storage period of boletes. The results have practical application value and provide a scientific basis for the quality control and market management of bolete mushrooms. In conclusion, our method is novel and extends the application of deep learning in the food field; it can also be applied to other fields such as agriculture and herbal medicine.
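For readers unfamiliar with 2DCOS, a minimal NumPy sketch of the standard Noda formulation is given below: the synchronous map is the covariance of the mean-centred (dynamic) spectra, and the asynchronous map is obtained through the Hilbert-Noda transformation matrix. This illustrates only the generic 2DCOS computation, not the authors' exact pipeline; their i2DCOS variant and the image rendering step are omitted, and the array shapes are assumptions.

```python
import numpy as np

def noda_hilbert(m):
    # Hilbert-Noda transformation matrix used for the asynchronous spectrum
    j, k = np.indices((m, m))
    N = np.zeros((m, m))
    mask = j != k
    N[mask] = 1.0 / (np.pi * (k - j)[mask])
    return N

def two_dcos(spectra):
    # spectra: (m perturbations x n wavenumbers),
    # e.g. storage periods x FT-MIR data points
    m = spectra.shape[0]
    dyn = spectra - spectra.mean(axis=0)                  # dynamic (mean-centred) spectra
    sync = dyn.T @ dyn / (m - 1)                          # synchronous correlation map
    async_ = dyn.T @ noda_hilbert(m) @ dyn / (m - 1)      # asynchronous correlation map
    return sync, async_
```

The resulting (n x n) maps can then be rendered as images and fed to an image classifier such as a ResNet, which is the route the abstract describes.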

2003 ◽  
Vol 57 (8) ◽  
pp. 996-1006 ◽  
Author(s):  
Slobodan Šašić ◽  
Yukihiro Ozaki

In this paper we report two new developments in two-dimensional (2D) correlation spectroscopy; one is the combination of the moving window concept with 2D spectroscopy to facilitate the analysis of complex data sets, and the other is the definition of the noise level in synchronous/asynchronous maps. A graphical criterion for the latter is also proposed. The combination of the moving window concept with correlation spectra allows one to split a large data matrix into smaller and simpler subsets and to analyze them instead of computing the overall correlation. A three-component system that mimics a consecutive chemical reaction is used as a model to illustrate the two ideas. Both types of correlation matrices, variable–variable and sample–sample, are analyzed, and very good agreement between the two is found. The proposed innovations enable one to comprehend the complexity of the data to be analyzed by 2D spectroscopy and thus to avoid the risk of over-interpretation, which is liable to occur whenever improper assumptions are made about the number of coexisting species in the system.
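A minimal sketch of the moving-window idea follows: rather than computing one global correlation map, the perturbation axis is split into overlapping sub-matrices and a synchronous map is computed per window. The window size and array shapes are assumptions, and the paper's noise-level criterion is not reproduced here.

```python
import numpy as np

def moving_window_sync(spectra, window=5):
    # Slide a window along the perturbation axis and compute a synchronous
    # 2D correlation map for each sub-matrix instead of one global map.
    m, n = spectra.shape
    maps = []
    for start in range(m - window + 1):
        sub = spectra[start:start + window]
        dyn = sub - sub.mean(axis=0)                   # local dynamic spectra
        maps.append(dyn.T @ dyn / (window - 1))        # (n x n) synchronous map
    return np.stack(maps)                              # (windows x n x n)
```

Inspecting the per-window maps shows which spectral correlations persist locally, which is what makes the sub-matrices simpler to interpret than the overall correlation.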


2020 ◽  
pp. bjophthalmol-2020-316984
Author(s):  
Tyler Hyungtaek Rim ◽  
Aaron Y Lee ◽  
Daniel S Ting ◽  
Kelvin Teo ◽  
Bjorn Kaijun Betzler ◽  
...  

Background: The ability of deep learning (DL) algorithms to identify eyes with neovascular age-related macular degeneration (nAMD) from optical coherence tomography (OCT) scans has been previously established. We herewith evaluate the ability of a DL model, showing excellent performance on a Korean data set, to generalise to an American data set despite ethnic differences. In addition, expert graders were surveyed to verify that the DL model was appropriately identifying lesions indicative of nAMD on the OCT scans. Methods: The model development data set comprised 12 247 OCT scans from South Korea; the external validation data set comprised 91 509 OCT scans from Washington, USA. Both data sets included normal eyes or eyes with nAMD. After internal testing, the algorithm was sent to the University of Washington, USA, for external validation. The area under the receiver operating characteristic curve (AUC) and the area under the precision–recall curve (AUPRC) were calculated. For model explanation, saliency maps were generated using Guided Grad-CAM. Results: On external validation, AUC and AUPRC remained high at 0.952 (95% CI 0.942 to 0.962) and 0.891 (95% CI 0.875 to 0.908) at the individual level. Saliency maps showed that in normal OCT scans the fovea was the main area of interest; in nAMD OCT scans, the appropriate pathological features were the areas of model interest. A survey of 10 retina specialists confirmed this. Conclusion: Our DL algorithm exhibited high performance for nAMD identification in a Korean population and generalised well to an ethnically distinct American population. The model correctly focused on the differences within the macular area to extract features associated with nAMD.
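The reported metrics can be reproduced from model outputs with scikit-learn, as in the hedged sketch below; `average_precision_score` is a standard estimator of the AUPRC, and the arrays here are synthetic stand-ins rather than the study's predictions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Hypothetical arrays standing in for external-validation predictions:
# y_true: 1 = nAMD, 0 = normal; y_score: model probability of nAMD.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(y_true * 0.7 + rng.normal(0.15, 0.2, size=1000), 0.0, 1.0)

auc = roc_auc_score(y_true, y_score)               # area under the ROC curve
auprc = average_precision_score(y_true, y_score)   # area under the precision-recall curve
print(f"AUC = {auc:.3f}, AUPRC = {auprc:.3f}")
```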


2020 ◽  
Vol 74 (4) ◽  
pp. 460-472 ◽  
Author(s):  
Julian Hniopek ◽  
Michael Schmitt ◽  
Jürgen Popp ◽  
Thomas Bocklitz

This paper introduces the newly developed principal component powered two-dimensional (2D) correlation spectroscopy (PC 2D-COS) as an alternative approach to 2D correlation spectroscopy that takes advantage of dimensionality reduction by principal component analysis. It is shown that PC 2D-COS is equivalent to traditional 2D correlation analysis while providing a significant advantage in terms of computational complexity and memory consumption. These features allow for easy calculation of 2D correlation spectra even for data sets with very high spectral resolution, or for parallel analysis of multiple data sets of 2D correlation spectra. Along with this reduction in complexity, PC 2D-COS offers a significant noise rejection property when the set of principal components used for the 2D correlation calculation is limited. As an example of the application of truncated PC 2D-COS, a temperature-dependent Raman spectroscopic data set of a fullerene-anthracene adduct is examined. It is demonstrated that a large reduction in computational cost is possible without loss of relevant information, even for complex real-world data sets.
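A minimal sketch of the synchronous case conveys the idea: perform PCA via SVD on the mean-centred spectra, keep the leading components, and reconstruct the correlation map from them, which both cuts cost and rejects noise. The component count and array shapes are assumptions, and the asynchronous case is omitted.

```python
import numpy as np

def pc_sync_2dcos(spectra, n_components=3):
    # PCA-truncated synchronous 2D correlation map: rebuild the map from the
    # leading principal components only, avoiding the full (n x n) product
    # over raw data and suppressing noise carried by minor components.
    m = spectra.shape[0]
    dyn = spectra - spectra.mean(axis=0)
    U, s, Vt = np.linalg.svd(dyn, full_matrices=False)   # PCA via SVD
    Vk = Vt[:n_components]                               # loadings (k x n)
    sk = s[:n_components]                                # singular values
    # Equals V_k diag(s_k^2) V_k^T / (m - 1), the truncated synchronous map.
    return (Vk.T * sk**2) @ Vk / (m - 1)
```

With all components retained, this reproduces the traditional synchronous spectrum exactly, which is the equivalence the abstract refers to.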


2017 ◽  
Vol 72 (5) ◽  
pp. 765-775 ◽  
Author(s):  
Yeonju Park ◽  
Isao Noda ◽  
Young Mee Jung

Smooth factor analysis (SFA) is introduced as an effective method of removing heavy noise from spectral data sets. A modified form of the nonlinear iterative partial least squares (NIPALS) algorithm, in which the factors are smoothed at each step, is used in SFA. Compared with conventional smoothing techniques applied to individual spectra, SFA is much more effective in the treatment of very noisy spectra (∼40% noise level). Smooth factor analysis invokes a large number of smooth factors to retain pertinent spectral information with high fidelity and without distortion. This approach can be used as an effective general pretreatment procedure for multivariate spectral data analysis, such as principal component analysis (PCA) and partial least squares (PLS). The SFA method was also applied to real experimental data, and the results successfully demonstrated its powerful potential for effective noise removal. Furthermore, this treatment is found to be very helpful in the interpretation of two-dimensional correlation spectroscopy (2D-COS) spectra with very high noise levels, which was not possible before.
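The core idea can be sketched as a NIPALS loop in which each loading vector is smoothed at every iteration before deflation. This is an illustrative reconstruction, not the authors' implementation; the Savitzky-Golay filter and all parameter values are assumed choices.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_factor_analysis(X, n_factors=5, window=11, poly=3, tol=1e-8, max_iter=200):
    # NIPALS-style factor extraction with smoothing of each loading vector
    # at every iteration (a sketch of the SFA idea).
    X = X - X.mean(axis=0)
    scores, loadings = [], []
    for _ in range(n_factors):
        t = X[:, np.argmax(X.var(axis=0))].copy()   # start from the highest-variance column
        for _ in range(max_iter):
            p = X.T @ t / (t @ t)                   # loading estimate
            p = savgol_filter(p, window, poly)      # smooth the factor at each step
            p /= np.linalg.norm(p)
            t_new = X @ p                           # score estimate
            if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
                t = t_new
                break
            t = t_new
        X = X - np.outer(t, p)                      # deflate before the next factor
        scores.append(t)
        loadings.append(p)
    return np.array(scores).T, np.array(loadings)
```

Because the smoothing acts on the factors rather than on each spectrum, noise is removed jointly across the whole data set, which is why SFA tolerates much higher noise levels than per-spectrum smoothing.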


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier transform inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, motivated by the Fourier conversion. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency of the signal and produces encoded sequences. These sequences, once arranged into a 2D array, represent fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model on two data sets. For the first data set, which is more standardized than the other, our model outperforms or at least equals previous works. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy exceeds 95% in some cases. We also analyze the effect of the parameters on the performance.
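A hedged Keras sketch of the LSTM-to-CNN combination is shown below. The layer sizes, window length, channel count, and class count are illustrative assumptions; the paper's Fourier-inspired decomposition and exact architecture are not specified here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical shapes: windows of 128 time steps x 9 sensor channels, 6 activities.
T, C, NUM_CLASSES = 128, 9, 6

inputs = layers.Input(shape=(T, C))
# The LSTM returns an encoded sequence capturing temporal dependencies.
seq = layers.LSTM(64, return_sequences=True)(inputs)        # (T, 64)
# Arrange the encoded sequence as a 2D "fingerprint" with one channel.
img = layers.Reshape((T, 64, 1))(seq)
x = layers.Conv2D(32, 3, activation="relu")(img)            # CNN reads the fingerprint
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```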


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yahya Albalawi ◽  
Jim Buckley ◽  
Nikola S. Nikolov

This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processing techniques applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB, and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four of the 26 pre-processing techniques improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to a BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% achieved by Mazajak CBOW with the same architecture but a lower accuracy of 70.89%. Our results also show that the performance of the best traditional classifier we trained is comparable to that of the deep learning methods on the first data set, but significantly worse on the second data set.
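As a hedged illustration of the traditional-classifier baseline, the sketch below wires one of the paper's classifiers (Logistic Regression) into a scikit-learn pipeline. The TF-IDF features and the toy corpus are assumptions for illustration; the actual study applies Arabic-specific pre-processing and evaluates pre-trained word embeddings as well.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy stand-in corpus (1 = health-related, 0 = not); a real run would use
# labelled Arabic tweets after normalisation, diacritic removal, etc.
tweets = ["symptoms fever clinic vaccine", "football match score league"] * 10
labels = [1, 0] * 10

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),   # assumed feature choice
    ("lr", LogisticRegression(max_iter=1000)),        # one of the paper's classifiers
])
print(cross_val_score(clf, tweets, labels, cv=5, scoring="f1").mean())
```

Swapping the second pipeline stage for KNN, SVM, or Multinomial NB reproduces the rest of the traditional baseline family the abstract mentions.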

