The Storage Period Discrimination of Bolete Mushrooms Based on Deep Learning Methods Combined With Two-Dimensional Correlation Spectroscopy and Integrative Two-Dimensional Correlation Spectroscopy

2021 ◽  
Vol 12 ◽  
Author(s):  
Jian-E Dong ◽  
Ji Zhang ◽  
Tao Li ◽  
Yuan-Zhong Wang

Boletes are favored by consumers because of their delicious taste and high nutritional value. However, as the storage period increases, their fruiting bodies grow microorganisms and produce substances harmful to the human body. Therefore, the storage period of boletes must be identified to ensure their quality. In this article, two-dimensional correlation spectroscopy (2DCOS) images are used directly for deep learning modeling, transforming a complex spectral data analysis task into a simple digital image processing problem. We collected 2,018 bolete samples. After laboratory cleaning, drying, grinding, and tablet compression, their Fourier transform mid-infrared (FT-MIR) spectra were acquired. From these, we generated 18,162 spectral images belonging to nine data sets: the synchronous 2DCOS, asynchronous 2DCOS, and integrative 2DCOS (i2DCOS) spectra of the 1,750–400, 1,450–1,000, and 1,150–1,000 cm⁻¹ bands. For these data sets, we established nine deep residual convolutional neural network (ResNet) models to identify the storage period of boletes. The results show that the synchronous 2DCOS model on the 1,750–400 cm⁻¹ band achieves 100% accuracy on the training, test, and external validation sets with a loss value close to zero, making it the best model. The synchronous 2DCOS model on the 1,150–1,000 cm⁻¹ band comes next; these two models have the high accuracy and generalization ability needed to identify the storage period of boletes. The results have practical application value and provide a scientific basis for the quality control and market management of bolete mushrooms. In conclusion, our method is novel and extends the application of deep learning in the food field; it can also be applied to other fields such as agriculture and herbal medicine.
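For readers unfamiliar with 2DCOS, a minimal NumPy sketch of the standard Noda formulation is given below: the synchronous map is the covariance of the mean-centred (dynamic) spectra, and the asynchronous map is obtained through the Hilbert-Noda transformation matrix. This illustrates only the generic 2DCOS computation, not the authors' exact pipeline; their i2DCOS variant and the image rendering step are omitted, and the array shapes are assumptions.

```python
import numpy as np

def noda_hilbert(m):
    # Hilbert-Noda transformation matrix used for the asynchronous spectrum
    j, k = np.indices((m, m))
    N = np.zeros((m, m))
    mask = j != k
    N[mask] = 1.0 / (np.pi * (k - j)[mask])
    return N

def two_dcos(spectra):
    # spectra: (m perturbations x n wavenumbers),
    # e.g. storage periods x FT-MIR data points
    m = spectra.shape[0]
    dyn = spectra - spectra.mean(axis=0)                  # dynamic (mean-centred) spectra
    sync = dyn.T @ dyn / (m - 1)                          # synchronous correlation map
    async_ = dyn.T @ noda_hilbert(m) @ dyn / (m - 1)      # asynchronous correlation map
    return sync, async_
```

The resulting (n x n) maps can then be rendered as images and fed to an image classifier such as a ResNet, which is the route the abstract describes.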

2003 ◽  
Vol 57 (8) ◽  
pp. 996-1006 ◽  
Author(s):  
Slobodan Šašić ◽  
Yukihiro Ozaki

In this paper we report two new developments in two-dimensional (2D) correlation spectroscopy; one is the combination of the moving window concept with 2D spectroscopy to facilitate the analysis of complex data sets, and the other is the definition of the noise level in synchronous/asynchronous maps. A graphical criterion for the latter is also proposed. The combination of the moving window concept with correlation spectra allows one to split a large data matrix into smaller and simpler subsets and to analyze them instead of computing the overall correlation. A three-component system that mimics a consecutive chemical reaction is used as a model to illustrate the two ideas. Both types of correlation matrices, variable–variable and sample–sample, are analyzed, and very good agreement between the two is found. The proposed innovations enable one to comprehend the complexity of the data to be analyzed by 2D spectroscopy and thus to avoid the risk of over-interpretation, which is liable to occur whenever improper assumptions are made about the number of coexisting species in the system.
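A minimal sketch of the moving-window idea follows: rather than computing one global correlation map, the perturbation axis is split into overlapping sub-matrices and a synchronous map is computed per window. The window size and array shapes are assumptions, and the paper's noise-level criterion is not reproduced here.

```python
import numpy as np

def moving_window_sync(spectra, window=5):
    # Slide a window along the perturbation axis and compute a synchronous
    # 2D correlation map for each sub-matrix instead of one global map.
    m, n = spectra.shape
    maps = []
    for start in range(m - window + 1):
        sub = spectra[start:start + window]
        dyn = sub - sub.mean(axis=0)                   # local dynamic spectra
        maps.append(dyn.T @ dyn / (window - 1))        # (n x n) synchronous map
    return np.stack(maps)                              # (windows x n x n)
```

Inspecting the per-window maps shows which spectral correlations persist locally, which is what makes the sub-matrices simpler to interpret than the overall correlation.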


2020 ◽  
pp. bjophthalmol-2020-316984
Author(s):  
Tyler Hyungtaek Rim ◽  
Aaron Y Lee ◽  
Daniel S Ting ◽  
Kelvin Teo ◽  
Bjorn Kaijun Betzler ◽  
...  

Background: The ability of deep learning (DL) algorithms to identify eyes with neovascular age-related macular degeneration (nAMD) from optical coherence tomography (OCT) scans has been previously established. We herewith evaluate the ability of a DL model, showing excellent performance on a Korean data set, to generalise to an American data set despite ethnic differences. In addition, expert graders were surveyed to verify that the DL model was appropriately identifying lesions indicative of nAMD on the OCT scans. Methods: The model development data set comprised 12 247 OCT scans from South Korea; the external validation data set comprised 91 509 OCT scans from Washington, USA. Both data sets included normal eyes or eyes with nAMD. After internal testing, the algorithm was sent to the University of Washington, USA, for external validation. The area under the receiver operating characteristic curve (AUC) and the area under the precision–recall curve (AUPRC) were calculated. For model explanation, saliency maps were generated using Guided Grad-CAM. Results: On external validation, AUC and AUPRC remained high at 0.952 (95% CI 0.942 to 0.962) and 0.891 (95% CI 0.875 to 0.908) at the individual level. Saliency maps showed that in normal OCT scans the fovea was the main area of interest; in nAMD OCT scans, the appropriate pathological features were the areas of model interest. A survey of 10 retina specialists confirmed this. Conclusion: Our DL algorithm exhibited high performance for nAMD identification in a Korean population and generalised well to an ethnically distinct American population. The model correctly focused on the differences within the macular area to extract features associated with nAMD.
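The reported metrics can be reproduced from model outputs with scikit-learn, as in the hedged sketch below; `average_precision_score` is a standard estimator of the AUPRC, and the arrays here are synthetic stand-ins rather than the study's predictions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Hypothetical arrays standing in for external-validation predictions:
# y_true: 1 = nAMD, 0 = normal; y_score: model probability of nAMD.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_score = np.clip(y_true * 0.7 + rng.normal(0.15, 0.2, size=1000), 0.0, 1.0)

auc = roc_auc_score(y_true, y_score)               # area under the ROC curve
auprc = average_precision_score(y_true, y_score)   # area under the precision-recall curve
print(f"AUC = {auc:.3f}, AUPRC = {auprc:.3f}")
```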


2020 ◽  
Vol 74 (4) ◽  
pp. 460-472 ◽  
Author(s):  
Julian Hniopek ◽  
Michael Schmitt ◽  
Jürgen Popp ◽  
Thomas Bocklitz

This paper introduces the newly developed principal component powered two-dimensional (2D) correlation spectroscopy (PC 2D-COS) as an alternative approach to 2D correlation spectroscopy that takes advantage of dimensionality reduction by principal component analysis. It is shown that PC 2D-COS is equivalent to traditional 2D correlation analysis while providing a significant advantage in terms of computational complexity and memory consumption. These features allow for easy calculation of 2D correlation spectra even for data sets with very high spectral resolution, or for parallel analysis of multiple data sets of 2D correlation spectra. Along with this reduction in complexity, PC 2D-COS offers a significant noise rejection property when the set of principal components used for the 2D correlation calculation is limited. As an example of the application of truncated PC 2D-COS, a temperature-dependent Raman spectroscopic data set of a fullerene-anthracene adduct is examined. It is demonstrated that a large reduction in computational cost is possible without loss of relevant information, even for complex real-world data sets.
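A minimal sketch of the synchronous case conveys the idea: perform PCA via SVD on the mean-centred spectra, keep the leading components, and reconstruct the correlation map from them, which both cuts cost and rejects noise. The component count and array shapes are assumptions, and the asynchronous case is omitted.

```python
import numpy as np

def pc_sync_2dcos(spectra, n_components=3):
    # PCA-truncated synchronous 2D correlation map: rebuild the map from the
    # leading principal components only, avoiding the full (n x n) product
    # over raw data and suppressing noise carried by minor components.
    m = spectra.shape[0]
    dyn = spectra - spectra.mean(axis=0)
    U, s, Vt = np.linalg.svd(dyn, full_matrices=False)   # PCA via SVD
    Vk = Vt[:n_components]                               # loadings (k x n)
    sk = s[:n_components]                                # singular values
    # Equals V_k diag(s_k^2) V_k^T / (m - 1), the truncated synchronous map.
    return (Vk.T * sk**2) @ Vk / (m - 1)
```

With all components retained, this reproduces the traditional synchronous spectrum exactly, which is the equivalence the abstract refers to.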


2017 ◽  
Vol 72 (5) ◽  
pp. 765-775 ◽  
Author(s):  
Yeonju Park ◽  
Isao Noda ◽  
Young Mee Jung

Smooth factor analysis (SFA) is introduced as an effective method of removing heavy noise from spectral data sets. A modified form of the nonlinear iterative partial least squares (NIPALS) algorithm, in which the factors are smoothed at each step, is used in SFA. Compared with conventional smoothing techniques applied to individual spectra, SFA is much more effective in the treatment of very noisy spectra (∼40% noise level). Smooth factor analysis invokes a large number of smooth factors to retain pertinent spectral information with high fidelity and without distortion. This approach can be used as an effective general pretreatment procedure for multivariate spectral data analysis, such as principal component analysis (PCA) and partial least squares (PLS). The SFA method was also applied to real experimental data, and the results successfully demonstrated its powerful potential for effective noise removal. Furthermore, this treatment is found to be very helpful in the interpretation of two-dimensional correlation spectroscopy (2D-COS) spectra with very high noise levels, which was not possible before.
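The core idea can be sketched as a NIPALS loop in which each loading vector is smoothed at every iteration before deflation. This is an illustrative reconstruction, not the authors' implementation; the Savitzky-Golay filter and all parameter values are assumed choices.

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_factor_analysis(X, n_factors=5, window=11, poly=3, tol=1e-8, max_iter=200):
    # NIPALS-style factor extraction with smoothing of each loading vector
    # at every iteration (a sketch of the SFA idea).
    X = X - X.mean(axis=0)
    scores, loadings = [], []
    for _ in range(n_factors):
        t = X[:, np.argmax(X.var(axis=0))].copy()   # start from the highest-variance column
        for _ in range(max_iter):
            p = X.T @ t / (t @ t)                   # loading estimate
            p = savgol_filter(p, window, poly)      # smooth the factor at each step
            p /= np.linalg.norm(p)
            t_new = X @ p                           # score estimate
            if np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new):
                t = t_new
                break
            t = t_new
        X = X - np.outer(t, p)                      # deflate before the next factor
        scores.append(t)
        loadings.append(p)
    return np.array(scores).T, np.array(loadings)
```

Because the smoothing acts on the factors rather than on each spectrum, noise is removed jointly across the whole data set, which is why SFA tolerates much higher noise levels than per-spectrum smoothing.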


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier transform inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, motivated by the Fourier conversion. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency of the signal and produces encoded sequences. These sequences, once arranged into a 2D array, represent fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model on two data sets. For the first data set, which is more standardized than the other, our model outperforms or at least equals previous works. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy exceeds 95% in some cases. We also analyze the effect of the parameters on the performance.
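A hedged Keras sketch of the LSTM-to-CNN combination is shown below. The layer sizes, window length, channel count, and class count are illustrative assumptions; the paper's Fourier-inspired decomposition and exact architecture are not specified here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical shapes: windows of 128 time steps x 9 sensor channels, 6 activities.
T, C, NUM_CLASSES = 128, 9, 6

inputs = layers.Input(shape=(T, C))
# The LSTM returns an encoded sequence capturing temporal dependencies.
seq = layers.LSTM(64, return_sequences=True)(inputs)        # (T, 64)
# Arrange the encoded sequence as a 2D "fingerprint" with one channel.
img = layers.Reshape((T, 64, 1))(seq)
x = layers.Conv2D(32, 3, activation="relu")(img)            # CNN reads the fingerprint
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```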


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Yahya Albalawi ◽  
Jim Buckley ◽  
Nikola S. Nikolov

This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processing techniques applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB, and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another for testing the generality of our models only. Our results point to the conclusion that only four of the 26 pre-processing techniques improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier, with an F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to a BLSTM led to the most accurate model, with an F1 score of 75.2% and accuracy of 90.7%, compared to an F1 score of 90.8% achieved by Mazajak CBOW with the same architecture but a lower accuracy of 70.89%. Our results also show that the performance of the best traditional classifier we trained is comparable to that of the deep learning methods on the first data set, but significantly worse on the second data set.
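As a hedged illustration of the traditional-classifier baseline, the sketch below wires one of the paper's classifiers (Logistic Regression) into a scikit-learn pipeline. The TF-IDF features and the toy corpus are assumptions for illustration; the actual study applies Arabic-specific pre-processing and evaluates pre-trained word embeddings as well.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy stand-in corpus (1 = health-related, 0 = not); a real run would use
# labelled Arabic tweets after normalisation, diacritic removal, etc.
tweets = ["symptoms fever clinic vaccine", "football match score league"] * 10
labels = [1, 0] * 10

clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),   # assumed feature choice
    ("lr", LogisticRegression(max_iter=1000)),        # one of the paper's classifiers
])
print(cross_val_score(clf, tweets, labels, cv=5, scoring="f1").mean())
```

Swapping the second pipeline stage for KNN, SVM, or Multinomial NB reproduces the rest of the traditional baseline family the abstract mentions.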

