convolutional encoder
Recently Published Documents

TOTAL DOCUMENTS: 191 (five years: 107)
H-INDEX: 16 (five years: 9)

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Kaylen J. Pfisterer ◽  
Robert Amelard ◽  
Audrey G. Chung ◽  
Braeden Syrnyk ◽  
Alexander MacLean ◽  
...  

Abstract: Malnutrition is a multidomain problem affecting 54% of older adults in long-term care (LTC). Monitoring nutritional intake in LTC is laborious and subjective, limiting clinical inference capabilities. Recent advances in automatic image-based food estimation have not yet been evaluated in LTC settings. Here, we describe a fully automatic imaging system for quantifying food intake. We propose a novel deep convolutional encoder-decoder food network with depth-refinement (EDFN-D) using an RGB-D camera for quantifying a plate’s remaining food volume relative to reference portions in whole and modified-texture foods. We trained and validated the network on the pre-labelled UNIMIB2016 food dataset and tested it on our two novel LTC-inspired plate datasets (689 plate images, 36 unique foods). EDFN-D performed comparably to depth-refined graph cut on IOU (0.879 vs. 0.887), with intake errors well below the typical 50% (mean percent intake error: −4.2%). We identify how standard segmentation metrics are insufficient due to visual-volume discordance, and include a volume disparity analysis to facilitate system trust. The system provides improved transparency and approximates human assessors with enhanced objectivity, accuracy, and precision while avoiding the heavy time requirements of semi-automatic methods. This may help address shortcomings that currently limit the utility of automated early malnutrition detection in resource-constrained LTC and hospital settings.
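The segmentation comparison above is reported as intersection-over-union (IOU, 0.879 vs. 0.887). A minimal sketch of that metric for binary food masks — the toy arrays here are illustrative, not from the paper's datasets:

```python
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-union of two boolean segmentation masks."""
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(intersection / union) if union else 1.0

# Toy 2x3 masks: they agree on two foreground pixels, disagree on two.
pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
truth = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
print(iou(pred, truth))  # 2 / 4 = 0.5
```

As the abstract notes, a high IOU alone does not guarantee accurate volume estimates, which is why the authors add a separate volume disparity analysis.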


2022 ◽  
Author(s):  
Irem Loc ◽  
Ibrahim Kecoglu ◽  
Mehmet Burcin Unlu ◽  
Ugur Parlatan

Raman spectroscopy is a vibrational method that provides molecular information rapidly and non-invasively. Despite its advantages, the weak intensity of Raman scattering leads to low-quality signals, particularly with tissue samples. The need for long exposure times makes Raman a time-consuming process and diminishes its non-invasive character when studying living tissues. Novel denoising techniques using convolutional neural networks (CNNs) have achieved remarkable results in image processing. Here, we propose a similar approach to noise reduction for Raman spectra acquired with 10x lower exposure times. In this work, we developed a fully convolutional encoder-decoder architecture (FCED) and trained it on noisy Raman signals. The results demonstrate that our model is superior (p-value < 0.0001) to conventional denoising techniques such as the Savitzky-Golay filter and wavelet denoising. Improvement in the signal-to-noise ratio ranges from 20% to 80%, depending on the initial signal-to-noise ratio. Thus, we show that tissue analysis can be done in a shorter time without any need for instrumental enhancement.
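The gains above are quoted as signal-to-noise-ratio improvements. The abstract does not give the exact SNR definition used, so the sketch below assumes the common decibel form on a synthetic signal with a known residual:

```python
import numpy as np

def snr_db(clean: np.ndarray, noisy: np.ndarray) -> float:
    """SNR in decibels, assuming the residual (noisy - clean) is the noise."""
    noise = noisy - clean
    return float(10 * np.log10(np.sum(clean**2) / np.sum(noise**2)))

# Synthetic stand-in for a Raman trace, with a known additive perturbation.
clean = np.ones(100)
noisy = clean + 0.1           # constant offset standing in for noise
print(snr_db(clean, noisy))   # 10 * log10(100 / 1) = 20 dB (up to rounding)
```

A percentage improvement like the 20–80% reported would then compare the SNR of the denoised output against the SNR of the raw low-exposure input.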


2022 ◽  
pp. 108128652110555
Author(s):  
Ankit Shrivastava ◽  
Jingxiao Liu ◽  
Kaushik Dayal ◽  
Hae Young Noh

This work presents a machine-learning approach to predicting peak-stress clusters in heterogeneous polycrystalline materials. Prior work on machine learning in the context of mechanics has largely focused on predicting the effective response and overall structure of stress fields. However, the ability to predict peak stresses – which are of critical importance to failure – is unexplored, because peak-stress clusters occupy a small spatial volume relative to the entire domain and hence require computationally expensive training. This work develops a deep-learning-based convolutional encoder–decoder method that focuses on predicting peak-stress clusters, specifically the size and other characteristics of the clusters, in the framework of heterogeneous linear elasticity. The method is based on convolutional filters that model local spatial relations between microstructures and stress fields using spatially weighted averaging operations. The model is first trained against linear elastic calculations of stress under applied macroscopic strain in synthetically generated microstructures, which serve as the ground truth. The trained model is then applied to predict the stress field for a (synthetically generated) microstructure and to detect peak-stress clusters within the predicted field. The accuracy of the peak-stress predictions is analyzed using the cosine similarity metric and by comparing the geometric characteristics of the peak-stress clusters against the ground-truth calculations. The model learns and predicts the geometric details of the peak-stress clusters and, in particular, performs better for higher (normalized) values of the peak stress than for lower values. These comparisons show that the proposed method is well suited to predicting the characteristics of peak-stress clusters.
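The accuracy analysis above relies on the cosine similarity metric between predicted and ground-truth stress fields. A sketch of that metric on flattened fields (the field values are illustrative):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two fields, compared as flat vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

field = np.array([[1.0, 2.0], [3.0, 4.0]])
print(cosine_similarity(field, field))        # identical fields -> ~1.0
print(cosine_similarity(field, 2.0 * field))  # uniform scaling leaves it at ~1.0
```

Note the second line: cosine similarity is invariant to uniform scaling, so it measures the spatial pattern of the stress field rather than its absolute magnitude — one reason the authors also compare geometric characteristics of the clusters directly.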


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Wei Jia ◽  
Zhiying Zhu ◽  
Huaqi Wang

Nowadays, robust watermarking is widely used to protect the copyright of multimedia, and robustness is the most important property of a watermark in practice. Since watermark-attacking algorithms are a good way to drive the development of robust watermarking, we propose a new method focused on destroying commercial watermarks. First, decorrelation and desynchronization are used as preprocessing steps. Considering that a training set of thousands of watermarked images is hard to obtain, we further use Bernoulli sampling and dropout in the network to extend the set of training instances. The experiments show that the proposed network can effectively remove commercial watermarks while leaving the processed image with quality almost as good as the original.
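The training-instance extension via Bernoulli sampling described above can be read as input-level random masking. A hypothetical sketch of that idea — the keep probability, shapes, and function name are assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def bernoulli_instances(image: np.ndarray, n: int, keep_p: float = 0.8) -> list:
    """Expand one watermarked image into n training instances by Bernoulli
    pixel masking, so a small dataset yields many distinct inputs."""
    return [image * rng.binomial(1, keep_p, size=image.shape) for _ in range(n)]

watermarked = np.ones((4, 4))              # stand-in for a watermarked image
instances = bernoulli_instances(watermarked, n=5)
print(len(instances), instances[0].shape)  # 5 instances, each 4x4
```

Dropout inside the network serves the same regularizing purpose during training; this sketch only illustrates the data-side sampling.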


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7877
Author(s):  
Yang Li ◽  
Guanghui Han ◽  
Xiujian Liu

Nasopharyngeal carcinoma segmentation in magnetic resonance imagery (MRI) is vital to radiotherapy, since exact dose delivery hinges on an accurate delineation of the gross tumor volume (GTV). However, the large variation in tumor volume is difficult to handle, and current models perform poorly on tiny tumor volumes, producing indistinguishable and blurred segmentation boundaries. To address this problem, we propose a densely connected deep convolutional network consisting of an encoder network and a corresponding decoder network, which extracts high-level semantic features from different levels and concurrently uses low-level spatial features to obtain fine-grained segmentation masks. A skip-connection architecture is included and modified to propagate spatial information to the decoder network. Preliminary experiments are conducted on 30 patients. Experimental results show that our model outperforms all baseline models, with improvements of 4.17%. An ablation study is performed, and the effectiveness of the novel loss function is validated.
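Skip connections that propagate spatial information to the decoder, as described above, commonly take the form of channel-wise concatenation of encoder and decoder feature maps. A plain NumPy sketch of that operation — the channels-first layout and tensor sizes are assumptions for illustration:

```python
import numpy as np

def skip_concat(decoder_feat: np.ndarray, encoder_feat: np.ndarray) -> np.ndarray:
    """Concatenate encoder features onto decoder features along the channel
    axis, so low-level spatial detail reaches the decoder stage."""
    assert decoder_feat.shape[1:] == encoder_feat.shape[1:], "spatial sizes must match"
    return np.concatenate([decoder_feat, encoder_feat], axis=0)

dec = np.zeros((64, 32, 32))        # (channels, H, W) decoder features
enc = np.ones((32, 32, 32))         # matching-resolution encoder features
print(skip_concat(dec, enc).shape)  # (96, 32, 32)
```

In a real network the concatenated tensor would then pass through further convolutions; the sketch only shows the information-routing step.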


Symmetry ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2246
Author(s):  
Tomasz Hachaj ◽  
Anna Stolińska ◽  
Magdalena Andrzejewska ◽  
Piotr Czerski

Prediction of visual attention is a new and challenging subject, and to the best of our knowledge, little research has been devoted to anticipating students’ cognition while solving tests. The aim of this paper is to propose, implement, and evaluate a machine-learning method capable of predicting the saliency maps of students who participate in a learning task in the form of quizzes, based on quiz questionnaire images. Our proposal utilizes several deep encoder–decoder symmetric schemas which are trained on a large set of saliency maps generated with eye-tracking technology. Eye-tracking data were acquired from students who solved various tasks in the sciences and natural sciences (computer science, mathematics, physics, and biology). The proposed deep convolutional encoder–decoder network produces accurate predictions of students’ visual attention when solving quizzes. Our evaluation showed that predictions are moderately positively correlated with actual data, with a coefficient of 0.547 ± 0.109, and achieve better correlation with real saliency maps than state-of-the-art methods. Visual analyses of the obtained saliency maps also correspond with our experience and expectations in this field. Both the source code and the data from our research can be downloaded to reproduce our results.
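The reported agreement of 0.547 ± 0.109 is a correlation between predicted and measured saliency maps. A sketch of a pixel-wise Pearson coefficient — a standard choice for comparing saliency maps, though the abstract does not name the exact variant used:

```python
import numpy as np

def saliency_correlation(pred: np.ndarray, actual: np.ndarray) -> float:
    """Pearson correlation between two saliency maps, compared pixel-wise."""
    p, a = pred.ravel(), actual.ravel()
    p = p - p.mean()
    a = a - a.mean()
    return float(p @ a / (np.linalg.norm(p) * np.linalg.norm(a)))

m = np.array([[0.0, 0.2], [0.7, 1.0]])
print(saliency_correlation(m, m))        # a map correlates perfectly with itself
print(saliency_correlation(m, 1.0 - m))  # an inverted map gives -1
```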


2021 ◽  
pp. 369-376
Author(s):  
Utkarsh Maheshwari ◽  
Piyush Goel ◽  
R Annie Uthra ◽  
Vinay Vasanth Patage ◽  
Sourabh Tiwari ◽  
...  

2021 ◽  
Vol 2 (1) ◽  
Author(s):  
Ali Davariashtiyani ◽  
Zahra Kadkhodaie ◽  
Sara Kadkhodaei

Abstract: Predicting the synthesizability of hypothetical crystals is challenging because of the wide range of parameters that govern materials synthesis. Yet, exploring the exponentially large space of novel crystals for any future application demands an accurate predictive capability for synthesis likelihood to avoid haphazard trial-and-error. Typically, benchmarks of synthesizability are defined based on the energy of crystal structures. Here, we take an alternative approach and select features of synthesizability from the latent information embedded in crystalline materials. We represent the atomic structure of crystalline materials by three-dimensional pixel-wise images that are color-coded by their chemical attributes. This image representation enables the use of a convolutional encoder to learn the features of synthesizability hidden in the structural and chemical arrangements of crystalline materials. Based on the presented model, we can accurately classify materials into synthesizable crystals versus crystal anomalies across a broad range of crystal structure types and chemical compositions. We illustrate the usefulness of the model by predicting the synthesizability of hypothetical crystals for battery-electrode and thermoelectric applications.
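The three-dimensional pixel-wise image representation described above can be sketched as voxelizing a unit cell, accumulating per-atom chemical attributes into grid channels. In this sketch the grid size, coordinates, and attribute choices are illustrative assumptions, not the paper's actual encoding:

```python
import numpy as np

def voxelize(frac_coords: np.ndarray, attributes: np.ndarray, grid: int = 16) -> np.ndarray:
    """Map atoms at fractional unit-cell coordinates into a
    (grid, grid, grid, channels) image of chemical attributes."""
    channels = attributes.shape[1]
    image = np.zeros((grid, grid, grid, channels))
    cells = np.minimum((frac_coords * grid).astype(int), grid - 1)
    for (i, j, k), attr in zip(cells, attributes):
        image[i, j, k] += attr   # accumulate attributes of atoms sharing a voxel
    return image

# Two atoms with two hypothetical attribute channels (e.g. electronegativity, radius).
coords = np.array([[0.1, 0.1, 0.1], [0.9, 0.5, 0.5]])
attrs = np.array([[3.4, 0.7], [1.0, 1.4]])
img = voxelize(coords, attrs)
print(img.shape)  # (16, 16, 16, 2) -- a 3-D "image" a convolutional encoder can consume
```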


Doklady BGUIR ◽  
2021 ◽  
Vol 19 (6) ◽  
pp. 51-58
Author(s):  
L. V. Serebryanaya ◽  
I. E. Lasy

The problem of automatic speech generation from a text file is considered. An analytical review of software designed to recognize text and convert it to an audio stream was carried out, and the advantages and disadvantages of the available products were assessed. On this basis, it was concluded that developing software for automatic generation of an audio stream from Russian text is a relevant task. Models based on artificial neural networks, which are used for speech synthesis, are analyzed, and a mathematical model of the proposed software is built. It consists of three components: a convolutional encoder, a convolutional decoder, and a transformer. The architecture of the software includes a graphical interface, an application server, and a speech synthesis system. A number of algorithms were developed: preprocessing text before loading it into the software, converting the audio files of a training sample and training the network, and generating speech from arbitrary text files. The resulting software is a single-page application with a web interface for user interaction. To assess its quality, a metric representing the average score of different opinions was used. The aggregated score was sufficiently high to conclude that all the stated tasks have been solved.
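The quality metric described above (the average score of different opinions) matches the standard mean opinion score used to evaluate synthesized speech. A minimal sketch with hypothetical listener ratings on the usual 1–5 scale:

```python
def mean_opinion_score(ratings: list) -> float:
    """Average of individual listeners' quality ratings (1 = bad, 5 = excellent)."""
    return sum(ratings) / len(ratings)

ratings = [4, 5, 4, 3, 5]           # hypothetical listener scores
print(mean_opinion_score(ratings))  # 4.2
```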

