Personalized Federated Learning with Semisupervised Distillation

Distilling Deep Structural Facial Relationships for FAU Intensity Estimation

10.36227/techrxiv.14157080.v1 ◽

2021 ◽

Author(s):

Yingruo Fan ◽

Jacqueline CK Lam ◽

Victor On Kwok Li

Keyword(s):

Structural Characteristics ◽

Model Performance ◽

Training Model ◽

Model Parameters ◽

Intensity Estimation ◽

Structural Relationships ◽

Proposed Model ◽

Comparable Performance ◽

Knowledge Distillation ◽

The Difference

Facial emotions are expressed through a combination of facial muscle movements, namely, the Facial Action Units (FAUs). FAU intensity estimation aims to estimate the intensity of a set of structurally dependent FAUs. Contrary to the existing works that focus on improving FAU intensity estimation, this study investigates how knowledge distillation (KD) incorporated into a training model can improve FAU intensity estimation efficiency while achieving the same level of performance. Given the intrinsic structural characteristics of FAU, it is desirable to distill deep structural relationships, namely, DSR-FAU, using heatmap regression. Our methodology is as follows: First, a feature map-level distillation loss was applied to ensure that the student network and the teacher network share similar feature distributions. Second, the region-wise and channel-wise relationship distillation loss functions were introduced to penalize the difference in structural relationships. Specifically, the region-wise relationship can be represented by the structural correlations across the facial features, whereas the channel-wise relationship is represented by the implicit FAU co-occurrence dependencies. Third, we compared the model performance of DSR-FAU with the state-of-the-art models, based on two benchmarking datasets. Our proposed model achieves comparable performance with other baseline models, though requiring a lower number of model parameters and lower computation complexities.
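The three distillation terms described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' formulation: the use of mean squared error, cosine-similarity relation matrices, equal loss weights, and identically shaped student and teacher feature maps are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

def feature_map_loss(student, teacher):
    """Mean squared error between feature maps of shape (C, H, W)."""
    return float(np.mean((student - teacher) ** 2))

def _cosine_similarity_matrix(rows):
    """Pairwise cosine similarity between the rows of a 2-D array."""
    norms = np.linalg.norm(rows, axis=1, keepdims=True)
    unit = rows / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return unit @ unit.T

def channel_relation_loss(student, teacher):
    """Penalize differences between the C x C channel similarity matrices."""
    c = student.shape[0]
    s = _cosine_similarity_matrix(student.reshape(c, -1))
    t = _cosine_similarity_matrix(teacher.reshape(c, -1))
    return float(np.mean((s - t) ** 2))

def region_relation_loss(student, teacher):
    """Penalize differences between the HW x HW spatial similarity matrices."""
    c = student.shape[0]
    s = _cosine_similarity_matrix(student.reshape(c, -1).T)
    t = _cosine_similarity_matrix(teacher.reshape(c, -1).T)
    return float(np.mean((s - t) ** 2))

def total_distillation_loss(student, teacher,
                            w_feat=1.0, w_region=1.0, w_channel=1.0):
    """Weighted sum of the three terms; the weights are illustrative."""
    return (w_feat * feature_map_loss(student, teacher)
            + w_region * region_relation_loss(student, teacher)
            + w_channel * channel_relation_loss(student, teacher))

# Synthetic feature maps: the student is a noisy copy of the teacher.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 6, 6))
student = teacher + 0.1 * rng.normal(size=(4, 6, 6))

print(total_distillation_loss(student, teacher))  # positive
print(total_distillation_loss(teacher, teacher))  # identical maps give 0.0
```

In practice a student network often has fewer channels than its teacher, so an adapter layer would be needed before the feature-map term; the relation matrices, by contrast, can be compared whenever the spatial sizes (region-wise) or channel counts (channel-wise) match.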

UMLF-COVID: an unsupervised meta-learning model specifically designed to identify X-ray images of COVID-19 patients

BMC Medical Imaging ◽

10.1186/s12880-021-00704-2 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Rui Miao ◽

Xin Dong ◽

Sheng-Li Xie ◽

Yong Liang ◽

Sio-Long Lo

Keyword(s):

Recognition Task ◽

Training Model ◽

Learning Model ◽

Model Parameters ◽

X Ray ◽

Learning Framework ◽

Fast Screening ◽

Rapid Spread ◽

Meta Learning ◽

Multi Class Classification

Background: With the rapid spread of COVID-19 worldwide, quick screening for possible COVID-19 patients has become a focus of international research. Recently, many deep learning-based fast screening models that use Computed Tomography (CT) or X-ray images have been proposed for potential COVID-19 patients. However, the existing models still have two main problems. First, most existing supervised models rely on pre-trained model parameters. The pre-trained model must be built on a dataset with features similar to those in COVID-19 X-ray images, which limits the construction and use of the model. Second, the classes in X-ray datasets of COVID-19 and other pneumonia patients are usually imbalanced. In addition, the quality is difficult to distinguish, leading to non-ideal results with existing models in the multi-class COVID-19 recognition task. Moreover, no researchers have proposed a COVID-19 X-ray image learning model based on unsupervised meta-learning. Methods: This paper first constructed an unsupervised meta-learning model for fast screening of COVID-19 patients (UMLF-COVID). The model does not require a pre-trained model, which removes that limitation on model construction, and the proposed unsupervised meta-learning framework addresses the problems of sample imbalance and sample quality. Results: The UMLF-COVID model is tested on two real datasets, on each of which a three-category and a four-category model are built. The experimental results show that the accuracy of the UMLF-COVID model is 3–10% higher than that of the existing models. Conclusion: In summary, we believe that the UMLF-COVID model is a good complement to COVID-19 X-ray fast screening models.

Wearable xAI: A Knowledge-Based Federated Learning Framework

Engineering Proceedings ◽

10.3390/i3s2021dresden-10143 ◽

2021 ◽

Vol 6 (1) ◽

pp. 79

Author(s):

Sara Nasiri ◽

Iman Nasiri ◽

Kristof Van Laerhoven

Keyword(s):

Training Model ◽

Heterogeneous Data ◽

Case Based Reasoning ◽

Training Process ◽

Learning Framework ◽

Knowledge Based ◽

Private Data ◽

User Acceptability ◽

And Training ◽

Past Conditions

Federated learning is a knowledge transmission and training process that alternates between user models on edge devices and the training model on the central server. Privacy policies, privacy concerns, and heterogeneous data make this a widespread requirement in federated learning applications. In this work, we use knowledge-based methods, and in particular case-based reasoning (CBR), to develop a wearable, explainable artificial intelligence (xAI) framework. CBR is a problem-solving AI approach for knowledge representation and manipulation, which considers successful solutions of past conditions that are likely to serve as candidate solutions for a requested problem. It enables federated learning when each user owns not only his/her private data, but also uniquely designed cases. Newly generated cases can be compared to the knowledge base, and the recommendations enable the user to communicate better with the whole system. It improves users’ task performance and increases user acceptability when they need explanations to understand why and how AI algorithms arrive at these optimal solutions.

Distilling Deep Structural Facial Relationships for FAU Intensity Estimation

10.36227/techrxiv.14157080 ◽

2021 ◽

Author(s):

Yingruo Fan ◽

Jacqueline CK Lam ◽

Victor On Kwok Li

Keyword(s):

Structural Characteristics ◽

Model Performance ◽

Training Model ◽

Model Parameters ◽

Intensity Estimation ◽

Structural Relationships ◽

Proposed Model ◽

Comparable Performance ◽

Knowledge Distillation ◽

The Difference

Facial emotions are expressed through a combination of facial muscle movements, namely, the Facial Action Units (FAUs). FAU intensity estimation aims to estimate the intensity of a set of structurally dependent FAUs. Contrary to the existing works that focus on improving FAU intensity estimation, this study investigates how knowledge distillation (KD) incorporated into a training model can improve FAU intensity estimation efficiency while achieving the same level of performance. Given the intrinsic structural characteristics of FAU, it is desirable to distill deep structural relationships, namely, DSR-FAU, using heatmap regression. Our methodology is as follows: First, a feature map-level distillation loss was applied to ensure that the student network and the teacher network share similar feature distributions. Second, the region-wise and channel-wise relationship distillation loss functions were introduced to penalize the difference in structural relationships. Specifically, the region-wise relationship can be represented by the structural correlations across the facial features, whereas the channel-wise relationship is represented by the implicit FAU co-occurrence dependencies. Third, we compared the model performance of DSR-FAU with the state-of-the-art models, based on two benchmarking datasets. Our proposed model achieves comparable performance with other baseline models, though requiring a lower number of model parameters and lower computation complexities.

Nodule Detection with Convolutional Neural Network Using Apache Spark and GPU Frameworks

Applied Sciences ◽

10.3390/app11062838 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2838

Author(s):

Nikitha Johnsirani Venkatesan ◽

Dong Ryeol Shin ◽

Choon Sung Nam

Keyword(s):

Neural Network ◽

Radiation Dose ◽

Convolutional Neural Network ◽

Model Performance ◽

Performance Comparison ◽

Apache Spark ◽

Training Time ◽

Learning Framework ◽

Proposed Model

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. We can enhance the quality of the medical images by intensifying the radiation dose. High radiation dose provokes cancer, which forces experts to use limited radiation. Using abrupt radiation generates noise in CT scans. We propose an optimal Convolutional Neural Network model in which Gaussian noise is removed for better classification and increased training accuracy. Experimental demonstration on the LUNA16 dataset of size 160 GB shows that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measurement, and area under the ROC curve (AUC) of the model performance are taken as evaluation metrics. We conducted a performance comparison of our proposed model on several platforms, namely Apache Spark, GPU, and CPU, to reduce the training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.

A Reinforcement Learning Framework for Spiking Networks with Dynamic Synapses

Computational Intelligence and Neuroscience ◽

10.1155/2011/869348 ◽

2011 ◽

Vol 2011 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Karim El-Laithy ◽

Martin Bogdan

Keyword(s):

Reinforcement Learning ◽

Spike Timing ◽

Neural Representation ◽

Model Parameters ◽

Learning Framework ◽

Reference Target ◽

Wide Range ◽

Spiking Network ◽

Dynamic Synapses ◽

Exclusive Or

An integration of both the Hebbian-based and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested to learn the exclusive-OR computation on a temporally coded basis. Reward values are calculated from the distance between the output spike train of the network and a reference target spike train. Results show that the network is able to capture the required dynamics and that the proposed framework can indeed reveal an integrated version of Hebbian and RL. The proposed framework is tractable and less computationally expensive. The framework is applicable to a wide class of synaptic models and is not restricted to the neural representation used here. This generality, along with the reported results, supports adopting the introduced approach to benefit from the biologically plausible synaptic models in a wide range of intuitive signal processing.

Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽

10.1038/s41598-021-83193-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dipendra Jha ◽

Vishu Gupta ◽

Logan Ward ◽

Zijiang Yang ◽

Christopher Wolverton ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Deep Learning ◽

Deep Neural Networks ◽

Materials Science ◽

Prediction Models ◽

Model Performance ◽

Materials Informatics ◽

Learning Framework ◽

Significant Attention

The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
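A small NumPy sketch can show why skip connections help very deep fully connected stacks. It assumes that "individual residual" means each layer gets its own shortcut, which the abstract does not spell out; the layer width, depth, and weight scale are arbitrary illustrative choices, and the code is not the IRNet architecture.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def plain_layer(x, w, b):
    """Ordinary fully connected layer followed by ReLU."""
    return relu(x @ w + b)

def residual_layer(x, w, b):
    """Same layer with its own shortcut: output = x + relu(x W + b)."""
    return x + relu(x @ w + b)

def forward(x, params, layer):
    """Apply the chosen layer type once per (weight, bias) pair."""
    for w, b in params:
        x = layer(x, w, b)
    return x

rng = np.random.default_rng(0)
width, depth = 16, 40  # illustrative sizes
params = [(0.01 * rng.normal(size=(width, width)), np.zeros(width))
          for _ in range(depth)]
x = rng.normal(size=(8, width))

plain_out = forward(x, params, plain_layer)
residual_out = forward(x, params, residual_layer)

# With small weights the plain stack shrinks the signal at every layer,
# while the shortcut keeps the input on an identity path.
print("plain   :", np.abs(plain_out).mean())
print("residual:", np.abs(residual_out).mean())
```

The example only traces the forward signal. The same identity path is what lets gradients reach early layers during backpropagation, which is the vanishing gradient issue the abstract refers to.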

Enhanced Neural Network Model for Worldwide Estimation of Weighted Mean Temperature

Remote Sensing ◽

10.3390/rs13122405 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2405

Author(s):

Fengyang Long ◽

Chengfa Gao ◽

Yuxiang Yan ◽

Jinling Wang

Keyword(s):

Neural Network ◽

Real Time ◽

Model Performance ◽

Global Scale ◽

Radiosonde Data ◽

Model Parameters ◽

Measured Temperature ◽

Weighted Mean ◽

Weighted Mean Temperature ◽

Application Scope

Precise modeling of weighted mean temperature (Tm) is critical for real-time conversion from zenith wet delay (ZWD) to precipitable water vapor (PWV) in Global Navigation Satellite System (GNSS) meteorology applications. Empirical Tm models developed with neural network techniques have been shown to perform better on the global scale; they also have fewer model parameters and are thus easy to operate. This paper aims to deepen the research on Tm modeling with neural networks, expand the application scope of Tm models, and provide global users with more solutions for the real-time acquisition of Tm. An enhanced neural network Tm model (ENNTm) has been developed with globally distributed radiosonde data. Compared with other empirical models, the ENNTm has some advanced features in both model design and model performance. First, the data for modeling cover the whole troposphere rather than just the region near the Earth’s surface. Second, ensemble learning was employed to weaken the impact of sample disturbance on model performance, and elaborate data preprocessing, including up-sampling and down-sampling, was adopted to achieve better model performance on the global scale. Furthermore, the ENNTm was designed to meet the requirements of three different application conditions by providing three sets of model parameters, i.e., Tm estimation without measured meteorological elements, Tm estimation with only measured temperature, and Tm estimation with both measured temperature and water vapor pressure. Validation is carried out using globally distributed radiosonde data, and the results show that the ENNTm performs better than other competing models from different perspectives under the same application conditions. The proposed model expands the application scope of Tm estimation and provides global users with more choices in applications of real-time GNSS-PWV retrieval.
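To make the role of Tm concrete, the sketch below applies the widely used conversion PWV = Π · ZWD, where the dimensionless factor Π depends on Tm. The formula and the refractivity constants are commonly cited textbook values assumed here for illustration; they are not taken from this paper, and the function names are hypothetical.

```python
# Assumed physical constants (commonly cited values, not from the paper).
RHO_W = 1000.0     # density of liquid water, kg/m^3
R_V = 461.5        # specific gas constant of water vapor, J/(kg K)
K2_PRIME = 0.221   # refractivity constant, K/Pa (22.1 K/hPa)
K3 = 3739.0        # refractivity constant, K^2/Pa (3.739e5 K^2/hPa)

def conversion_factor(tm_kelvin):
    """Dimensionless factor Pi that maps ZWD to PWV for a given Tm."""
    return 1.0e6 / (RHO_W * R_V * (K3 / tm_kelvin + K2_PRIME))

def zwd_to_pwv(zwd, tm_kelvin):
    """Convert zenith wet delay to PWV; the result has the same unit as zwd."""
    return conversion_factor(tm_kelvin) * zwd

# Illustrative inputs: 150 mm of wet delay at three Tm values.
for tm in (260.0, 270.0, 280.0):
    print(tm, round(conversion_factor(tm), 4), round(zwd_to_pwv(150.0, tm), 2))
```

Because Π varies with Tm, an error in the estimated Tm carries over into the retrieved PWV, which is why the abstract stresses precise Tm modeling.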

Ensemble Predictions of Air Pollutants in China in 2013 for Health Effects Studies Using WRF/CMAQ Modeling System with Four Emission Inventories

10.5194/acp-2017-182 ◽

2017 ◽

Cited By ~ 1

Author(s):

Jianlin Hu ◽

Xun Li ◽

Lin Huang ◽

Qi Ying ◽

Qiang Zhang ◽

...

Keyword(s):

Air Quality ◽

Health Effects ◽

Air Pollutants ◽

Emission Inventory ◽

Model Performance ◽

Observation Data ◽

Emission Inventories ◽

Air Quality Model ◽

Quality Model ◽

Future Health

Accurate exposure estimates are required for health effects analyses of severe air pollution in China. Chemical transport models (CTMs) are widely used tools that provide detailed information on the spatial distribution, chemical composition, particle size fractions, and source origins of pollutants. The accuracy of CTM predictions in China is largely affected by the uncertainties of publicly available emission inventories. The Community Multi-scale Air Quality model (CMAQ) with meteorological inputs from the Weather Research and Forecasting model (WRF) was used in this study to simulate air quality in China in 2013. Four sets of simulations were conducted with four different anthropogenic emission inventories, including the Multi-resolution Emission Inventory for China (MEIC), the Emission Inventory for China by School of Environment at Tsinghua University (SOE), the Emissions Database for Global Atmospheric Research (EDGAR), and the Regional Emission inventory in Asia version 2 (REAS2). Model performance was evaluated against available observation data from 422 sites in 60 cities across China. Model predictions of O3 and PM2.5 with the four inventories generally meet the model performance criteria, but differences exist among the inventories for different pollutants and regions. Ensemble predictions were calculated by linearly combining the results from different inventories under the constraint that the sum of the squared errors between the ensemble results and the observations from all the cities was minimized. The ensemble annual concentrations show improved agreement with observations in most cities. The mean fractional bias (MFB) and mean fractional errors (MFE) of the ensemble predicted annual PM2.5 at the 60 cities are −0.11 and 0.24, respectively, which are better than the MFB (−0.25–−0.16) and MFE (0.26–0.31) of individual simulations. The ensemble annual 1-hour peak O3 (O3-1 h) concentrations are also improved, with mean normalized bias (MNB) of 0.03 and mean normalized errors (MNE) of 0.14, compared to MNB of 0.06–0.19 and MNE of 0.16–0.22 of the individual predictions. The ensemble predictions agree better with observations with daily, monthly, and annual averaging times in all regions of China for both PM2.5 and O3-1 h. The study demonstrates that ensemble predictions by combining predictions from individual emission inventories can improve the accuracy of predicted temporal and spatial distributions of air pollutants. This study is the first ensemble model study in China using multiple emission inventories and the results are publicly available for future health effects studies.
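The linear combination described in this abstract can be sketched as an ordinary least-squares fit of per-inventory weights. The example uses synthetic data and unconstrained weights; whether the study constrained the weights in any other way is not stated in the abstract. The MFB and MFE functions follow their standard definitions, and all names and numbers below are illustrative.

```python
import numpy as np

def ensemble_weights(predictions, observations):
    """Weights minimizing the sum of squared errors of predictions @ w.

    predictions: array of shape (n_cities, n_inventories)
    observations: array of shape (n_cities,)
    """
    weights, *_ = np.linalg.lstsq(predictions, observations, rcond=None)
    return weights

def mean_fractional_bias(model, obs):
    """MFB = mean of 2 (M - O) / (M + O)."""
    return float(np.mean(2.0 * (model - obs) / (model + obs)))

def mean_fractional_error(model, obs):
    """MFE = mean of 2 |M - O| / (M + O)."""
    return float(np.mean(2.0 * np.abs(model - obs) / (model + obs)))

# Synthetic stand-in: 60 cities, 4 inventories with different biases.
rng = np.random.default_rng(0)
obs = rng.uniform(30.0, 100.0, size=60)
scale = np.array([0.80, 0.90, 1.10, 1.25])
preds = obs[:, None] * scale[None, :] + rng.normal(0.0, 5.0, size=(60, 4))

w = ensemble_weights(preds, obs)
ensemble = preds @ w

series = [("ensemble", ensemble)]
series += [(f"inventory {k}", preds[:, k]) for k in range(4)]
for name, values in series:
    print(name,
          round(mean_fractional_bias(values, obs), 3),
          round(mean_fractional_error(values, obs), 3))
```

Because each single inventory corresponds to one particular choice of weights, the fitted ensemble can never have a larger sum of squared errors on the fitting data than any individual simulation. Performance on other metrics or on held-out data is not guaranteed by this argument.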

Development of an Inverse Plume Model for Mass Eruption Rate in Unsteady Conditions

Journal of Fluids Engineering ◽

10.1115/1.4050900 ◽

2021 ◽

Author(s):

Stephen A Solovitz

Keyword(s):

Laboratory Data ◽

Model Performance ◽

Volcanic Eruptions ◽

Reynolds Numbers ◽

Model Parameters ◽

Plume Height ◽

Eruption Rate ◽

Mass Eruption Rate ◽

Downstream Effects ◽

Order Of Magnitude

Following volcanic eruptions, forecasters need accurate estimates of mass eruption rate (MER) to appropriately predict the downstream effects. Most analyses use simple correlations or models based on large eruptions at steady conditions, even though many volcanoes feature significant unsteadiness. To address this, a superposition model is developed based on a technique used for spray injection applications, which predicts plume height as a function of the time-varying exit velocity. This model can be inverted, providing estimates of MER using field observations of a plume. The model parameters are optimized using laboratory data for plumes with physically-relevant exit profiles and Reynolds numbers, resulting in predictions that agree to within 10% of measured exit velocities. The model performance is examined using a historic eruption from Stromboli with well-documented unsteadiness, again providing MER estimates of the correct order of magnitude. This method can provide a rapid alternative for real-time forecasting of small, unsteady eruptions.