Pruning In Time (PIT): A Lightweight Network Architecture Optimizer for Temporal Convolutional Networks

Author(s):  
Matteo Risso ◽  
Alessio Burrello ◽  
Daniele Jahier Pagliari ◽  
Francesco Conti ◽  
Lorenzo Lamberti ◽  
...  
Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2852
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Jalluri Gnana SivaSai ◽  
Muhammad Fazal Ijaz ◽  
Akash Kumar Bhoi ◽  
Wonjoon Kim ◽  
...  

Deep learning models are efficient at learning the features that help in understanding complex patterns precisely. This study proposes a computerized process for classifying skin disease using deep-learning-based MobileNet V2 and Long Short-Term Memory (LSTM). The MobileNet V2 model proved to be efficient, with better accuracy, and can run on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used to assess the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by Visual Geometry Group (VGG), and a convolutional neural network architecture expanded with a few changes. The HAM10000 dataset is used, and the proposed method outperformed the other methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, which keeps the computational effort minimal. Furthermore, a mobile application is designed for instant and proper action. It helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners diagnose skin conditions efficiently and effectively, thereby reducing further complications and morbidity.


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3873 ◽  
Author(s):  
Jong Taek Lee ◽  
Eunhee Park ◽  
Tae-Du Jung

Videofluoroscopic swallowing study (VFSS) is a standard diagnostic tool for dysphagia. To detect the presence of aspiration during a swallow, a manual search is commonly used to mark the time intervals of the pharyngeal phase on the corresponding VFSS image. In this study, we present a novel approach that uses 3D convolutional networks to detect the pharyngeal phase in raw VFSS videos without manual annotations. For efficient collection of training data, we propose a cascade framework that requires neither the time intervals of the swallowing process nor the manual marking of anatomical positions for detection. For video classification, we applied the inflated 3D convolutional network (I3D), one of the state-of-the-art networks for action classification, as a baseline architecture. We also present a modified 3D convolutional network architecture derived from the baseline I3D architecture. The classification and detection performance of the two architectures was evaluated for comparison. The experimental results show that the proposed model outperformed the baseline I3D model when both models are trained starting from random weights. We conclude that the proposed method greatly reduces the examination time of the VFSS images with a low miss rate.


2017 ◽  
Author(s):  
Joe Paggi ◽  
Andrew Lamb ◽  
Kevin Tian ◽  
Irving Hsu ◽  
Pierre-Louis Cedoz ◽  
...  

Massively parallel reporter assays (MPRAs) are a method for probing the effects of short sequences on transcriptional regulation activity. In an MPRA, short sequences are extracted from suspected regulatory regions, inserted into reporter plasmids, and transfected into cell types of interest, and the transcriptional activity of each reporter is assayed. Recently, Ernst et al. presented MPRA data covering 15750 putative regulatory regions. We trained a multitask convolutional neural network on these sequence expression readouts to predict expression levels across four combinations of cell types and promoters. The model allows importance scores to be assigned to each base through in silico mutagenesis, and the resulting importance scores correlated well with regions enriched for conservation and transcription factor binding.
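
The in silico mutagenesis step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_model` is a hypothetical stand-in for the trained network (it only counts occurrences of the motif `TATA`), and taking the largest absolute change in prediction over the three alternative bases is one common scoring choice, assumed here for illustration.

```python
BASES = "ACGT"

def toy_model(seq):
    """Hypothetical stand-in for a trained predictor: counts 'TATA' motifs."""
    return float(sum(seq[i:i + 4] == "TATA" for i in range(len(seq) - 3)))

def in_silico_mutagenesis(seq, predict):
    """Score each base by the largest change in prediction when it is mutated."""
    reference = predict(seq)
    scores = []
    for i, base in enumerate(seq):
        # Substitute every alternative base at position i and re-score the sequence.
        deltas = [
            abs(predict(seq[:i] + alt + seq[i + 1:]) - reference)
            for alt in BASES
            if alt != base
        ]
        scores.append(max(deltas))
    return scores

scores = in_silico_mutagenesis("GGTATAGG", toy_model)
print(scores)
```

With this toy predictor, only the four bases inside the motif receive a non-zero score, which is the behaviour one would hope to see around a real binding site.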


2021 ◽  
Vol 14 (1) ◽  
pp. 75
Author(s):  
Stefan Reder ◽  
Jan-Peter Mund ◽  
Nicole Albert ◽  
Lilli Waßermann ◽  
Luis Miranda

The increasing number of severe storm events is threatening European forests. Besides the primary damages directly caused by storms, there are secondary damages such as bark beetle outbreaks and tertiary damages due to negative effects on the market. These subsequent damages can be minimized if a detailed overview of the affected area and the amount of damaged wood can be obtained quickly and included in the planning of clearance measures. The present work utilizes UAV-orthophotos and an adaptation of the U-Net architecture for the semantic segmentation and localization of windthrown stems. The network was pre-trained with generic datasets, randomly combining stems and background samples in a copy–paste augmentation, and afterwards trained with a specific dataset of a particular windthrow. The models pre-trained with generic datasets containing 10, 50 and 100 augmentations per annotated windthrown stem achieved F1-scores of 73.9% (S1Mod10), 74.3% (S1Mod50) and 75.6% (S1Mod100), outperforming the baseline model (F1-score 72.6%), which was not pre-trained. These results emphasize the applicability of the method to correctly identify windthrown trees and suggest the collection of training samples from other tree species and windthrow areas to improve the ability to generalize. Further enhancements of the network architecture are considered to improve the classification performance and to minimize the computational costs.


Author(s):  
Péter Kovács ◽  
Gergő Bognár ◽  
Christian Huber ◽  
Mario Huemer

In this paper, we introduce VPNet, a novel model-driven neural network architecture based on variable projection (VP). Applying VP operators to neural networks results in learnable features, interpretable parameters, and compact network structures. This paper discusses the motivation and mathematical background of VPNet and presents experiments. The VPNet approach was evaluated in the context of signal processing, where we classified a synthetic dataset and real electrocardiogram (ECG) signals. Compared to fully connected and one-dimensional convolutional networks, VPNet offers fast learning ability and good accuracy at a low computational cost of both training and inference. Based on these advantages and the promising results obtained, we anticipate a profound impact on the broader field of signal processing, in particular on classification, regression and clustering problems.


2020 ◽  
pp. 147592172091688
Author(s):  
Gao Fan ◽  
Jun Li ◽  
Hong Hao

This article proposes a novel dynamic response reconstruction approach for structural health monitoring using densely connected convolutional networks. Skip connection and dense block techniques are carefully applied in the designed network architecture, which greatly facilitates the information flow, and increases the training efficiency and accuracy of feature extraction and propagation with fewer parameters in the network. Sub-pixel shuffling and dropout techniques are used in the designed network and applied to reduce the computational demand and improve training efficiency. The network is trained in a supervised manner, where the input and output are the measurements of the available channels at response available locations and desired channels at response unavailable locations. The proposed densely connected convolutional networks automatically extract the high-level features of the input data and construct the complicated nonlinear relationship between the responses of available and desired locations. Experimental studies are conducted using the measured acceleration responses from Guangzhou New Television Tower to investigate the effects of the locations of available responses, the numbers of available and unavailable channels, and measurement noise. The results demonstrate that the proposed approach can accurately reconstruct the responses in both time and frequency domains with strong noise immunity. The reconstructed response is further used for modal identification to demonstrate the usability and accuracy of the reconstructed responses. The applicability of the proposed approach for structural health monitoring is further proved by the highly consistent modal parameters identified from the reconstructed and true responses.


2021 ◽  
Vol 3 (1) ◽  
pp. 84-94
Author(s):  
Liang Zhang ◽  
Jingqun Li ◽  
Bin Zhou ◽  
Yan Jia

Identifying fake news on media has been an important issue. This is especially true considering the wide spread of rumors on popular social networks such as Twitter. Various kinds of techniques have been proposed for automatic rumor detection. In this work, we study the application of graph neural networks for rumor classification at a lower level, instead of applying existing neural network architectures to detect rumors. The responses to true rumors and false rumors display distinct characteristics. This suggests that it is essential to capture such interactions in an effective manner for a deep learning network to achieve better rumor detection performance. To this end we present a simplified aggregation graph neural network architecture. Experiments on publicly available Twitter datasets demonstrate that the proposed network has performance on a par with or even better than that of state-of-the-art graph convolutional networks, while significantly reducing the computational complexity.


2019 ◽  
Vol 4 (2) ◽  
pp. 57-62
Author(s):  
Julisa Bana Abraham

The convolutional neural network is commonly used for classification. However, convolutional networks can also be used for semantic segmentation using the fully convolutional network approach. U-Net is one example of a fully convolutional network architecture capable of producing accurate segmentation on biomedical images. This paper proposes to use U-Net for Plasmodium segmentation on thin blood smear images. The evaluation shows that U-Net can accurately perform Plasmodium segmentation on thin blood smear images. This study also compares three loss functions, namely mean-squared error, binary cross-entropy, and Huber loss. The results show that Huber loss has the best testing metrics: 0.9297, 0.9715, 0.8957, 0.9096 for F1 score, positive predictive value (PPV), sensitivity (SE), and relative segmentation accuracy (RSA), respectively.
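
The three loss functions compared in this abstract can be written out as a minimal sketch using their standard textbook definitions. The abstract does not give the paper's exact settings, so the Huber threshold `delta=1.0` and the clipping constant `eps` below are assumptions, and the inputs are flat lists of per-pixel targets and predictions rather than real segmentation masks.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Average negative log-likelihood for targets in {0, 1}."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for residuals up to delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(y_true)

targets = [1, 0, 1, 1]
predictions = [0.9, 0.2, 0.6, 0.4]
for loss in (mean_squared_error, binary_cross_entropy, huber_loss):
    print(loss.__name__, loss(targets, predictions))
```

Because mask predictions lie in [0, 1], every residual is at most 1, so with `delta=1.0` the Huber loss stays in its quadratic regime and equals half the mean-squared error. A smaller `delta` would be needed for the linear regime to take effect.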


Proceedings ◽  
2020 ◽  
Vol 54 (1) ◽  
pp. 44
Author(s):  
José Morano ◽  
Álvaro S. Hervella ◽  
Noelia Barreira ◽  
Jorge Novo ◽  
José Rouco

The segmentation of the retinal vasculature is fundamental in the study of many diseases. However, its manual completion is problematic, which motivates the research on automatic methods. Nowadays, these methods usually employ Fully Convolutional Networks (FCNs), whose success is highly conditioned by the network architecture and the availability of large amounts of annotated data, something infrequent in medicine. In this work, we present a novel application of self-supervised multimodal pre-training to enhance retinal vasculature segmentation. The experiments with diverse FCN architectures demonstrate that, independently of the architecture, this pre-training allows one to overcome annotated data scarcity and leads to significantly better results with less training on the target task.


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1236
Author(s):  
Hanlin Chen ◽  
Xudong Zhang ◽  
Teli Ma ◽  
Haosong Yue ◽  
Xin Wang ◽  
...  

Facial landmark localization is a significant yet challenging computer vision task whose accuracy has improved remarkably thanks to the successful application of deep Convolutional Neural Networks (CNNs). However, CNNs require huge storage and computation overhead, which impedes their deployment on computationally limited platforms. To the best of our knowledge, this paper is the first to implement efficient facial landmark localization via binarized CNNs. We introduce a new network architecture for calculating the binarized models, referred to as Amplitude Convolutional Networks (ACNs), based on the proposed asynchronous back propagation algorithm. We can efficiently recover the full-precision filters using only a single factor in an end-to-end manner, and the efficiency of CNNs for facial landmark localization is further improved by the extremely compressed 1-bit ACNs. Our ACNs reduce the storage space of convolutional filters by a factor of 32 compared with the full-precision models on the LFW+Webface, CelebA, BioID and 300W datasets, while achieving performance comparable to the full-precision facial landmark localization algorithms.
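
The factor of 32 quoted in this abstract matches the ratio between a 32-bit floating-point weight and a 1-bit weight. The sketch below illustrates that arithmetic with a generic sign-plus-scale binarization. It is not the ACN algorithm or its asynchronous back propagation; using the mean absolute value as the single scaling factor is an assumption borrowed from common binarization schemes, and all function names are hypothetical.

```python
def binarize(weights):
    """Return one scale factor and a sign bit (1 for non-negative) per weight."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else 0 for w in weights]
    return scale, signs

def pack_bits(signs):
    """Pack sign bits into bytes, eight weights per byte."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, bit in enumerate(signs):
        if bit:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)

def recover(scale, packed, count):
    """Approximate the original filter as +scale or -scale per weight."""
    return [
        scale if (packed[i // 8] >> (i % 8)) & 1 else -scale
        for i in range(count)
    ]

weights = [0.5, -0.25, 0.75, -1.0, 0.1, -0.2, 0.3, -0.4] * 4  # 32 toy weights
scale, signs = binarize(weights)
packed = pack_bits(signs)

full_precision_bytes = len(weights) * 4  # 32-bit floats
ratio = full_precision_bytes / len(packed)
print(ratio)
```

The ratio here counts only the packed sign bits. The single scale factor adds a few bytes per filter, which becomes negligible as filters grow.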

