DeepFacade: A Deep Learning Approach to Facade Parsing

Parsing building facades is a key component of 3D street scene reconstruction, a long-standing goal in computer vision. In this paper, we propose a deep learning based method for segmenting a facade into semantic categories. Man-made structures often exhibit symmetry. Based on this observation, we propose a symmetric regularizer for training the neural network. The proposed method exploits both the power of deep neural networks and the structure of man-made architecture. We also propose a method to refine the segmentation results using bounding boxes generated by a Region Proposal Network. We test our method by training an FCN-8s network with the novel loss function. Experimental results show that our method significantly outperforms previous state-of-the-art methods on both the ECP dataset and the eTRIMS dataset. As far as we know, we are the first to employ an end-to-end deep convolutional neural network at full image scale for building facade parsing.
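
The abstract does not spell out the form of the symmetric regularizer, so the following is only a minimal sketch of the general idea, assuming a facade that is symmetric about its vertical center axis. The names `mirror_symmetry_penalty`, `regularized_loss`, and the parameter `weight` are hypothetical and are not taken from the paper.

```python
def mirror_symmetry_penalty(prob_map):
    """Mean squared difference between a 2D score map and its left-right mirror.

    prob_map is a list of rows, each a list of per-pixel scores for one class.
    ASSUMPTION: symmetry about the vertical center axis; the paper's actual
    regularizer may be defined differently.
    """
    total = 0.0
    count = 0
    for row in prob_map:
        for value, mirrored in zip(row, reversed(row)):
            total += (value - mirrored) ** 2
            count += 1
    return total / count if count else 0.0


def regularized_loss(data_loss, prob_map, weight=0.1):
    """Add the weighted symmetry penalty to an already computed data loss."""
    return data_loss + weight * mirror_symmetry_penalty(prob_map)


if __name__ == "__main__":
    symmetric = [[0.2, 0.8, 0.2], [1.0, 0.0, 1.0]]
    asymmetric = [[1.0, 0.0]]
    print(mirror_symmetry_penalty(symmetric))
    print(regularized_loss(0.5, asymmetric))
```

A perfectly mirrored map incurs no penalty, so the extra term only pushes the network away from predictions that break the assumed symmetry.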

Segmentation of Overlapping Cervical Cells with Mask Region Convolutional Neural Network

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/3890988 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Jiajia Chen ◽

Baocan Zhang

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

State Of The Art ◽

Cytological Analysis ◽

Segmentation Method ◽

Challenging Tasks ◽

Diagnostic Technology ◽

Cervical Cells ◽

Bounding Boxes

Segmenting cytoplasm in cytology images is one of the most challenging tasks in cervical cytological analysis due to the presence of fuzzy and highly overlapping cells. Deep learning-based diagnostic technology has proven to be effective in segmenting complex medical images. We present a two-stage framework based on Mask RCNN to automatically segment overlapping cells. In stage one, candidate cytoplasm bounding boxes are proposed. In stage two, pixel-to-pixel alignment is used to refine the boundary, and category classification is also performed. The performance of the proposed method is evaluated on publicly available datasets from ISBI 2014 and 2015. The experimental results demonstrate that our method outperforms other state-of-the-art approaches with DSC 0.92 and FPRp 0.0008 at the DSC threshold of 0.8. These results indicate that our Mask RCNN-based segmentation method could be effective in cytological analysis.
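
The DSC reported above is the Dice similarity coefficient, defined for two binary masks A and B as 2|A ∩ B| / (|A| + |B|). The sketch below computes it for masks stored as nested lists. The function name `dice_coefficient` is illustrative, and how the paper applies the 0.8 threshold is not shown here.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient between two binary masks of equal shape.

    Masks are lists of rows; any truthy entry counts as foreground.
    Two empty masks are treated as a perfect match (a convention chosen here).
    """
    intersection = 0
    pred_sum = 0
    true_sum = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p:
                pred_sum += 1
            if t:
                true_sum += 1
            if p and t:
                intersection += 1
    if pred_sum + true_sum == 0:
        return 1.0
    return 2.0 * intersection / (pred_sum + true_sum)


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    reference = [[1, 0, 0], [0, 1, 1]]
    print(dice_coefficient(predicted, reference))
```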

Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2103091118 ◽

2021 ◽

Vol 118 (43) ◽

pp. e2103091118

Author(s):

Cong Fang ◽

Hangfeng He ◽

Qi Long ◽

Weijie J. Su

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Model Minority ◽

Tight Frame ◽

Learning Models ◽

The Neural Network ◽

Long Time ◽

Topmost Layer

In this paper, we introduce the Layer-Peeled Model, a nonconvex, yet analytically tractable, optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep-learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which, in part, explains the recently discovered phenomenon of neural collapse [V. Papyan, X. Y. Han, D. L. Donoho, Proc. Natl. Acad. Sci. U.S.A. 117, 24652–24663 (2020)]. More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto-unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep-learning models on the minority classes. In addition, we use the Layer-Peeled Model to gain insights into how to mitigate Minority Collapse. Interestingly, this phenomenon is first predicted by the Layer-Peeled Model before being confirmed by our computational experiments.
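
To make the simplex equiangular tight frame concrete, the sketch below builds one in its standard form, sqrt(K/(K-1)) (I - (1/K) 1 1^T), for K classes and exposes the two properties that define it: every vector has unit norm, and every pair of distinct vectors has inner product -1/(K-1). The helper names `simplex_etf` and `dot` are illustrative and not from the paper.

```python
import math


def simplex_etf(num_classes):
    """Return K vectors in R^K forming a simplex equiangular tight frame.

    Row i is sqrt(K / (K - 1)) * (e_i - (1 / K) * ones).
    """
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    frame = simplex_etf(4)
    print(dot(frame[0], frame[0]))  # squared norm of one vector
    print(dot(frame[0], frame[1]))  # inner product of two distinct vectors
```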

Towards Robust Object detection in Floor Plan Images: A Data Augmentation Approach

10.20944/preprints202110.0089.v1 ◽

2021 ◽

Author(s):

Shashank Mishra ◽

Khurram Azeem Hashmi ◽

Alain Pagani ◽

Marcus Liwicki ◽

Didier Stricker ◽

...

Keyword(s):

Object Detection ◽

Deep Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

The Novel ◽

Floor Plan ◽

Previous State ◽

Floor Plans ◽

Public Datasets ◽

Better Than

Object detection is one of the most critical tasks in the field of computer vision. This task comprises identifying and localizing an object in the image. Architectural floor plans represent the layout of buildings and apartments. The floor plans consist of walls, windows, stairs, and other furniture objects. While recognizing floor plan objects is straightforward for humans, automatically processing floor plans and recognizing objects is a challenging problem. In this work, we investigate the performance of the recently introduced Cascade Mask R-CNN network to solve object detection in floor plan images. Furthermore, we experimentally establish that deformable convolution works better than conventional convolutions in the proposed framework. Identifying objects in floor plan images is also challenging due to the variety of floor plans and different objects. We faced a problem in training our network because of the lack of publicly available datasets. Currently available public datasets do not have enough images to train deep neural networks efficiently. To address this issue, we introduce SFPI, a novel synthetic floor plan dataset consisting of 10000 images. Our proposed method comfortably surpasses the previous state-of-the-art results on the SESYD dataset and sets impressive baseline results on the proposed SFPI dataset. The dataset can be downloaded from SFPI Dataset Link. We believe that the novel dataset will enable researchers to further advance research in this domain.

Tri-net for Semi-Supervised Deep Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/278 ◽

2018 ◽

Cited By ~ 11

Author(s):

Dong-Dong Chen ◽

Wei Wang ◽

Wei Gao ◽

Zhi-Hua Zhou

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Error Rate ◽

Deep Neural Network ◽

Deep Neural Networks ◽

State Of The Art ◽

Fine Tuning ◽

Learning Methods ◽

Model Initialization

Deep neural networks have achieved great success in various real applications, but they require a large amount of labeled data for training. In this paper, we propose tri-net, a deep neural network which is able to use massive unlabeled data to help learning with limited labeled data. We consider model initialization, diversity augmentation and pseudo-label editing simultaneously. In our work, we utilize output smearing to initialize modules, use fine-tuning on labeled data to augment diversity and eliminate unstable pseudo-labels to alleviate the influence of suspicious pseudo-labeled data. Experiments show that our method achieves the best performance in comparison with state-of-the-art semi-supervised deep learning methods. In particular, it achieves 8.30% error rate on CIFAR-10 by using only 4000 labeled examples.
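
The abstract does not state the exact rule used to pick pseudo-labels, so the sketch below borrows an assumption from classic tri-training: an unlabeled example is passed to the third module only when the other two modules agree on its label and the prediction is confident enough. The function name `pseudo_labels_for_third` and the `threshold` parameter are hypothetical.

```python
def pseudo_labels_for_third(preds_a, preds_b, confidences, threshold=0.9):
    """Select pseudo-labeled examples for a third module.

    preds_a, preds_b: predicted labels from two modules on the same unlabeled data.
    confidences: one confidence score per example (how it is computed is left open).
    Returns (index, label) pairs where both modules agree and the confidence
    reaches the threshold. ASSUMPTION: agreement-based selection as in
    tri-training, not necessarily the paper's exact rule.
    """
    selected = []
    for index, (label_a, label_b, conf) in enumerate(
        zip(preds_a, preds_b, confidences)
    ):
        if label_a == label_b and conf >= threshold:
            selected.append((index, label_a))
    return selected


if __name__ == "__main__":
    module_a = [0, 1, 2, 1]
    module_b = [0, 2, 2, 1]
    confidence = [0.95, 0.99, 0.50, 0.92]
    print(pseudo_labels_for_third(module_a, module_b, confidence))
```

Examples on which the modules disagree, or agree with low confidence, are simply left unlabeled for that round, which is one way to limit the influence of suspicious pseudo-labels.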

Augmented Reality Maintenance Assistant Using YOLOv5

Applied Sciences ◽

10.3390/app11114758 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4758

Author(s):

Ana Malta ◽

Mateus Mendes ◽

Torres Farinha

Keyword(s):

Neural Network ◽

Deep Learning ◽

Object Recognition ◽

Augmented Reality ◽

Real Time ◽

Recognition System ◽

High Accuracy ◽

Video Streams ◽

The Neural Network ◽

Deep Learning Neural Network

Maintenance professionals and other technical staff regularly need to learn to identify new parts in car engines and other equipment. The present work proposes a model of a task assistant based on a deep learning neural network. A YOLOv5 network is used for recognizing some of the constituent parts of an automobile. A dataset of car engine images was created and eight car parts were marked in the images. Then, the neural network was trained to detect each part. The results show that YOLOv5s is able to successfully detect the parts in real time video streams, with high accuracy, thus being useful as an aid to train professionals learning to deal with new equipment using augmented reality. The architecture of an object recognition system using augmented reality glasses is also designed.

Representing Deep Neural Networks Latent Space Geometries with Graphs

Algorithms ◽

10.3390/a14020039 ◽

2021 ◽

Vol 14 (2) ◽

pp. 39

Author(s):

Carlos Lassance ◽

Vincent Gripon ◽

Antonio Ortega

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Objective Function ◽

Learning Process ◽

Deep Neural Networks ◽

State Of The Art ◽

The Core ◽

Learning Tasks ◽

Latent Space

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved via enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods in solving the considered problems.
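
As a rough illustration of building a similarity graph from a batch of intermediate representations, the sketch below uses cosine similarity and keeps each example's k most similar neighbors. Both choices are assumptions made for this example; the abstract does not say which similarity measure or sparsification the authors use, and the name `latent_geometry_graph` is hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def latent_geometry_graph(batch, k=1):
    """Weighted adjacency matrix linking each representation to its k nearest peers.

    batch: list of feature vectors, one per input in the batch.
    An edge is kept if either endpoint selects the other, so the matrix is symmetric.
    """
    n = len(batch)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sims = [
            (cosine_similarity(batch[i], batch[j]), j) for j in range(n) if j != i
        ]
        sims.sort(reverse=True)
        for sim, j in sims[:k]:
            adjacency[i][j] = sim
            adjacency[j][i] = sim
    return adjacency


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    for row in latent_geometry_graph(features, k=1):
        print(row)
```

Graphs built this way at two layers, or from a teacher and a student network on the same batch, can then be compared entry by entry, which is the kind of constraint the abstract describes.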

Data-Driven Structural Health Monitoring and Damage Detection through Deep Learning: State-of-the-Art Review

Sensors ◽

10.3390/s20102778 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2778 ◽

Cited By ~ 12

Author(s):

Mohsen Azimi ◽

Armin Eslamlou ◽

Gokhan Pekcan

Keyword(s):

Deep Learning ◽

Structural Health Monitoring ◽

Health Monitoring ◽

High Speed ◽

Deep Neural Networks ◽

State Of The Art ◽

Data Driven ◽

Structural Health ◽

Promising Tool ◽

Significant Attention

Data-driven methods in structural health monitoring (SHM) are gaining popularity due to recent technological advancements in sensors, as well as high-speed internet and cloud-based computation. Since the introduction of deep learning (DL) in civil engineering, particularly in SHM, this emerging and promising tool has attracted significant attention among researchers. The main goal of this paper is to review the latest publications in SHM using emerging DL-based methods and provide readers with an overall understanding of various SHM applications. After a brief introduction, an overview of various DL methods (e.g., deep neural networks, transfer learning, etc.) is presented. The procedures and applications of vibration-based and vision-based monitoring, along with some of the recent technologies used for SHM, such as sensors and unmanned aerial vehicles (UAVs), are discussed. The review concludes with prospects and potential limitations of DL-based methods in SHM applications.

SHEDR: An End-to-End Deep Neural Event Detection and Recommendation Framework for Hyperlocal News Using Social Media

INFORMS Journal on Computing ◽

10.1287/ijoc.2021.1112 ◽

2021 ◽

Author(s):

Yuheng Hu ◽

Yili Hong

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Event Detection ◽

Large Scale ◽

Short Term Memory ◽

State Of The Art ◽

Neural Network Models ◽

Neural Event ◽

End To End

Residents often rely on newspapers and television to gather hyperlocal news for community awareness and engagement. More recently, social media have emerged as an increasingly important source of hyperlocal news. Thus far, the literature on using social media to create desirable societal benefits, such as civic awareness and engagement, is still in its infancy. One key challenge in this research stream is to distill information from noisy social media data streams for community members in a timely and accurate manner. In this work, we develop SHEDR (social media–based hyperlocal event detection and recommendation), an end-to-end neural event detection and recommendation framework with a particular use case for Twitter to facilitate residents’ information seeking of hyperlocal events. The key model innovation in SHEDR lies in the design of the hyperlocal event detector and the event recommender. First, we harness the power of two popular deep neural network models, the convolutional neural network (CNN) and long short-term memory (LSTM), in a novel joint CNN-LSTM model to characterize spatiotemporal dependencies for capturing unusualness in a region of interest, which is classified as a hyperlocal event. Next, we develop a neural pairwise ranking algorithm for recommending detected hyperlocal events to residents based on their interests. To alleviate the sparsity issue and improve personalization, our algorithm incorporates several types of contextual information covering topic, social, and geographical proximities. We perform comprehensive evaluations based on two large-scale data sets comprising geotagged tweets covering Seattle and Chicago. We demonstrate the effectiveness of our framework in comparison with several state-of-the-art approaches. We show that our hyperlocal event detection and recommendation models consistently and significantly outperform other approaches in terms of precision, recall, and F-1 scores.
Summary of Contribution: In this paper, we focus on a novel and important, yet largely underexplored, application of computing: how to improve civic engagement in local neighborhoods via local news sharing and consumption based on social media feeds. To address this question, we propose two new computational and data-driven methods: (1) a deep learning–based hyperlocal event detection algorithm that scans spatially and temporally to detect hyperlocal events from geotagged Twitter feeds; and (2) a personalized deep learning–based hyperlocal event recommender system that systematically integrates several contextual cues such as topical, geographical, and social proximity to recommend the detected hyperlocal events to potential users. We conduct a series of experiments to examine our proposed models. The outcomes demonstrate that our algorithms are significantly better than the state-of-the-art models and can provide users with more relevant information about the local neighborhoods that they live in, which in turn may boost their community engagement.
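
For readers unfamiliar with the precision, recall, and F-1 scores mentioned in the evaluation, the sketch below computes them for a set of detected events against a set of true events. The event identifiers and the function name `precision_recall_f1` are made up for illustration and do not come from the paper's data.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall, and F-1 for two collections of item identifiers.

    precision = TP / |predicted|, recall = TP / |actual|,
    F-1 = harmonic mean of precision and recall.
    Empty denominators yield 0.0 (a convention chosen for this sketch).
    """
    predicted = set(predicted)
    actual = set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    detected = ["e1", "e2", "e3", "e4"]  # hypothetical detected event ids
    ground_truth = ["e2", "e3", "e5"]    # hypothetical true event ids
    print(precision_recall_f1(detected, ground_truth))
```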

Semiotic Aggregation in Deep Learning

Entropy ◽

10.3390/e22121365 ◽

2020 ◽

Vol 22 (12) ◽

pp. 1365

Author(s):

Bogdan Muşat ◽

Răzvan Andonie

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Decision Model ◽

Deep Neural Networks ◽

Neural Model ◽

Network Layers ◽

Saliency Maps ◽

Spatial Entropy ◽

Insight Into

Convolutional neural networks utilize a hierarchy of neural network layers. The statistical aspects of information concentration in successive layers can provide insight into the feature abstraction process. We analyze the saliency maps of these layers from the perspective of semiotics, also known as the study of signs and sign-using behavior. In computational semiotics, this aggregation operation (known as superization) is accompanied by a decrease of spatial entropy: signs are aggregated into supersigns. Using spatial entropy, we compute the information content of the saliency maps and study the superization processes which take place between successive layers of the network. In our experiments, we visualize the superization process and show how the obtained knowledge can be used to explain the neural decision model. In addition, we attempt to optimize the architecture of the neural model employing a semiotic greedy technique. To the best of our knowledge, this is the first application of computational semiotics in the analysis and interpretation of deep neural networks.

An Efficient Method for Detection of DDoS Attacks on the Web Using Deep Learning Algorithms

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/271042021 ◽

2021 ◽

Vol 10 (4) ◽

pp. 2821-2829

Keyword(s):

Neural Network ◽

Deep Learning ◽

Deep Neural Network ◽

State Of The Art ◽

Ddos Attacks ◽

Problem Statement ◽

Neural Network Approach ◽

Learning Techniques ◽

Attack Data ◽

Deep Learning Neural Network

Recently, DDoS attacks have become the most significant threat in network security. Both industry and academia are currently debating how to detect and protect against DDoS attacks. Many studies have been proposed to detect these types of attacks. Deep learning techniques are the most suitable and efficient algorithms for categorizing normal and attack data. Hence, a deep neural network approach is proposed in this study to mitigate DDoS attacks effectively. We used a deep learning neural network to identify and classify traffic as benign or one of four different DDoS attacks. We concentrate on four different DDoS types: Slowloris, Slowhttptest, DDoS Hulk, and GoldenEye. The rest of the paper is organized as follows: first, we introduce the work; Section 2 reviews related work; Section 3 presents the problem statement; Section 4 describes the proposed methodology; Section 5 illustrates the results of the proposed methodology and shows how it outperforms state-of-the-art work; and finally, Section 6 concludes the paper.
